git-annex

Author	SHA1	Message	Date
Joey Hess	4b87669ae2	S3 use last Key when there is no Marker element Fix infinite loop and memory blowup when importing from an unversioned S3 bucket that is large enough to need pagination. I don't think there actually ever will be a Marker element, a delimiter is not set. Probably this code path was never tested with pagination! Also the aws library's lack of any docs made it easy to mess up. Versioned buckets seem to not have the same problem. The API docs for ListObjectVersions say that NextKeyMarker will always be provided when paginating.	2024-11-14 16:12:37 -04:00
Joey Hess	44da423e2e	S3: Send git-annex or other configured User-Agent. --user-agent is the only way to configure it currently (Needs aws-0.24.3)	2024-11-13 16:10:37 -04:00
Joey Hess	3f7953869d	fix	2024-11-13 16:02:55 -04:00
Joey Hess	a16bf4f914	S3: Support versioning=yes with a readonly bucket. Needs aws-0.24.3.	2024-11-12 14:32:23 -04:00
Joey Hess	a4e9057486	implement put data-present parameter in http servant Changed the protocol docs because servant parses "true" and "false" for booleans in query parameters, not "1" and "0". clientPut with datapresent=True is not used by git-annex, and I don't anticipate it being used in git-annex, except for testing. I've tested this by making clientPut be called with datapresent=True and git-annex copy to a remote succeeds once the object file is first manually copied to the remote. That would be a good test for the test suite, but running the http client means exposing it to at least localhost, and would fail if a real http client was already running on that port.	2024-10-29 13:32:43 -04:00
Joey Hess	9db69a4c2c	fix reversion in getting from unchunked encrypted special remotes Have to use the object file for the encrypted key, not the unencrypted key. Bug introduced in `835283b862`	2024-10-28 12:20:10 -04:00
Joey Hess	d9b4bf4224	added retrieveKeyFileInOrder and ORDERED to external special remote protocol I anticipate lots of external special remote programs will neglect implementing this. Still, it's the right thing to do to assume that some of them may write files out of order. Probably most external special remotes will not be used with a proxy. When someone is using one with a proxy, they can always get it fixed to send ORDERED.	2024-10-15 15:40:14 -04:00
Joey Hess	835283b862	stream through proxy when using fileRetriever The problem was that when the proxy requests a key be retrieved to its own temp file, fileRetriever was retriving it to the key's temp location, and then moving it at the end, which broke streaming. So, plumb through the path where the key is being retrieved to.	2024-10-15 14:29:06 -04:00
Joey Hess	75b3f0eb75	fix build with old base i386ancient has a base too old for NE.singleton	2024-09-30 11:02:08 -04:00
Joey Hess	4ca3d1d584	remove read of the heads and one tail Removed head from Utility.PartialPrelude in order to avoid the build warning with recent ghc versions as well.	2024-09-26 18:43:59 -04:00
Joey Hess	10216b44d2	use NonEmpty for dirHashes This avoids 4 uses of head.	2024-09-26 18:15:00 -04:00
Joey Hess	43f31121a5	Git: use NonEmpty in fullconfig This is a nice win. Avoids partial functions, by encoding at the type level the fact that fullconfig is never an empty list.	2024-09-26 17:54:36 -04:00
Joey Hess	c3d40b9ec3	plumb in LiveUpdate (WIP) Each command that first checks preferred content (and/or required content) and then does something that can change the sizes of repositories needs to call prepareLiveUpdate, and plumb it through the preferred content check and the location log update. So far, only Command.Drop is done. Many other commands that don't need to do this have been updated to keep working. There may be some calls to NoLiveUpdate in places where that should be done. All will need to be double checked. Not currently in a compilable state.	2024-08-23 16:35:12 -04:00
Joey Hess	349b1e443b	proxied importtree=yes remotes are untrustworthy Even without exporttree=yes.	2024-08-08 15:26:02 -04:00
Joey Hess	3ea835c7e8	proxied exporttree=yes versionedexport=yes remotes are not untrusted This removes versionedExport, which was only used by the S3 special remote. Instead, versionedexport=yes is a common way for remotes to indicate that they are versioned.	2024-08-08 15:24:19 -04:00
Joey Hess	5c36177e58	proxied exporttree=yes remotes are untrustworthy This is not perfect because it does not handle versioned special remotes, which should not be untrustworthy, but now are when proxied. The implementation turned out to be easy, because the exporttree field is a default field, so is available in RemoteConfig even for git remotes.	2024-08-08 14:43:53 -04:00
Joey Hess	bb9b02b723	remove unused imports	2024-08-06 14:49:20 -04:00
Joey Hess	a3d96474f2	rename to annexobjects location on unexport This avoids needing to re-upload the file again to get it to the annexobjects location, which git-annex sync was doing when it was preferred content. If the file is not preferred content, sync will drop it from the annexobjects location. If the file has been deleted from the tree, it will remain in the annexobjects location until an unused/dropunused pass is done.	2024-08-04 11:58:07 -04:00
Joey Hess	ee076b68f5	strong verification on retrieval from annexobjects location The file in the annexobjects location may have been renamed from a previously exported file that got deleted in a subsequent export. Or it may be renamed to annexobjects temporarily before being renamed to another name (to handle eg pairwise renames). But, an exported file is not guaranteed to contain the content of the key that the local repository last exported there. Another tree could have been exported from elsewhere in the meantime. So, files in annexobjects do not necessarily have the content of their key. And so have to be strongly verified when retrieving. The same as is done when retrieving exported files.	2024-08-04 11:24:21 -04:00
Joey Hess	069d90eab5	prevent removeKey from annexobjects=yes remote when the key is in the exported tree Removing the key from the annexobjects location when it's in the exported tree would leave it in the exported tree, and so succeeding would update the location log incorrectly. But this also can't remove it from the exported tree, because that would cause import tree to see a file got deleted. So, refuse to remove in this situation. It would be possible to remove from the annexobjects location and then fail. Then if a key somehow got stored in both the annexobjects location and the exported tree location(s), the duplicate would be resolved. Not doing this because first, I don't know how that situation could happen, and second, it seems wrong for a failed remove to have a side-effect like that.	2024-08-02 16:45:52 -04:00
Joey Hess	28b29f63dc	initial support for annexobjects=yes Works but some commands may need changes to support special remotes configured this way.	2024-08-02 14:07:45 -04:00
Joey Hess	6af44b9de6	p2phttp remotes are not readonly That prevented testremote from working when remote.name.url = http://..	2024-07-29 10:54:14 -04:00
Joey Hess	cd89f91aa5	remove uuid from annex+http urls Not needed it turns out.	2024-07-28 20:29:42 -04:00
Joey Hess	bc9cc79e85	set remote's annexUrl automatically When the remote repository's git config file has annex.url set to an annex+http url.	2024-07-28 20:13:41 -04:00
Joey Hess	bdde6d829c	fix http proxying for a local git remote with a relative path git-annex-shell expects an absolute path	2024-07-28 13:35:51 -04:00
Joey Hess	0bdeafc2c4	use annex+http for accessing proxies Doesn't work yet on the http server side, which is throwing 502 bad gateway.	2024-07-25 12:00:57 -04:00
Joey Hess	ba0ecbf47e	less indent	2024-07-25 10:12:59 -04:00
Joey Hess	b13c2407af	p2phttp drop supports checking proof timestamps At this point the p2phttp implementation is fully complete!	2024-07-25 10:11:09 -04:00
Joey Hess	515c42e1e3	testremote passes on p2phttp remote	2024-07-24 14:42:24 -04:00
Joey Hess	97836aafba	Remote.Git lockContent works with annex+http urls	2024-07-24 13:42:57 -04:00
Joey Hess	9fa9678585	Remote.Git removeKey works with annex+http urls Does not yet handle drop proof lock timestamp checking.	2024-07-24 12:33:26 -04:00
Joey Hess	cfdb80cd05	progress meter for p2phttp storeKey	2024-07-24 12:14:56 -04:00
Joey Hess	b3915b88ba	Remote.Git storeKey works with annex+http urls Does not yet update progress meter.	2024-07-24 12:05:10 -04:00
Joey Hess	5b1ac1a313	more generic clientGet	2024-07-24 11:10:19 -04:00
Joey Hess	10f2c23fd7	fix slowloris timeout in hashing resume of download of large file Hash the data that is already present in the file before connecting to the http server.	2024-07-24 11:03:59 -04:00
Joey Hess	7bd616e169	Remote.Git retrieveKeyFile works with annex+http urls This includes a bugfix to serveGet, it hung at the end.	2024-07-24 10:28:44 -04:00
Joey Hess	ad945896c9	avoid needing ifdefs when using P2P.Http.Client	2024-07-24 08:33:59 -04:00
Joey Hess	b0eed55d4f	factor out http server and client into own modules To avoid a cycle when Remote.Git uses the client.	2024-07-23 14:12:38 -04:00
Joey Hess	6bbc4565e6	started wiring p2phttp into Remote.Git but we have a cycle, ugh	2024-07-23 13:53:10 -04:00
Joey Hess	5c39652235	starting support for remote.name.annexUrl set to annex+http In this case, Remote.Git should not use that url for all access to the repository. It will only be used for annex operations, which isn't done yet.	2024-07-23 09:12:21 -04:00
Joey Hess	1243af4a18	toward SafeDropProof expiry checking Added Maybe POSIXTime to SafeDropProof, which gets set when the proof is based on a LockedCopy. If there are several LockedCopies, it uses the closest expiry time. That is not optimal, it may be that the proof expires based on one LockedCopy but another one has not expired. But that seems unlikely to really happen, and anyway the user can just re-run a drop if it fails due to expiry. Pass the SafeDropProof to removeKey, which is responsible for checking it for expiry in situations where that could be a problem. Which really only means in Remote.Git. Made Remote.Git check expiry when dropping from a local remote. Checking expiry when dropping from a P2P remote is not yet implemented. P2P.Protocol.remove has SafeDropProof plumbed through to it for that purpose. Fixing the remaining 2 build warnings should complete this work. Note that the use of a POSIXTime here means that if the clock gets set forward while git-annex is in the middle of a drop, it may say that dropping took too long. That seems ok. Less ok is that if the clock gets turned back a sufficient amount (eg 5 minutes), proof expiry won't be noticed. It might be better to use the Monotonic clock, but that doesn't advance when a laptop is suspended, and while there is the linux Boottime clock, that is not available on other systems. Perhaps a combination of POSIXTime and the Monotonic clock could detect laptop suspension and also detect clock being turned back? There is a potential future flag day where p2pDefaultLockContentRetentionDuration is not assumed, but is probed using the P2P protocol, and peers that don't support it can no longer produce a LockedCopy. Until that happens, when git-annex is communicating with older peers there is a risk of data loss when a ssh connection closes during LOCKCONTENT.	2024-07-04 12:39:06 -04:00
Joey Hess	d2b27ca136	add content retention files This allows lockContentShared to lock content for eg, 10 minutes and if the process then gets terminated before it can unlock, the content will remain locked for that amount of time. The Windows implementation is not yet tested. In P2P.Annex, a duration of 10 minutes is used. This way, when p2pstdio or remotedaemon is serving the P2P protocol, and is asked to LOCKCONTENT, and that process gets killed, the content will not be subject to deletion. This is not a perfect solution to doc/todo/P2P_locking_connection_drop_safety.mdwn yet, but it gets most of the way there, without needing any P2P protocol changes. This is only done in v10 and higher repositories (or on Windows). It might be possible to backport it to v8 or earlier, but it would complicate locking even further, and without a separate lock file, might be hard. I think that by the time this fix reaches a given user, they will probably have been running git-annex 10.x long enough that their v8 repositories will have upgraded to v10 after the 1 year wait. And it's not as if git-annex hasn't already been subject to this problem (though I have not heard of any data loss caused by it) for 6 years already, so waiting another fraction of a year on top of however long it takes this fix to reach users is unlikely to be a problem.	2024-07-03 14:58:39 -04:00
Joey Hess	8b5fc94d50	add optional object file location to storeKey This will be used by the next commit to simplify the proxy.	2024-07-01 10:42:27 -04:00
Joey Hess	f833a28844	Merge branch 'master' into proxy-specialremotes	2024-06-30 11:16:20 -04:00
Joey Hess	3d646703ee	list proxied remotes and cluster gateways in git-annex info Wanted to also list a cluster's nodes when showing info for the cluster, but that's hard because it needs getting the name of the proxying remote, which is some prefix of the cluster's name, but if the names contain dashes there's no good way to know which prefix it is.	2024-06-30 11:14:13 -04:00
Joey Hess	158d7bc933	fix handling of ERROR in response to REMOVE This allows an error message from a proxied special remote to be displayed to the client. In the case where removal from several nodes of a cluster fails, there can be several errors. What to do? I decided to only show the first error to the user. Probably in this case the user is not in a position to do anything about an error message, so best keep it simple. If the problem with the first node is fixed, they'll see the error from the next node.	2024-06-28 14:10:25 -04:00
Joey Hess	a6ea057f6b	fix handling of ERROR in response to CHECKPRESENT That error is now rethrown on the client, so it will be displayed. For example: $ git-annex fsck x --fast --from AMS-dir fsck x (special remote reports: directory /home/joey/tmp/bench2/dir is not accessible) failed No protocol version check is needed. Because in order to talk to a proxied special remote, the client has to be running the upcoming git-annex release. Which has this fix in it.	2024-06-28 13:46:27 -04:00
Joey Hess	c3a785204e	support a P2PConnection that uses TMVars rather than Handles This will allow having an internal thread speaking P2P protocol, which will be needed to support proxying to external special remotes. No serialization is done on the internal P2P protocol of course. When a ByteString is being exchanged, it may or may not be exactly the length indicated by DATA. While that has to be carefully managed for the serialized P2P protocol, here it would require buffering the whole lazy bytestring in memory to check its length when sending, so it's better to do length checks on the receiving side.	2024-06-28 11:22:29 -04:00
Joey Hess	20ef1262df	give proxied cluster nodes a higher cost than the cluster gateway This makes eg git-annex get default to using the cluster rather than an arbitrary node, which is better UI. The actual cost of accessing a proxied node vs using the cluster is basically the same. But using the cluster allows smarter load-balancing to be done on the cluster.	2024-06-27 15:21:03 -04:00
Joey Hess	0ef4183b00	Merge branch 'master' into proxy	2024-06-27 12:41:57 -04:00

1 2 3 4 5 ...

1633 commits