And follow-on changes.
Note that relatedTemplate was changed to operate on a RawFilePath, and
so when it counts the length, it is now the number of bytes, not the
number of code points. This will just make it truncate to a shorter
string in some cases; the truncation is still unicode-aware.
When not building with the OsPath flag, toOsPath . fromRawFilePath and
fromRawFilePath . fromOsPath do extra conversions back and forth between
String and ByteString. That overhead could be avoided, but that's the
non-optimised build mode, so I didn't bother.
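For comparison, when the flag is enabled, the conversions can avoid
String entirely. A minimal sketch of the idea (POSIX-only, and not the
exact Utility.OsPath code): RawFilePath is a ByteString and OsString
wraps a ShortByteString, so converting is just a buffer copy, with no
decoding:

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Short as SB
    import System.OsString.Internal.Types (OsString(..), PosixString(..))

    type RawFilePath = B.ByteString

    -- copy the bytes into a ShortByteString; no encoding involved
    toOsPath :: RawFilePath -> OsString
    toOsPath = OsString . PosixString . SB.toShort

    -- and copy them back out again
    fromOsPath :: OsString -> RawFilePath
    fromOsPath (OsString (PosixString s)) = SB.fromShort s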
Sponsored-by: unqueued
By using System.Directory.OsPath, which takes and returns OsString,
which is a ShortByteString. So, things like dirContents currently have the
overhead of copying that to a ByteString, but that should be less than
the overhead of using Strings, which often were in turn converted to
RawFilePaths.
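To illustrate the copy being talked about, a simplified sketch
(POSIX-only; the real dirContents returns full paths and has a
different type):

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Short as SB
    import System.Directory.OsPath (listDirectory)
    import System.OsPath (OsPath)
    import System.OsString.Internal.Types (OsString(..), PosixString(..))

    -- the fromShort is the ByteString copy mentioned above
    dirContents :: OsPath -> IO [B.ByteString]
    dirContents dir = map unwrap <$> listDirectory dir
      where
        unwrap (OsString (PosixString s)) = SB.fromShort s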
Added Utility.OsString and the OsString build flag. That flag is turned
on in the stack.yaml, and will be turned on automatically by cabal when
built with new enough libraries. The stack.yaml change is a bit ugly,
and that could be reverted for now if it causes any problems.
Note that Utility.OsString.toOsString on windows avoids only an encoding
check that is documented as being unlikely to fail. I don't
think it can fail in git-annex; if it could, git-annex didn't contain
such an encoding check before, so at worst that should be a wash.
Previously, when the git config could not be read from an ssh remote,
it would try to git fetch from it to determine if the remote was
otherwise accessible. That was unnecessary work, since exit status 255
indicates a connection problem.
As well as avoiding the extra work of the fetch, this also improves
things when an ssh remote cannot be connected to due to a problem with
the git-annex ssh control socket. In that situation, ssh will also exit 255.
Before, the git fetch was tried in that situation, and would succeed, since
it does not use the git-annex ssh control socket. git-annex would conclude
that git-annex-shell was not installed on the remote, which could be wrong.
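The check itself is simple; a minimal sketch (the helper name is made
up, but ssh's manpage documents that it exits 255 on error, and
otherwise exits with the remote command's exit status):

    import System.Exit (ExitCode(..))

    -- exit 255 means ssh itself failed (eg, could not connect),
    -- so probing further with git fetch is pointless
    sshConnectionError :: ExitCode -> Bool
    sshConnectionError (ExitFailure 255) = True
    sshConnectionError _ = False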
I suppose it also used to be possible for the user to need to enter an
ssh password on each connection to the remote. If they entered the wrong
password for the git-annex-shell call, but then the right password for
the git fetch, it would also incorrectly set annex-ignore, and that
situation is also now fixed.
* Removed the i386ancient standalone tarball build for linux, which
was increasingly unable to support new git-annex features.
* Removed support for building with ghc older than 9.0.2,
and with older versions of haskell libraries than are in current Debian
stable.
* stack.yaml: Update to lts-23.2.
Note that i386ancient was targeting linux 2.6.32, which has been EOL for
over 9 years now. Any old system still using such a kernel is certainly highly
insecure. And I suspect i386ancient had its own insecurities due to haskell
libraries and C libraries not having been updated.
Added config `url.<base>.annexInsteadOf` corresponding to git's
`url.<base>.pushInsteadOf`, to configure the urls to use for accessing the
git-annex repositories on a server, without needing to configure
remote.<name>.annexUrl in each repository.
While one use case for this would be rewriting urls to use annex+http,
I decided not to add any kind of special case for that. So while
git-annex p2phttp, when serving multiple repositories, needs an url
of eg "annex+http://example.com/git-annex/" for each of them, rewriting an
url like "https://example.com/git/foo/bar" with this config set to
"https://example.com/git/" will result in eg
"annex+http://example.com/git-annex/foo/bar", which p2phttp does not
support.
That seems better dealt with in either git-annex p2phttp or a http
middleware, rather than complicating the config with a special case for
annex+http.
Anyway, there are other use cases for this that don't involve annex+http.
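For example (hostnames and paths here are illustrative), following the
same semantics as pushInsteadOf:

    [url "ssh://annex.example.com/git/"]
        annexInsteadOf = https://example.com/git/

With that config, a remote whose url is "https://example.com/git/foo"
would be accessed by git-annex at "ssh://annex.example.com/git/foo",
while git itself keeps using the https url.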
Logically, this should make it need a lot less memory when files have
been changed many times. In my tests, it didn't seem to change memory
use at all. I'm unsure why, since it is working. It's possible the
Response is not getting garbage collected due to pinning. But as far as
I can see, all
parts of it that are retained get copied in a way that won't keep the
whole thing pinned in memory.
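For reference, the usual way to break ByteString pinning is to copy any
retained slice, so the large parent buffer can be collected. A generic
sketch, not the actual code:

    import qualified Data.ByteString as B

    -- a slice shares its parent's pinned buffer, keeping all of it
    -- alive; B.copy moves the slice into a fresh, smaller buffer
    retainSlice :: B.ByteString -> B.ByteString
    retainSlice = B.copy . B.take 16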
Fix infinite loop and memory blowup when importing from an unversioned S3
bucket that is large enough to need pagination.
I don't think there will actually ever be a Marker element, since a
delimiter is not set.
Probably this code path was never tested with pagination! Also the aws
library's lack of any docs made it easy to mess up.
Versioned buckets seem to not have the same problem. The API docs for
ListObjectVersions say that NextKeyMarker will always be provided when
paginating.
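A hedged sketch of the resulting loop; listPage and objectKey stand in
for the aws library's GetBucket call and object type, which have
different names and types:

    import Data.Text (Text)

    data Object = Object { objectKey :: Text }

    -- one ListObjects request; returns the page and whether it was
    -- truncated (hypothetical stand-in)
    listPage :: Maybe Text -> IO ([Object], Bool)
    listPage = undefined

    listAll :: Maybe Text -> IO [Object]
    listAll marker = do
        (objs, truncated) <- listPage marker
        if truncated && not (null objs)
            -- with no delimiter set there will be no NextMarker, and
            -- re-using the request's marker would loop forever, so
            -- continue from the key of the last object listed
            then (objs ++) <$> listAll (Just (objectKey (last objs)))
            else return objs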
Changed the protocol docs because servant parses "true" and "false" for
booleans in query parameters, not "1" and "0".
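For example, a minimal sketch of such a query parameter (the route here
is illustrative, not the actual p2phttp API definition):

    {-# LANGUAGE DataKinds, TypeOperators #-}
    import Servant

    -- servant's FromHttpApiData instance for Bool accepts
    -- "?datapresent=true", and rejects "?datapresent=1"
    type PutAPI = "put" :> QueryParam "datapresent" Bool
                        :> Post '[JSON] NoContent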
clientPut with datapresent=True is not used by git-annex, and I don't
anticipate it being used in git-annex, except for testing.
I've tested this by making clientPut be called with datapresent=True;
git-annex copy to a remote then succeeds once the object file has first
been manually copied to the remote. That would be a good test for the
test suite, but running the http server means exposing it to at least
localhost, and it would fail if a real http server was already running
on that port.
I anticipate lots of external special remote programs will neglect
implementing this. Still, it's the right thing to do to assume that some
of them may write files out of order. Probably most external special
remotes will not be used with a proxy. When someone is using one with a
proxy, they can always get it fixed to send ORDERED.
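Based on how the protocol's EXTENSIONS exchange works, the negotiation
would look something like this (the exact extension lists shown are
only illustrative):

    git-annex -> remote:  EXTENSIONS INFO ASYNC
    remote -> git-annex:  EXTENSIONS ORDERED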
The problem was that when the proxy requests a key be retrieved to its
own temp file, fileRetriever was retrieving it to the key's temp
location, and then moving it at the end, which broke streaming.
So, plumb through the path where the key is being retrieved to.
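A sketch of the shape of the change; the real types in git-annex are
different, this only shows the idea:

    data Key -- abstract here

    -- before, the retriever chose the key's temp file and the result
    -- was moved into place afterwards, which broke streaming; now the
    -- destination path is plumbed through to it
    type FileRetriever = FilePath -> Key -> IO ()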
Each command that first checks preferred content (and/or required
content) and then does something that can change the sizes of
repositories needs to call prepareLiveUpdate, and plumb it through the
preferred content check and the location log update.
So far, only Command.Drop is done. Many other commands that don't need
to do this have been updated to keep working.
There may be some calls to NoLiveUpdate in places where that should be
done. All will need to be double-checked.
Not currently in a compilable state.
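A rough sketch of the call pattern described above; everything except
prepareLiveUpdate is a hypothetical stand-in, since the real signatures
are still in flux:

    import Control.Monad (when)

    data Key
    data LiveUpdate

    prepareLiveUpdate :: Key -> IO LiveUpdate
    prepareLiveUpdate = undefined

    checkPreferredContent :: LiveUpdate -> Key -> IO Bool
    checkPreferredContent = undefined

    removeContent :: Key -> IO ()
    removeContent = undefined

    updateLocationLog :: LiveUpdate -> Key -> IO ()
    updateLocationLog = undefined

    -- the live update is plumbed through both the preferred content
    -- check and the location log update
    dropKey :: Key -> IO ()
    dropKey key = do
        lu <- prepareLiveUpdate key
        ok <- checkPreferredContent lu key
        when ok $ do
            removeContent key
            updateLocationLog lu key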
This removes versionedExport, which was only used by the S3 special
remote. Instead, versionedexport=yes is a common way for remotes to
indicate that they are versioned.
This is not perfect because it does not handle versioned special
remotes, which should not be untrustworthy, but now are when proxied.
The implementation turned out to be easy, because the exporttree field
is a default field, so is available in RemoteConfig even for git
remotes.
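A simplified sketch of the check; git-annex's real RemoteConfig is not
a plain Map, but since exporttree is a default field, the idea is the
same even for git remotes:

    import qualified Data.Map as M

    type RemoteConfig = M.Map String String -- simplified stand-in

    -- a proxied remote with exporttree=yes has to be treated as
    -- untrustworthy, since exported content can be overwritten
    untrustworthyWhenProxied :: RemoteConfig -> Bool
    untrustworthyWhenProxied c = M.lookup "exporttree" c == Just "yes"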
This avoids needing to re-upload the file again to get it to the
annexobjects location, which git-annex sync was doing when it was
preferred content.
If the file is not preferred content, sync will drop it from the
annexobjects location.
If the file has been deleted from the tree, it will remain in the
annexobjects location until an unused/dropunused pass is done.
The file in the annexobjects location may have been renamed from a
previously exported file that got deleted in a subsequent export.
Or it may be renamed to annexobjects temporarily before being renamed to
another name (to handle eg pairwise renames).
But, an exported file is not guaranteed to contain the content of the
key that the local repository last exported there. Another tree could
have been exported from elsewhere in the meantime.
So, files in annexobjects do not necessarily have the content of their
key, and so have to be strongly verified when retrieving, the same as
is done when retrieving exported files.
Removing the key from the annexobjects location when it's also in the
exported tree would leave it in the exported tree, and so a successful
removal would update the location log incorrectly. But removal also
can't remove it from the exported tree, because that would cause import
tree to see a file got deleted. So, refuse to remove in this situation.
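In rough pseudologic (the helper names are hypothetical):

    data Key

    keyInExportedTree :: Key -> IO Bool
    keyInExportedTree = undefined -- hypothetical helper

    removeFromAnnexObjects :: Key -> IO Bool
    removeFromAnnexObjects = undefined -- hypothetical helper

    -- refuse when the key is also used by the exported tree, since a
    -- successful removal would update the location log incorrectly
    -- while content remained in the export
    removeKey :: Key -> IO Bool
    removeKey key = do
        inexport <- keyInExportedTree key
        if inexport
            then return False
            else removeFromAnnexObjects key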
It would be possible to remove from the annexobjects location and then
fail. Then if a key somehow got stored in both the annexobjects location
and the exported tree location(s), the duplicate would be resolved. Not
doing this because first, I don't know how that situation could happen,
and second, it seems wrong for a failed remove to have a side-effect
like that.