git-annex

Author	SHA1	Message	Date
Joey Hess	f1c85ac11b	fix breakage in wormhole's sendFile Commit `ff0927bde9` broke this, it made it try to read all of the input before looking for the code. But, wormhole keeps running until it sends the file, so that caused a deadlock. Oops. Sponsored-by: Luke Shumaker on Patreon	2022-09-26 15:26:29 -04:00
Joey Hess	17129fed66	fix wormhole --appid option position p2p: Pass wormhole the --appid option before the receive/send command, as it does not accept that option after the command I'm left wondering, did I get this wrong from the beginning, or did wormhole change its option parser? I'm reminded of the change in 0.8.2 where it silently changed what FD the pairing code was output to. But, looking at the wormhole source, it was at least putting --appid before send in its test suite from the introduction of the option. So I think probably this has always been broken. On 2021-12-31 the --appid option was enabled, and it took until now for someone to try git-annex p2p --pair and notice that flag day broke it. Sponsored-by: Svenne Krap on Patreon	2022-09-26 15:11:38 -04:00
Joey Hess	1fe9cf7043	deal with ignoreinode config setting Improve handling of directory special remotes with importtree=yes whose ignoreinode setting has been changed. (By either enableremote or by upgrading to commit 3e2f1f73cbc5fc10475745b3c3133267bd1850a7.) When getting a file from such a remote, accept the content that would have been accepted with the previous ignoreinode setting. After a change to ignoreinode, importing a tree from the remote will re-import and generate new content identifiers using the new config. So when ignoreinode has changed to no, the inodes will be learned, and after that point, a change in an inode will be detected as a change. Before re-importing, a change in an inode will be ignored, as it was before the ignoreinode change. This seems acceptable, because the user can re-import immediately if they urgently need to add inodes. And if not, they'll do it sometime, presumably, and the change will take effect then. Sponsored-by: Erik Bjäreholt on Patreon	2022-09-16 14:11:25 -04:00
Joey Hess	8a4cfd4f2d	use getSymbolicLinkStatus not getFileStatus to avoid crash on broken symlink Fix crash importing from a directory special remote that contains a broken symlink. The crash was in listImportableContentsM but some other places in Remote.Directory also seemed like they could have the same problem. Also audited for other places that have such a problem. Not all calls to getFileStatus are bad, in some cases it's better to crash on something unexpected. For example, `git-annex import path` when the path is a broken symlink should crash, the same as when it does not exist. Many of the getFileStatus calls are like that, particularly when they involve .git/annex/objects which should never have a broken symlink in it. Fixed a few other possible cases of the problem. Sponsored-by: Lawrence Brogan on Patreon	2022-09-05 13:46:32 -04:00
Joey Hess	840bd50390	make it easier to use curl for unusual url schemes Use curl when annex.security.allowed-url-schemes includes an url scheme not supported by git-annex internally, as long as annex.security.allowed-ip-addresses is configured to allow using curl. Sponsored-by: Luke Shumaker on Patreon	2022-08-15 12:22:13 -04:00
Joey Hess	23c6e350cb	improve createDirectoryUnder to allow alternate top directories This should not change the behavior of it, unless there are multiple top directories, and then it should behave the same as if there was a single top directory that was actually above the directory to be created. Sponsored-by: Dartmouth College's Datalad project	2022-08-12 12:52:37 -04:00
Joey Hess	5bc70e2da5	When bup split fails, display its stderr It seems worth noting here that I emailed bup's author about bup split being noisy on stderr even with -q in approximately 2011. That never got fixed. Its current repo on github only accepts pull requests, not bug reports. Needing to add such complexity to deal with such a longstanding unfixed issue is not fun. Sponsored-by: Kevin Mueller on Patreon	2022-08-05 13:57:20 -04:00
Joey Hess	cd9fd6e28c	fix case of Win32	2022-08-04 12:17:27 -04:00
Joey Hess	472f5c142b	Use createFile_NoRetry from win32 2.13.3.1 Sponsored-by: Tobias Ammann on Patreon	2022-08-02 10:45:39 -04:00
Joey Hess	ef8e481ebd	clarify comment and remove broken link There are archives of MC Knowledgebase, which google will find, I don't want to try to keep a link to an archive working since MS is no longer providing it.	2022-08-01 13:53:36 -04:00
Joey Hess	d905232842	use ResourcePool for hash-object handles Avoid starting an unnecessary number of git hash-object processes when concurrency is enabled. Sponsored-by: Dartmouth College's DANDI project	2022-07-25 17:32:39 -04:00
Joey Hess	2d65c4ff1d	avoid unix-compat's rename On Windows, that does not support long paths https://github.com/jacobstanley/unix-compat/issues/56 Instead, use System.Directory.renamePath, which does support long paths. Sponsored-by: Dartmouth College's Datalad project	2022-07-12 14:55:02 -04:00
Joey Hess	02ef3d6a64	fix build with assistant disabled and webapp enabled The webapp modules cannot build with the assistant disabled, so make the webapp be under the assistant build flag. Sponsored-by: Jarkko Kniivilä on Patreon	2022-06-29 14:19:18 -04:00
Joey Hess	d137bd4c29	fix typo Recent commit got amended accidentally to include this typo. Argh. It was fine, then I tweaked the commit message and accidentally staged this breakage.	2022-06-23 13:54:34 -04:00
Joey Hess	9562da790f	add rent optimisation inhibitor As outlined in my blog post, I object to Microsoft training their Copilot model on my code, and believe it likely violates my copyright. https://joeyh.name/blog/entry/a_bitter_pill_for_Microsoft_Copilot/ While I never push git-annex to Github, other users choose to. And since Microsoft is now selling access to Copilot to anyone, this situation is escalating.	2022-06-23 13:13:54 -04:00
Joey Hess	aad362e1c4	remove vendored library no longer used http-manager-restricted is used for a while, but I forgot to delete this file when making that change.	2022-06-23 12:35:26 -04:00
Joey Hess	debcf86029	use RawFilePath version of rename Some small wins, almost certainly swamped by the system calls, but still worthwhile progress on the RawFilePath conversion. Sponsored-by: Erik Bjäreholt on Patreon	2022-06-22 16:47:34 -04:00
Joey Hess	ebb76f0486	avoid setEnv while testing gpg setEnv is not thread safe and could cause a getEnv by another thread to segfault, or perhaps other bad behavior. Sponsored-by: Dartmouth College's Datalad project	2022-05-18 16:05:11 -04:00
Joey Hess	8675b2b075	rename memoryUnits It's not just used for memory sizes.	2022-05-05 15:35:11 -04:00
Joey Hess	d1cce869ed	implement dataUnits finally Added support for "megabit" and related bandwidth units in annex.stalldetection and everywhere else that git-annex parses data units. Note that the short form is "Mbit" not "Mb" because that differs from "MB" only in case, and git-annex parses units case-insensitively. It would be horrible if two different versions of git-annex parsed the same value differently, so I don't think "Mb" can be supported. See comment for bonus sad story from my childhood. Sponsored-by: Nicholas Golder-Manning	2022-05-05 15:25:11 -04:00
Joey Hess	642703c7e4	avoid using removePathForcibly everywhere, it is unsafe If the temp directory can somehow contain a hard link, it changes the mode, which affects all other hard linked files. So, it's too unsafe to use everywhere in git-annex, since hard links are possible in multiple ways and it would be very hard to prove that every place that uses a temp directory cannot possibly put a hard link in it. Added a call to removeDirectoryForCleanup to test_crypto, which will fix the problem that commit `17b20a2450` was intending to fix, with a much smaller hammer. Sponsored-by: Dartmouth College's Datalad project	2022-05-02 14:06:20 -04:00
Joey Hess	17b20a2450	Fix test failure on NFS when cleaning up gpg temp directory Using removePathForcibly avoids concurrent removal problems. The i386ancient build still uses an old version of ghc and directory that do not include removePathForcibly though. Sponsored-by: Dartmouth College's Datalad project	2022-04-19 13:33:33 -04:00
Joey Hess	150d73c268	fix quickcheck test on windows prop_relPathDirToFileAbs_basics (TestableFilePath ":/") failed on windows. The colon was filtered out after trying to make the path relative, which only removed leading path separators. So, ":/" changed to "/" which is not relative. Filtering out the colon beforehand avoids this problem. Sponsored-by: Luke Shumaker on Patreon	2022-03-22 13:53:55 -04:00
Joey Hess	982eb7ed0d	remove vendored http-client-restricted Removed vendored copy of http-client-restricted, and removed the HttpClientRestricted build flag that avoided that dependency. http-client-restricted is in Debian stable, and the i386ancient build also uses it, so I think this vendored copy is no longer needed. Sponsored-by: Noam Kremen on Patreon	2022-03-22 11:50:06 -04:00
Joey Hess	5b6518d4a3	fix build warning with old aeson	2022-03-07 13:29:24 -04:00
Joey Hess	cbd138e042	factor out Utility.Aeson.textKey	2022-03-02 18:24:06 -04:00
sternenseemann	ca596e7c54	allow building with aeson >= 2.0 In aeson 2.0, Text has been replaced by the Key type and HashMap by the KeyMap interface. Accommodating this required adding some CPP in order to still be able to compile with aeson < 2.0. The required changes were: * Prevent Key from being re-exported by Utilities.Aeson, as it clashes with git-annex's own Key type. * Fix up conversion from String/Text to Key (or Text in aeson 1.) in a couple of places Import Data.Aeson.KeyMap instead of Data.HashMap.Strict, as they are mostly API-compatible. insertWith needs to be replaced by unionWith, however, as KeyMap lacks the former function.	2022-03-02 18:01:41 -04:00
Joey Hess	952664641a	turn of PackageImports in cabal file This makes it easier to build eg benchmarks of individual modules. May be that most of these PackageImports are not really necessary, dunno.	2022-02-25 13:16:36 -04:00
Joey Hess	61b48b69ba	fix build on windows	2021-12-09 13:39:16 -04:00
Joey Hess	bba74a2b84	improve comments	2021-12-08 18:59:22 -04:00
Joey Hess	ef3ab0769e	close pid lock only once no threads use it This fixes a FD leak when annex.pidlock is set and -J is used. Also, it fixes bugs where the pid lock file got deleted because one thread was done with it, while another thread was still holding it open. The LockPool now has two distinct types of resources, one is per-LockHandle and is used for file Handles, which get closed when the associated LockHandle is closed. The other one is per lock file, and gets closed when no more LockHandles use that lock file, including other shared locks of the same file. That latter kind is used for the pid lock file, so it's opened by the first thread to use a lock, and closed when the last thread closes a lock. In practice, this means that eg git-annex get of several files opens and closes the pidlock file a few times per file. While with -J5 it will open the pidlock file, process a number of files, until all the threads happen to finish together, at which point the pidlock file gets closed, and then that repeats. So in either case, another process still gets a chance to take the pidlock. registerPostRelease has a rather intricate dance, there are fine-grained STM locks, a STM lock of the pidfile itself, and the actual pidlock file on disk that are all resolved in stages by it. Sponsored-by: Dartmouth College's Datalad project	2021-12-06 15:01:39 -04:00
Joey Hess	774c7dab2f	Merge branch 'master' into pidlockfinegrained	2021-12-06 13:00:40 -04:00
Joey Hess	ae4c56b28a	Revert "fix too early close of shared lock file" This reverts commit `66b2536ea0`. I misunderstood commit `ac56a5c2a0` and caused a FD leak when pid locking is not used. A LockHandle contains an action that will close the underlying lock file, and that action is run when it is closed. In the case of a shared lock, the lock file is opened once for each LockHandle, and only the one for the LockHandle that is being closed will be closed.	2021-12-06 12:51:28 -04:00
Joey Hess	e464ffd641	update comment to current status	2021-12-03 18:41:51 -04:00
Joey Hess	e5ca67ea1c	fine-grained locking when annex.pidlock is enabled This locking has been missing from the beginning of annex.pidlock. It used to be possible, when two threads are doing conflicting things, for both to run at the same time despite using locking. Seems likely that nothing actually had a problem, but it was possible, and this eliminates that possible source of failure. Sponsored-by: Dartmouth College's Datalad project	2021-12-03 17:20:21 -04:00
Joey Hess	6988c2e740	fix build on windows broken by `ed0afbc36b` Sponsored-by: Dartmouth College's Datalad project	2021-12-03 14:08:12 -04:00
Joey Hess	ed0afbc36b	avoid concurrent threads trying to take pid lock at same time Seems there are several races that happen when 2 threads run PidLock.tryLock at the same time. One involves checkSaneLock of the side lock file, which may be deleted by another process that is dropping the lock, causing checkSaneLock to fail. And even with the deletion disabled, it can still fail, probably due to linkToLock failing when a second thread overwrites the lock file. The same can happen when 2 processes do, but then one process just fails to take the lock, which is fine. But with 2 threads, some actions were failing even though the process as a whole had the pid lock held. Utility.LockPool.PidLock already maintains a STM lock, and since it uses LockShared, 2 threads can hold the pidlock at the same time, and when the first thread drops the lock, it will remain held by the second thread, and so the pid lock file should not get deleted until the last thread to hold it drops the lock. Which is the right behavior, and why a LockShared STM lock is used in the first place. The problem is that each time it takes the STM lock, it then also calls PidLock.tryLock. So that was getting called repeatedly and concurrently. Fixed by noticing when the shared lock is already held, and stop calling PidLock.tryLock again, just use the pid lock that already exists then. Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed to take the lock, which was entirely wrong. It should only drop the side lock. Sponsored-by: Dartmouth College's Datalad project	2021-12-01 17:14:39 -04:00
Joey Hess	66b2536ea0	fix too early close of shared lock file This fixes a reversion introduced in commit `ac56a5c2a0`. I didn't notice there that it was handling the case of a shared lock file that was still open elsewhere by not running the close action. This was especially deadly when annex.pidlock is set, as it caused early deletion of the pid lock file. Sponsored-by: Dartmouth College's Datalad project	2021-12-01 17:06:28 -04:00
Joey Hess	a6699be79d	catch error statting pid lock file if it somehow does not exist It ought to exist, since linkToLock has just created it. However, Lustre seems to have a rather probabilistic view of the contents of a directory, so catching the error if it somehow does not exist and running the same code path that would be run if linkToLock failed might avoid this fun Lustre failure. Sponsored-by: Dartmouth College's Datalad project	2021-11-29 14:53:07 -04:00
Joey Hess	f3326b8b5a	git-lfs gitlab interoperability fix git-lfs: Fix interoperability with gitlab's implementation of the git-lfs protocol, which requests Content-Encoding chunked. Sponsored-by: Dartmouth College's Datalad project	2021-11-10 13:51:11 -04:00
Joey Hess	8034f2e9bb	factor out IncrementalHasher from IncrementalVerifier	2021-11-09 12:33:22 -04:00
Joey Hess	b2c48fb86b	Fix using lookupkey inside a subdirectory Caused by dirContains ".." "foo" being incorrectly False. Also added a test of dirContains, which includes all the previous bug fixes I could find and some obvious cases. Reversion in version 8.20211011 Sponsored-by: Brett Eisenberg on Patreon	2021-10-26 15:00:45 -04:00
Joey Hess	2b6e287013	remove obsolete libgcc1 last seen in debian oldstable	2021-10-21 03:02:16 -04:00
Joey Hess	d3bea30a6b	fix build failure on windows Utility.QuickCheck also has an instance Arbitrary FileID. It seems that this problem used to be ignored by ghc but now it notices it.	2021-10-20 15:12:12 -04:00
Joey Hess	887edeb1ad	avoid warning when built with unix-compat 0.5.3 It re-exports modificationTimeHiRes, and provides a windows version. Might be worth using that windows version eventually, but I have not tested it.	2021-10-18 16:25:28 -04:00
Joey Hess	1c11dd4793	avoid cursor jitter when updating progress display When the progress display gets longer, and then shorter again, it causes the cursor to jitter back and forth. Somehow I never noticed this until this morning, but then it became intolerable to watch. To fix it, pad the progress display to the maximum length it's occupied. Sponsored-by: Svenne Krap on Patreon	2021-10-07 11:16:41 -04:00
Joey Hess	b2efbd1cd3	fix bug in dirContains dirContains "." ".." was incorrectly true because normalize ("." </> "..") = ".." Sponsored-by: Jochen Bartl on Patreon	2021-10-01 13:53:21 -04:00
Joey Hess	e8959617b6	fix bug in dirContains dirContains ".." "../.." was incorrectly True. This does not seem to be an exploitable security hole, at least as dirContains is used in git-annex. Sponsored-by: Jochen Bartl on Patreon	2021-10-01 13:15:52 -04:00
Joey Hess	6de57642f4	note about coreutils 9.0 supporting CoW by default	2021-09-30 14:12:58 -04:00
Joey Hess	b9aa2ce8d1	resume properly when copying a file to/from a local git remote is interrupted (take 2) This method avoids breaking test_readonly. Just check if the dest file exists, and avoid CoW probing when it does, so when CoW probing fails, it can resume where the previous non-CoW copy left off. If CoW has been probed already to work, delete the dest file since a CoW copy will presumably work. It seems like it would be almost as good to just skip CoW copying in this case too, but consider that the dest file might have started to be copied from some other remote, not using CoW, but CoW has been probed to work to copy from the current place. Sponsored-by: Dartmouth College's Datalad project	2021-09-27 16:03:01 -04:00
Joey Hess	7ccf642863	revert change that broke test_readonly commit `63d508e885` broke test_readonly. When a local git remote is readonly, tryCopyCoW run to copy a file from it failed at withOtherTmp. Sponsored-by: Dartmouth College's Datalad project	2021-09-27 16:02:41 -04:00
Joey Hess	64cac1a721	avoid potentially very long bwlimit delay at start I first saw this getting with -J2 over ssh, but later saw it also without the -J2. It was resuming, and the calculated unboundDelay was many minutes. The first update of the meter jumped to some large value, because of the resuming, and so it thought the BW was super fast. Avoid by waiting until the second meter update. Might be a good idea to also guard for the delay being many seconds and avoid waiting. But how many? If BW is legitimately super fast, and a remote happens to read more than a 32kb or so chunk at a time, it could in theory download megabytes or gigabytes of data before the first meter update. It would actually be appropriate then to delay for a long time, if the desired BW was low. Could make up some numbers that are sane now, but tech may improve. (BTW, pleased to see bwlimit does work with -J. I had worried that it might not, if the meter update happened in a different thread than the downloading, but it's done in the same thread.) Sponsored-by: Brett Eisenberg on Patreon	2021-09-22 19:23:30 -04:00
Joey Hess	e8496d62e4	improved bwrate limiting implementation New method is much better. Avoids unrestrained transfer at the beginning (except for the first block). Keeps right at or a few kb/s below the configured limit, with very little variation in the actual reported bandwidth. Removed the /s part of the config as it's not needed. Ready to merge. Sponsored-by: Luke Shumaker on Patreon	2021-09-22 15:27:16 -04:00
Joey Hess	05a097cde8	Merge branch 'master' into bwlimit	2021-09-22 10:48:27 -04:00
Joey Hess	63d508e885	resume properly when copying a file to/from a local git remote is interrupted Probably this fixes a reversion, but I don't know what version broke it. This does use withOtherTmp for a temp file that could be quite large. Though albeit a reflink copy that will not actually take up any space as long as the file it was copied from still exists. So if the copy cow succeeds but git-annex is interrupted just before that temp file gets renamed into the usual .git/annex/tmp/ location, there is a risk that the other temp directory ends up cluttered with a larger temp file than later. It will eventually be cleaned up, and the chances of this being a problem are small, so this seems like an acceptable thing to do. Sponsored-by: Shae Erisson on Patreon	2021-09-21 17:43:35 -04:00
Joey Hess	18e00500ce	bwlimit Added annex.bwlimit and remote.name.annex-bwlimit config that works for git remotes and many but not all special remotes. This nearly works, at least for a git remote on the same disk. With it set to 100kb/1s, the meter displays an actual bandwidth of 128 kb/s, with occasional spikes to 160 kb/s. So it needs to delay just a bit longer... I'm unsure why. However, at the beginning a lot of data flows before it determines the right bandwidth limit. A granularity of less than 1s would probably improve that. And, I don't know yet if it makes sense to have it be 100kb/1s rather than 100kb/s. Is there a situation where the user would want a larger granularity? Does granularity need to be configurable at all? I only used that format for the config really in order to reuse an existing parser. This can't be supported for external special remotes, or for ones that themselves shell out to an external command. (Well, it could, but it would involve pausing and resuming the child process tree, which seems very hard to implement and very strange besides.) There could also be some built-in special remotes that it still doesn't work for, due to them not having a progress meter whose display blocks the bandwidth using thread. But I don't think there are actually any that run a separate thread for downloads than the thread that displays the progress meter. Sponsored-by: Graham Spencer on Patreon	2021-09-21 16:58:10 -04:00
Joey Hess	9595a247ae	fix test suite failure on windows This was maybe a real bug too, although I don't know what circumstances it would be a problem. See comment for analysis of this windows drive letter wackiness issue. Sponsored-by: Brock Spratlen on Patreon	2021-09-01 11:32:25 -04:00
Joey Hess	e853ef3095	decorate openTempFile errors with the template name This is to track down what file in .git/annex/ is being written to via a temp file when the repository is read-only. Sponsored-by: Dartmouth College's Datalad project	2021-08-30 13:05:02 -04:00
Joey Hess	e17342b2a0	Run cp -a with --no-preserve=xattr, to avoid problems with copied xattrs Including them breaking permissions setting on some NFS servers. Sponsored-by: Dartmouth College's Datalad project	2021-08-27 13:09:34 -04:00
Joey Hess	d154e7022e	incremental verification for web special remote Except when configuration makes curl be used. It did not seem worth trying to tail the file when curl is downloading. But when an interrupted download is resumed, it does not read the whole existing file to hash it. Same reason discussed in commit 7eb3742e4b76d1d7a487c2c53bf25cda4ee5df43; that could take a long time with no progress being displayed. And also there's an open http request, which needs to be consumed; taking a long time to hash the file might cause it to time out. Also in passing implemented it for git and external special remotes when downloading from the web. Several others like S3 are within striking distance now as well. Sponsored-by: Dartmouth College's DANDI project	2021-08-18 15:02:22 -04:00
Joey Hess	88b63a43fa	distinguish between incremental verification failing and not being done Sponsored-by: Dartmouth College's DANDI project	2021-08-18 14:38:02 -04:00
Joey Hess	449851225a	refactor IncrementalVerifier moved to Utility.Hash, which will let Utility.Url use it later. It's perhaps not really specific to hashing, but making a separate module just for the data type seemed unnecessary. Sponsored-by: Dartmouth College's DANDI project	2021-08-18 13:19:02 -04:00
Joey Hess	57b5ec79e7	remove comment This comment used to be in Crypto, where it made sense, but it does not really make any sense in Utility.Hash	2021-08-18 13:02:02 -04:00
Joey Hess	fa62c98910	simplify and speed up Utility.FileSystemEncoding This eliminates the distinction between decodeBS and decodeBS', encodeBS and encodeBS', etc. The old implementation truncated at NUL, and the primed versions had to do extra work to avoid that problem. The new implementation does not truncate at NUL, and is also a lot faster. (Benchmarked at 2x faster for decodeBS and 3x for encodeBS; more for the primed versions.) Note that filepath-bytestring 1.4.2.1.8 contains the same optimisation, and upgrading to it will speed up to/fromRawFilePath. AFAIK, nothing relied on the old behavior of truncating at NUL. Some code used the faster versions in places where I was sure there would not be a NUL. So this change is unlikely to break anything. Also, moved s2w8 and w82s out of the module, as they do not involve filesystem encoding really. Sponsored-by: Shae Erisson on Patreon	2021-08-11 12:13:31 -04:00
Joey Hess	a38b724bfa	remove unused function	2021-08-10 20:04:17 -04:00
Joey Hess	86bd9ac186	fix missing new lines in processTranscript	2021-08-02 13:42:27 -04:00
Joey Hess	66089e97de	Fix a rounding bug in display of data sizes Eg, showImprecise 1 1.99 returned "1.1" rather than "2". The 9 rounded upward to 10, and that was wrongly used as the decimal, rather than carrying the 1. Sponsored-by: Jack Hill on Patreon	2021-07-30 09:56:04 -04:00
Joey Hess	b9db859221	addurl: Avoid crashing when used on beegfs. Sponsored-by: Dartmouth College's DANDI project	2021-07-05 13:02:40 -04:00
Joey Hess	9905ec19a7	add pointer to annex.security.allowed-url-schemes Sponsored-by: Kevin Mueller on Patreon	2021-07-02 10:53:45 -04:00
Joey Hess	7b6deb1109	display scanning message whenever reconcileStaged has enough files to chew on Clear visible progress bar first. Removed showSideActionAfter because it can't be used in reconcileStaged (import loop). Instead, it counts the number of files it processes and displays it after it's seen a sufficient number to know it's taking a while. Sponsored-by: Dartmouth College's Datalad project	2021-06-08 12:48:30 -04:00
Joey Hess	0434674c85	avoid displaying the scanning annexed files message when repo is not large Avoids users thinking this scan is a big deal, when it's not in the majority of repos. showSideActionAfter has some ugly caveats, since it has to display in the background of another action. I could not see a better way to do it and it works fine in this particular case. It also doesn't really belong in Annex.Concurrent, but cannot go in Messages due to an import loop. Sponsored-by: Dartmouth College's Datalad project	2021-06-04 13:16:48 -04:00
Joey Hess	4de3351c5c	set cwd rather than changing current process directory This is not actually used in git-annex though.	2021-05-12 17:44:22 -04:00
Joey Hess	947d2a10bc	assistant: Fix a crash on startup by avoiding using forkProcess ghc 8.8.4 seems to have changed something that broke code that has been successfully using forkProcess since 2012. Likely a change to GC internals. Since forkProcess has never had clear documentation about how to use it safely, avoid using it at all. Instead, when git-annex needs to daemonize itself, re-run the git-annex command, in a new process group and session. This commit was sponsored by Luke Shumaker on Patreon.	2021-05-12 15:08:03 -04:00
Joey Hess	4bf7940d6b	fileRef: make paths relative and simplified Fix behavior of several commands, including reinject, addurl, and rmurl when given an absolute path to an unlocked file, or a relative path that leaves and re-enters the repository. To avoid slowing down all the cases where the paths are already ok with an unnecessary call to getCurrentDirectory, put in an optimisation in relPathCwdToFile. That will probably also speed up other parts of git-annex by some small amount, but I have not benchmarked. Note that I did not convert branchFileRef, because it seems likely that it will be used with a file that is not provided by the user, so is already in a sane format. This is certainly true for the way git-annex uses it, though maybe arguable to the extent Git.Ref is a reusable library.	2021-05-07 13:25:59 -04:00
Joey Hess	73f330a62e	document an important property of relPathCwdToFile	2021-05-07 12:57:54 -04:00
Joey Hess	6136006106	semigroup and monoid instances for DebugSelector mempty is NoDebugSelector, so it does not default to matching everything, or nothing, in a chain like foo <> mempty	2021-04-06 15:12:35 -04:00
Joey Hess	aaba83795b	switch from hslogger to purpose-built Utility.Debug This uses a DebugSelector, rather than debug levels, which will allow for a later option like --debug-from=Process to only see debugging about running processes. The module name that contains the thing being debugged is used as the DebugSelector (in most cases; does not need to be a hard and fast rule). Debug calls were changed to add that. hslogger did not display that first parameter to debugM, but the DebugSelector does get displayed. Also fastDebug will allow doing debugging in places that are used in tight loops, with the DebugSelector coming from the Annex Reader essentially for free. Not done yet.	2021-04-05 13:40:31 -04:00
Joey Hess	537f9d9a11	Improved display of errors when accessing a git http remote fails. New error message: Remote foo not usable by git-annex; setting annex-ignore http://localhost/foo/config download failed: Configuration of annex.security.allowed-ip-addresses does not allow accessing address ::1 If git config parse fails, or the git config file is not available at the url, a better error message for that is also shown. This commit was sponsored by Mark Reidenbach on Patreon.	2021-03-24 14:19:32 -04:00
Joey Hess	62e152f210	incremental checksum on download from ssh or p2p Checksum as content is received from a remote git-annex repository, rather than doing it in a second pass. Not tested at all yet, but I imagine it will work! Not implemented for any special remotes, and also not implemented for copies from local remotes. It may be that, for local remotes, it will suffice to use rsync, rely on its checksumming, and simply return Verified. (It would still make a checksumming pass when cp is used for COW, I guess.)	2021-02-09 17:03:27 -04:00
Joey Hess	ed684f651e	add incremental hashing interface to Backend As yet unused. Backend.External could perhaps implement it too, although that would involve sending chunks of data to it via a pipe or something, so likely to be slow.	2021-02-09 15:00:51 -04:00
Joey Hess	97129388d5	support fuzzy matching of addon commands Note this does find things in PATH that are not executable. Like searchPath use, the executable bit is not checked. Thing is, there does not seem to be a binding for access(), which would be the right way to check that the right execute bit is set. Anyway, if it's in PATH and it's a file, it's probably fine to treat it as something that was intended to be executable. This commit was sponsored by Brock Spratlen on Patreon.	2021-02-02 19:37:09 -04:00
Joey Hess	1b63132ca3	add searchPathContents And rename related functions for consistency.	2021-02-02 19:06:15 -04:00
James Cook	6013abe87a	fix build on openbsd	2021-02-01 11:53:31 -04:00
Joey Hess	c35fa6975b	fix handling of implicit and before parens Fix an oddity in matching options and preferred content expressions such as "foo (bar or baz)", which was incorrectly handled as if it were "(foo or bar) and baz" rather than the intended "foo and (bar or baz)" Seemed like a change to consume should be able to handle this case better, but I was having trouble writing it that way, so instead added a separate pass that inserts the implicit ands explicitly. Also added several test cases to make sure versions with and without explicit ands generate the same.	2021-01-28 13:51:07 -04:00
Joey Hess	9b2084f29a	fix problem on windows with newly rewritten prop_relPathDirToFileAbs_basics Seems that dropDrive on windows only drops eg c:/ but not a leading / while on linux, it does drop a leading / (which is what it considers to be equivilant to a drive letter. I had been relying on it to drop both. So need to drop leading directory separators. Also, if the quickcheck generated input is eg "c:c:c:c:foo", dropDrive will only drop the first one, leaving a path that's still not relative. So instead of using dropDrive, just remove the colons from the path.	2021-01-22 14:30:48 -04:00
Joey Hess	ba109ce7df	comment typo	2021-01-21 14:13:55 -04:00
Joey Hess	73df633a62	omit inode from ContentIdentifier for directory special remote Directory special remotes with importtree=yes now avoid unnecessary overhead when inodes of files have changed, as happens whenever a FAT filesystem gets remounted. A few unusual edge cases of modifications won't be detected and imported. I think they're unusual enough not to be a concern. It would be possible to add a config setting that controls whether to compare inodes too, but does not seem worth bothering the user about currently. I chose to continue to use the InodeCache serialization, just with the inode zeroed. This way, if I later change my mind or make it configurable, can parse it back to an InodeCache and operate on it. The overhead of storing a 0 in the content identifier log seems worth it. There is a one-time cost to this change; all directory special remotes with importtree=yes will re-hash all files once, and will update the content identifier logs with zeroed inodes. This commit was sponsored by Brett Eisenberg on Patreon.	2021-01-19 13:15:07 -04:00
Joey Hess	7eb54bad12	fix prop_relPathDirToFileAbs_basics fail on windows It was just slapping on a path separator to the front of the path to make it absolute, but on windows, a path like "//foo/bar" actually has a network "drive" of "//foo" and so that broke the test case. Since "a:foo" is a somehow relative path on windows (who knows how), drop any drive from the input. But dropDrive also drops any leading path separator, making the input path relative. So now it should be safe to slap on a leading path separator.	2021-01-18 13:26:10 -04:00
Joey Hess	fb921cd0b0	fix build warning	2021-01-13 14:48:41 -04:00
Joey Hess	5e39b7eb8d	Windows: Work around win32 length limits when dealing with lock files	2021-01-13 14:38:35 -04:00
Joey Hess	bb4dc3a399	export TestableFilePath constructor Useful for eg, replicating failures in ghci. No need for this to be a smart constructor, as long as it's used with valid filepaths, it's ok and if not the test breaks.	2021-01-13 13:23:35 -04:00
Joey Hess	99ba471209	rewrite prop_relPathDirToFileAbs_basics This was not a good test, it broke the requirement that relPathDirToFileAbs take absolute paths. And it failed when the two input paths were eg, the same but differently normalized. Replaced with some tests of the real basics of that function.	2021-01-13 13:23:26 -04:00
Joey Hess	6b13574827	Windows: include= and exclude= containing '/' will also match filenames that are written using '\' And vice-versa, but it's better to use '/' for portability. Notably, standardPreferredContent contains "archive/*" and that might not match if the filename ends up coming in with the slashes the other way around.	2020-12-15 12:39:34 -04:00
Joey Hess	94b323a8e8	use TotalSize more extensively	2020-12-11 12:10:43 -04:00
Joey Hess	447d798987	export encode_c'	2020-12-09 15:28:45 -04:00
Joey Hess	19777d1c6f	minor improvements Adding new instance for Integer, and some parsers for more parameters. The conversion of readish to readMaybe is done because a serialized exit code cannot contain additional text after the number.	2020-12-09 15:28:11 -04:00
Joey Hess	41f2c308ff	stall detection is working New config annex.stalldetection, remote.name.annex-stalldetection, which can be used to deal with remotes that stall during transfers, or are sometimes too slow to want to use. This commit was sponsored by Luke Shumaker on Patreon.	2020-12-08 15:22:18 -04:00
Joey Hess	794fc72afb	avoid parseDuration succeeding on empty string	2020-12-08 12:51:56 -04:00
Joey Hess	72e5764a87	move TransferrerPool from assistant This old code will now be useful for git-annex beyond the assistant. git-annex won't use the CheckTransferrer part, and won't run transferkeys as a batch process, and will want withTransferrer to not shut down transferkeys processes. Still, the rest of this is a good fit for what I need now. Also removed some dead code, and simplified a little bit. This commit was sponsored by Mark Reidenbach on Patreon.	2020-12-07 12:50:48 -04:00
Joey Hess	e5b170aa1c	switch back to POSIXTime turned out not to need Read MeterState	2020-12-04 13:54:33 -04:00
Joey Hess	5a41e46bd4	start on serializing Messages Json objects not yet handled, and some other special cases, but this is the bulk of the messages. For progress meters, POSIXTime does not have a Read instance (or a suitable Show instance), so had to switch to using a Double for progress meters. This commit was sponsored by Ethan Aubin on Patreon.	2020-12-03 13:03:03 -04:00
Joey Hess	92136284b1	avoid hGetMetered 0 closing the handle This is an edge case, which happened to be triggered by the P2P protocol seeing DATA 0. When reading 0 bytes, getting an empty string does not mean the handle has reached EOF. I verified there was in fact a bug, where get of an empty file followed by another file would get the empty file and then fail with "handle is closed". This fixes it. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-12-01 15:39:22 -04:00
Joey Hess	a3b714ddd9	finish fixing removeLink on windows `9cb250f7be` got the ones in RawFilePath, but there were others that used the one from unix-compat, which fails at runtime on windows. To avoid this, import System.PosixCompat.Files hiding removeLink This commit was sponsored by Ethan Aubin.	2020-11-24 13:20:44 -04:00
Joey Hess	dce0781391	squash remaining build warnings on windows	2020-11-24 12:35:09 -04:00
Joey Hess	804808d569	squash build warnings on windows	2020-11-23 14:00:17 -04:00
Joey Hess	b13c44cccc	convert processTranscript to use hGetLineUntilExitOrEOF It does use it on both stdout and stderr. It seems unlikely the problem could really affect stdout, but the unix implementation of it combines both into a single handle in any case.	2020-11-19 16:36:37 -04:00
Joey Hess	ff0927bde9	converted reads from stderr to use hGetLineUntilExitOrEOF These are all unlikely to suffer from the inherited stderr fd problem, but who knows, it could happen.	2020-11-19 16:21:17 -04:00
Joey Hess	728afbc4b1	preserve other headers when adding resume header It had lost the hAcceptEncoding header that is set as part of the overriding of http-client's default decompression of compressed files. Seems likely that would have caused resuming of compressed files to fail in some cases. This commit was sponsored by Brett Eisenberg on Patreon.	2020-11-19 14:45:22 -04:00
Joey Hess	b90b9b936d	don't rely on exception for http 416 Fix a bug that could make resuming a download from the web fail when the entire content of the file is actually already present locally. What a mess that Request can throw exceptions or not, depending on how it's configured. Makes it very hard if you need to handle some specific http status codes in a function like this! Implementing everything two ways did not seem appealing, if possible at all, so I decided to override the Request if it did come configured to throw exception on non-2xx http status. Other exceptions, like from http-client-restricted, or due to a redirect to a non-http url, still get thrown. This commit was sponsored by Luke Shumaker on Patreon.	2020-11-19 14:44:42 -04:00
Joey Hess	4b739fc460	Fix build on Windows Thanks to bug reporter for the patch.	2020-11-19 12:33:00 -04:00
Joey Hess	9cb250f7be	fix removeLink on windows This removeLink was introduced in commit `e505c03bcc`, which replaced code that used removeFile on Windows. So, I know git-annex did not used to do anything other than removeFile on Windows. If there were symlinks it wanted to remove, this would not work on windows, but of course it does not use symlinks on windows.	2020-11-19 12:20:18 -04:00
Joey Hess	682829c200	avoid throwing exception when the handle is closed The handle could get closed eg, by cleanupProcess being called, which forces the process to exit and closes all its handles. At this point, the test case in https://git-annex.branchable.com/bugs/Buggy_external_special_remote_stalls_after_7245a9e/ is fixed.	2020-11-18 15:10:35 -04:00
Joey Hess	b021e2322f	avoid crash on EOF at end	2020-11-18 15:03:30 -04:00
Joey Hess	e6d741af79	finish conversion to hGetLineUntilExitOrEOF started in `aafae46bcb`	2020-11-18 14:54:02 -04:00
Joey Hess	b483be8548	newline mode (mis)handling for windows Unfortunately, there is no hGetNewLineMode. This seems like an oversight that should be fixed in ghc, but for now, I paper over it with a windows hack.	2020-11-18 14:48:50 -04:00
Joey Hess	787b39c7c1	working hGetLineUntilExitOrEOF The problem with the old version seemed to be that hWaitForInput blocks rather than timing out when being run concurrently with hGetLine on the same handle. This passes the bench test, and also works when run concurrently on different handles.	2020-11-18 14:21:47 -04:00
Joey Hess	9af0000e0f	bench test for hGetLineUntilExitOrEOF This seems to show that hWaitForInput does not seem to behave as documented. It does not time out, so blocks forever in this situation. This is with a 0 timeout and with larger timeouts. Unsure why, it looked like it should work.	2020-11-18 12:23:15 -04:00
Joey Hess	aafae46bcb	WIP for https://git-annex.branchable.com/bugs/Buggy_external_special_remote_stalls_after_7245a9e/	2020-11-17 17:31:08 -04:00
Joey Hess	bce8865824	fix build on windows	2020-11-17 11:58:45 -04:00
Joey Hess	49a14b16a1	fix build on windows	2020-11-16 09:31:45 -04:00
Joey Hess	ed7afabdb1	fix build on windows	2020-11-13 13:34:28 -04:00
Joey Hess	f240f0196c	fix build on windows	2020-11-12 11:39:29 -04:00
Joey Hess	6911d27d42	fix build on windows	2020-11-11 08:51:42 -04:00
Joey Hess	4c1eb28c40	fix build on windows	2020-11-10 11:21:03 -04:00
Joey Hess	885974be99	add newtypes for QuickCheck to avoid LANG=C issues All properties changed to use them, except for prop_encode_c_decode_c_roundtrip, which already filtered to ascii for other reasons. A few modules had to be split out, because Setup does not build-depend on QuickCheck.	2020-11-09 20:21:18 -04:00
Joey Hess	dd52d8ebdc	update after RawFilePath transition	2020-11-09 12:12:25 -04:00
Joey Hess	907a0bcad6	avoid providing filename with NUL to quickcheck properties instance Arbitrary [Char] allows that, and it's not a legal part of a filename so can break processing them. Noticed when prop_view_roundtrips failed. The instance Arbitrary AssociatedFile avoids this problem. This commit was sponsored by Mark Reidenbach on Patreon.	2020-11-06 15:15:33 -04:00
Joey Hess	1db49497e0	finished this stage of the RawFilePath conversion This commit was sponsored by Denis Dzyubenko on Patreon.	2020-11-06 14:10:58 -04:00
Joey Hess	2c8cf06e75	more RawFilePath conversion Converted file mode setting to it, and follow-on changes. Compiles up through 369/646. This commit was sponsored by Ethan Aubin.	2020-11-05 18:45:37 -04:00
Joey Hess	9b0dde834e	convert getFileSize to RawFilePath Lots of nice wins from this in avoiding unnecessary work, and I think nothing got slower. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-11-05 11:32:57 -04:00
Joey Hess	5a1e73617d	finished this stage of the RawFilePath conversion Finally compiles again, and test suite passes. This commit was sponsored by Brock Spratlen on Patreon.	2020-11-04 14:20:37 -04:00
Joey Hess	eb42cd4d46	more RawFilePath conversion 535/645 This commit was sponsored by Brett Eisenberg on Patreon.	2020-11-03 10:11:04 -04:00
Joey Hess	b724236b35	remove unused imports	2020-11-02 15:36:11 -04:00
Joey Hess	f45ad178cb	more RawFilePath conversion At 318/645 after 4k lines of changes This commit was sponsored by Jake Vosloo on Patreon.	2020-10-29 12:03:50 -04:00
Joey Hess	e505c03bcc	more RawFilePath conversion nukeFile replaced with removeWhenExistsWith removeLink, which allows using RawFilePath. Utility.Directory cannot use RawFilePath since setup does not depend on posix. This commit was sponsored by Graham Spencer on Patreon.	2020-10-29 10:50:29 -04:00
Joey Hess	8d66f7ba0f	more RawFilePath conversion Added a RawFilePath createDirectory and kept making stuff build. Up to 296/645 This commit was sponsored by Mark Reidenbach on Patreon.	2020-10-28 17:25:59 -04:00
Joey Hess	6c29817748	RawFilePath version of getCurrentDirectory This commit was sponsored by Jochen Bartl on Patreon	2020-10-28 16:03:45 -04:00
Joey Hess	08cbaee1f8	more RawFilePath conversion Most of Git/ builds now. Notable win is toTopFilePath no longer double converts This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-10-28 15:55:30 -04:00
Joey Hess	d6e94a6b2e	got configure working after Utility.Path ByteString conversion Had to split out some modules because getWorkingDirectory needs unix, which is not a build-dep of configure. This commit was sponsored by Brock Spratlen on Patreon.	2020-10-28 15:01:19 -04:00
Joey Hess	e219aadbab	convert to RawByteString This will break a lot of stuff that uses it, but once fixed should lead to better performance. Mostly mechanical. Changes of note: * upFrom now uses isPathSeparator, which is better on Windows where there is not just one * splitShortExtensions used to take the length of a string, which would count wide unicode characters as a single character. Changing to B.length changes that. Note that, git-annex's annexMaxExtensionLength already changed to the length in bytes before this change. This function is only used in generating views, and the small behavior change should not be a problem. * relHome still uses FilePath because it didn't seem worth changing(?) This commit was sponsored by Jack Hill on Patreon.	2020-10-28 14:32:45 -04:00
Joey Hess	9a5cd96f0d	Fix a memory leak introduced in the last release The problem was this line: cleanup = and <$> sequence (map snd v) That caused all of v to be held onto until the end, when the cleanup action was run. I could not seem to find a bang pattern that avoided the leak, so I resorted to a IORef, rather clunky, but not a performance problem because it will only be written once per git ls-files, so typically just 1 time. This commit was sponsored by Mark Reidenbach on Patreon.	2020-10-13 16:31:01 -04:00
Joey Hess	d54dd0ef9c	Fix build on Windows with network-3 inet_addr was removed, but all this needs is localhost, so hardcoding it should work fine. It may be that this windows ifdef is no longer needed. It was added in 2013 with a note that getAddrInfo didn't work on windows, but it seems likely such a problem would have been fixed since.	2020-10-08 10:50:39 -04:00
Joey Hess	fd81dd912b	change from deprecated and removed aNY_PORT to defaultPort (Both are just 0 internally.)	2020-10-08 10:35:16 -04:00
Joey Hess	30e3a2e4c4	remove unused define	2020-10-08 10:31:03 -04:00
Joey Hess	4c32499e82	Parse youtube-dl progress output Which lets progress be displayed when doing concurrent downloads. Among other things, like --json-progress etc. The youtube-dl output is no longer displayed, except for any errors. This commit was sponsored by Denis Dzyubenko on Patreon.	2020-09-29 17:53:48 -04:00
Joey Hess	15c1ee16d9	import --no-content: Check annex.largefiles Import small files into git, the same as is done when importing with content. Which means, for small files, --no-content does download them. If the largefiles expression needs the file content available (due to mimetype or mimeencoding being used), the import will fail. This commit was sponsored by Jake Vosloo on Patreon.	2020-09-28 13:28:57 -04:00
Joey Hess	f624876dc2	remove zombie process in file seeking This was the last one marked as a zombie. There might be others I don't know about, but except for in the hypothetical case of a thread dying due to an async exception before it can wait on a process it started, I don't know of any. It would probably be safe to remove the reapZombies now, but let's wait and do that in its own commit in case it turns out to cause problems. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-09-25 11:38:42 -04:00
Joey Hess	5117ae8aec	fix build warning	2020-09-25 11:07:41 -04:00
Joey Hess	c1b4d76e6b	make MatchFiles introspectable matchNeedsFileContent is not used yet, but shows how to add information about terminals. That one would be needed for https://git-annex.branchable.com/todo/sync_fast_import/ Note the tricky bit in Annex.FileMatcher.call where it folds over the included matcher to propagate the information. This commit was sponsored by Svenne Krap on Patreon.	2020-09-24 14:01:53 -04:00
Joey Hess	68f9766544	Improve --debug output to show pid of processes that are started and stopped getPid returns Nothing if the process has already been stopped, and in that case, the pid will not be displayed. I think that would only happen if waitForProcess or similar gets called more than once on the same process handle though. getPid on unix has an overhead of only a MVar read. On Windows it needs to make a syscall, so will be probably more expensive. While the added expense happens even when debug logging is disabled, it should be small enough compared with the overhead of starting a process that it's not a problem. (It does occur to me that a debugM that took an IO String could only run it when debugging is really enabled, which would improve performance. It does not seem possible to use the current hslogger interface to do that though; it does not expose the information that would be needed.)	2020-09-24 12:39:57 -04:00

1685 commits