git-annex

Author	SHA1	Message	Date
Joey Hess	f85ca7dc80	fix all remaining -Wincomplete-uni-patterns warnings A couple of these were probably actual bugs in edge cases. Most of the changes I'm fine with. The fact that aeson's object returns sometihng that we know will be an Object, but the type checker does not know is kind of annoying.	2020-04-15 13:55:08 -04:00
Joey Hess	686791c4ed	more RawFilePath Remove dup definitions and just use the RawFilePath one. </> etc are enough faster that it's probably faster than building a String directly, although I have not benchmarked.	2019-12-18 17:10:28 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agoncy of making it build), but still very unmergable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certianly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	258a7c5cd1	add Key to all ActionItem constructors	2019-06-06 12:53:24 -04:00
Joey Hess	16a2bed710	avoid build warning on Windows about unused import	2019-05-23 12:15:33 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	9127fe4821	add DebugLocks build flag Using the method described in https://www.fpcomplete.com/blog/2018/05/pinpointing-deadlocks-in-haskell but my own code to implement it, and with callstacks added. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-11-19 15:02:43 -04:00
Joey Hess	2e9f128dea	moved module and relicensed	2018-10-29 23:13:36 -04:00
Joey Hess	10d3b7fc62	Fix reversion introduced in 6.20171214 that caused concurrent transfers to incorrectly fail with "transfer already in progress". Avoid creating transfer info file before transfer lock is created and locked. The wrong order for one thing caused transfer info to be overwritten when a transfer was already in progress. But worse, it caused checkTransfer to see the transfer info, and so lock the transfer lock in order to verify the transfer was not in progress. Which in a concurrent situation, prevented the transferrer from locking the transfer lock, so it failed with "transfer already in progress". Note that the transferinfo command does not lock the transfer lock before creating the transfer info. But, that's only run after recvkey is running, and recvkey does lock the transfer lock, so that seems more or less ok. (Other than being a super complicated legacy mess that the P2P code has mostly obsoleted now.) This commit was supported by the NSF-funded DataLad project.	2018-03-14 18:55:34 -04:00
Joey Hess	24df95f0f6	Fix several places where files in .git/annex/ were written with modes that did not take the core.sharedRepository config into account. git grep writeFile finds some more that might also be problems, but for now I've concentrated on .git/annex/ log files. There are certianly cases where writeFile is not a problem too. This commit was sponsored by mo on Patreon.	2018-01-02 17:25:25 -04:00
Joey Hess	e1ac299ad0	better dup key with -J fix This avoids all the complication about redundant work discussed in the previous try at fixing this. At the expense of needing each command that could have the problem to be patched to simply wrap the action in onlyActionOn once the key is known. But there do not seem to be many such commands. onlyActionOn' should not be used with a CommandStart (or CommandPerform), although the types do allow it. onlyActionOn handles running the whole CommandStart chain. I couldn't immediately see a way to avoid mistken use of onlyActionOn'. This commit was supported by the NSF-funded DataLad project.	2017-10-17 18:48:53 -04:00
Joey Hess	68a49adcda	Improve behavior when -J transfers multiple files that point to the same key After a false start, I found a fairly non-intrusive way to deal with it. Although it only handles transfers -- there may be issues with eg concurrent dropping of the same key, or other operations. There is no added overhead when -J is not used, other than an added inAnnex check. When -J is used, it has to maintain and check a small Set, which should be negligible overhead. It could output some message saying that the transfer is being done by another thread. Or it could even display the same progress info for both files that are being downloaded since they have the same content. But I opted to keep it simple, since this is rather an edge case, so it just doesn't say anything about the transfer of the file until the other thread finishes. Since the deferred transfer action still runs, actions that do more than transfer content will still get a chance to do their other work. (An example of something that needs to do such other work is P2P.Annex, where the download always needs to receive the content from the peer.) And, if the first thread fails to complete a transfer, the second thread can resume it. But, this unfortunately means that there's a risk of redundant work being done to transfer a key that just got transferred. That's not ideal, but should never cause breakage; the same thing can occur when running two separate git-annex processes. The get/move/copy/mirror --from commands had extra inAnnex checks added, inside the download actions. Without those checks, the first thread downloaded the content, and then the second thread woke up and downloaded the same content redundantly. move/copy/mirror --to is left doing redundant uploads for now. It would need a second checkPresent of the remote inside the upload to avoid them, which would be expensive. A better way to avoid redundant work needs to be found.. This commit was supported by the NSF-funded DataLad project.	2017-10-17 17:10:50 -04:00
Joey Hess	099bfc2daf	remove redundant definition of lck	2017-10-03 13:23:10 -04:00
Joey Hess	3cd47f9978	info: Improve cleanup of stale transfer info files. In my git-annex repos, I found some stale transfer info files without lock files. Pass a mode to tryLockExclusive, so it will create the lock file if not present, and so not fail to clean up such transfer info files. Normally, transfer info files are accompanied by a lock file. But, when alwaysRunTransfer is used, the locking can fail and it will still write the transfer info file. Perhaps there are other cases too? Note that mkProgressUpdater's meter writes to the transfer info file too, and it might be possible for that meter to fire after runTransfer has cleaned up. This commit was sponsored by andrea rota.	2017-10-02 13:55:26 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	27eca014be	fix up Read instance incompatability caused by recent commit `9c4650358c` changed the Read instance for Key. I've checked all uses of that instance (by removing it and seeing what breaks), and they're all limited to the webapp, except one. That is GitAnnexDistribution's Read instance. So, `9c4650358c` would have broken upgrades of git-annex from downloads.kitenet.net. Once the .info files there got updated for a new release, old releases would have failed to parse them and never upgraded. To fix this, I found a way to make the .info files that contain GitAnnexDistribution values be readable by the old version of git-annex. This commit was sponsored by Ewen McNeill.	2017-02-24 18:59:12 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00
Joey Hess	c9082cf0e4	move Arbitrary instance to new Types.Transfer module Avoid orphan instance warning	2016-09-05 14:52:06 -04:00
Joey Hess	d17f08afdc	avoid warning about orphan Arbirary instance	2016-09-05 14:51:07 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	aaf1ef268d	convert from Utility.LockPool to Annex.LockPool everywhere	2015-11-12 18:13:37 -04:00
Joey Hess	87f28bb2ea	ignore failure to clean up stale transfer lock file Perhaps due to permissions problem, or perhaps a race with another process also cleaning up.	2015-05-19 23:46:42 -04:00
Joey Hess	9de5cd2966	fix crash in stale transfer lockfile cleanup code Need to differentiate between the lockfile not being locked, and it not existing.	2015-05-19 23:35:24 -04:00
Joey Hess	ecb0d5c087	use lock pools throughout git-annex The one exception is in Utility.Daemon. As long as a process only daemonizes once, which seems reasonable, and as long as it avoids calling checkDaemon once it's already running as a daemon, the fcntl locking gotchas won't be a problem there. Annex.LockFile has it's own separate lock pool layer, which has been renamed to LockCache. This is a persistent cache of locks that persist until closed. This is not quite done; lockContent stil needs to be converted.	2015-05-19 14:09:52 -04:00
Joey Hess	6915b71c57	lock pools to work around non-concurrency/composition safety of POSIX fcntl	2015-05-18 15:57:17 -04:00
Joey Hess	7ebf234616	Stale transfer lock and info files will be cleaned up automatically when get/unused/info commands are run. Deleting lock files is tricky, tricky stuff. I think I got it right!	2015-05-12 20:11:23 -04:00
Joey Hess	643b233860	an optimization that also fixes a reversion This is a little optimisation; avoid loading the info file for the download of the current key when checking for other downloads. The reversion it fixes is sorta strange. `a812d598ef` broke checking for transfers that were already in progress. Indeed, the transfer lock was not held after getTransfers was called. Why? I think it's magic in ghc's handling of getLock and setLock, although it's hard to tell since those functions are almost entirely undocumented as to their semantics. Something, either the RTS (or maybe it's linux?) notices that the same process has taken a lock and is now calling getLock on a FD attached to the same file. So, it drops the lock. So, this optimisation avoids that problematic behavior.	2015-05-12 18:34:49 -04:00
Joey Hess	a812d598ef	Take space that will be used by running downloads into account when checking annex.diskreserve.	2015-05-12 15:20:22 -04:00
Joey Hess	6cf62a9bde	support time-1.5.0 This no longer uses old-locale's defaultTimeLocale, but provides one of its own. Factored out a Logs.TimeStamp.	2015-05-10 15:21:35 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	59eae904b1	final scary locking refactoring (for now) Note that while before checkTransfer this called getLock with WriteLock, getLockStatus's use of ReadLock will also notice any exclusive locks. Since transfer info files are only locked exclusively, never shared, there is no behavior change. Also, fixes checkLocked to actually return Just False when the file exists, but is not locked.	2014-08-20 19:30:40 -04:00
Joey Hess	ec7dd0446a	more lock file refactoring	2014-08-20 17:03:04 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctemp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	c390e896d1	fix windows build (and make --stop work on windows, incidentially) The Utility.PID will clean up other code soon.	2014-02-11 15:25:59 -04:00
Joey Hess	1572c460e8	avoid using openFile when withFile can be used Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.	2014-02-03 10:19:06 -04:00
Joey Hess	eefda291c6	fix warning	2014-01-28 14:43:20 -04:00
Joey Hess	891c85cd88	use locking on Windows This is all the easy cases, where there was already a separate lock file.	2014-01-28 14:42:03 -04:00
Joey Hess	f931272681	syntax	2013-12-11 00:18:58 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	ecd42aef8e	different PID types for Unix and Windows Windows has a larger (unsigned) PID space, so cannot use the unix CInt there. Note that TransferInfo does not yet ever get the TransferPid populated, as there is missing locking.	2013-12-10 23:48:42 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	4dc4a9a385	assistant: Clear the list of failed transfers when doing a full transfer scan. This prevents repeated retries to download files that are not available, or are not referenced by the current git tree. This is motivated by a user report that the assistant was repeatedly retrying transfers of files that had been deleted (in direct mode, so removing the only copy). Note that the glacier code retries failed transfers after a while to retry downloads that have aged long enough to be available. This is ok; if we're doing a full transfer scan we'll retry on every file that is still in the git tree. Also note that this makes the assistant less likely to get every file referenced by old revs of the git tree. Not something the assistant tries to ensure anyway, so I feel this is acceptable.	2013-09-25 11:46:17 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	ec8cf85fcc	display "transfer already in progress" as a note	2013-07-17 16:16:17 -04:00

1 2 3

116 commits