git-annex

Author	SHA1	Message	Date
Joey Hess	cbd138e042	factor out Utility.Aeson.textKey	2022-03-02 18:24:06 -04:00
sternenseemann	ca596e7c54	allow building with aeson >= 2.0 In aeson 2.0, Text has been replaced by the Key type and HashMap by the KeyMap interface. Accomodating this required adding some CPP in order to still be able to compile with aeson < 2.0. The required changes were: * Prevent Key from being re-exported by Utilities.Aeson, as it clashes with git-annex's own Key type. * Fix up convertion from String/Text to Key (or Text in aeson 1.) in a couple of places Import Data.Aeson.KeyMap instead of Data.HashMap.Strict, as they are mostly API-compatible. insertWith needs to be replaced by unionWith, however, as KeyMap lacks the former function.	2022-03-02 18:01:41 -04:00
Joey Hess	438e5b56aa	tighter --json parsing for metadata metadata --batch --json: Reject input whose "fields" does not consist of arrays of strings. Such invalid input used to be silently ignored. Used to be that parseJSON for a JSONActionItem ran parseJSON separately for the itemAdded, and if that failed, did not propagate the error. That allowed different items with differently named fields to be parsed. But it was actually only used to parse "fields" for metadata, so that flexability is not needed. The fix is just to parse "fields" as-is. AddJSONActionItemFields is needed only because of the wonky way Command.MetaData adds onto the started json object. Note that this line got a dummy type signature added, just because the type checker needs it to be some type. itemFields = Nothing :: Maybe Bool Since it's Nothing, it doesn't really matter what type it is, and the value gets turned into json and is then thrown away. Sponsored-by: Kevin Mueller on Patreon	2021-11-01 14:42:37 -04:00
Joey Hess	18e00500ce	bwlimit Added annex.bwlimit and remote.name.annex-bwlimit config that works for git remotes and many but not all special remotes. This nearly works, at least for a git remote on the same disk. With it set to 100kb/1s, the meter displays an actual bandwidth of 128 kb/s, with occasional spikes to 160 kb/s. So it needs to delay just a bit longer... I'm unsure why. However, at the beginning a lot of data flows before it determines the right bandwidth limit. A granularity of less than 1s would probably improve that. And, I don't know yet if it makes sense to have it be 100ks/1s rather than 100kb/s. Is there a situation where the user would want a larger granularity? Does granulatity need to be configurable at all? I only used that format for the config really in order to reuse an existing parser. This can't support for external special remotes, or for ones that themselves shell out to an external command. (Well, it could, but it would involve pausing and resuming the child process tree, which seems very hard to implement and very strange besides.) There could also be some built-in special remotes that it still doesn't work for, due to them not having a progress meter whose displays blocks the bandwidth using thread. But I don't think there are actually any that run a separate thread for downloads than the thread that displays the progress meter. Sponsored-by: Graham Spencer on Patreon	2021-09-21 16:58:10 -04:00
Joey Hess	51c696679f	avoid using temp file size when deciding whether to retry failed transfer When stall detection is enabled, and a transfer is in progress, it would display a doubled message: (transfer already in progress, or unable to take transfer lock) (transfer already in progress, or unable to take transfer lock) That happened because the forward retry decider had a start size of 0, and an end size of whatever amount of the object the other process had downloaded. So it incorrectly thought that the transferrer process had made progress, when it had in fact immediately given up with that message. Instead, use the reported value from the progress meter. If a remote does not report progress, this will mean it doesn't forward retry, in a situation where it used to. But most remotes do report progress, and any remote that does not can be fixed to, by using watchFileSize when downloading. Also, some remotes might preallocate the temp file (eg bittorrent), so relying on statting its size at this level to get progress is dubious. The same change was made to Annex/Transfer.hs, although only Annex/TransferrerPool.hs needed to be changed to avoid the duplicate message. (An alternate fix would have been to start the retry decider with the size of the object file before downloading begins, rather than 0.) Sponsored-by: Brett Eisenberg on Patreon	2021-06-25 12:04:23 -04:00
Joey Hess	7c7225e257	clear progress bar before displaying messages In particular this clears it before "transfer stalled". If an external special remote uses showInfo while progress is displayed, it will also improve display of that. Generally this will avoid all such problems in the future.. Sponsored-by: Svenne Krap on Patreon	2021-06-15 20:51:40 -04:00
Joey Hess	7b6deb1109	display scanning message whenever reconcileStaged has enough files to chew on Clear visible progress bar first. Removed showSideActionAfter because it can't be used in reconcileStaged (import loop). Instead, it counts the number of files it processes and displays it after it's seen a sufficient to know it's taking a while. Sponsored-by: Dartmouth College's Datalad project	2021-06-08 12:48:30 -04:00
Joey Hess	0c46ee5ce0	simplify transferr protocol	2020-12-11 12:52:22 -04:00
Joey Hess	095cdc7e83	extend transferrer protocol to send progress bar total size updates New protocol is not back-compat with old one, but it's never been released so that's ok.	2020-12-11 12:42:28 -04:00
Joey Hess	94b323a8e8	use TotalSize more extensively	2020-12-11 12:10:43 -04:00
Joey Hess	41f2c308ff	stall detection is working New config annex.stalldetection, remote.name.annex-stalldetection, which can be used to deal with remotes that stall during transfers, or are sometimes too slow to want to use. This commit was sponsored by Luke Shumaker on Patreon.	2020-12-08 15:22:18 -04:00
Joey Hess	438d5be1f7	support prompt in message serialization That seems to be the last thing needed for message serialization. Although it's only used in the assistant currently, so hard to tell if I forgot something. At this point, it should be possible to start using transferkeys when performing transfers, which will allow killing a transferkeys process if a transfer times out or stalls. But that's for another day. This commit was sponsored by Ethan Aubin.	2020-12-04 14:54:09 -04:00
Joey Hess	31e417f351	finish message serialization of progress meters Any given transfer can only display 1 progress meter at a time, or so this code assumes. In some cases, there are progress meters for different stages of a transfer, perhaps, and that is supported by this. This commit was sponsored by Ethan Aubin.	2020-12-04 13:50:46 -04:00
Joey Hess	4efecaebd6	generalize to allow running in Assistant monad	2020-12-04 13:07:30 -04:00
Joey Hess	cad147cbbf	new protocol for transferkeys, with message serialization Necessarily threw out the old protocol, so if an old git-annex assistant is running, and starts a transferkeys from the new git-annex, it would fail. But, that seems unlikely; the assistant starts up transferkeys processes and then keeps them running. Still, may need to test that scenario. The new protocol is simple read/show and looks like this: TransferRequest Download (Right "origin") (Key {keyName = "f8f8766a836fb6120abf4d5328ce8761404e437529e997aaa0363bdd4fecd7bb", keyVariety = SHA2Key (HashSize 256) (HasExt True), keySize = Just 30, keyMtime = Nothing, keyChunkSize = Nothing, keyChunkNum = Nothing}) (AssociatedFile (Just "foo")) TransferOutput (ProgressMeter (Just 30) (MeterState {meterBytesProcessed = BytesProcessed 0, meterTimeStamp = 1.6070268727892535e9}) (MeterState {meterBytesProcessed = BytesProcessed 30, meterTimeStamp = 1.6070268728043e9})) TransferOutput (OutputMessage "(checksum...) ") TransferResult True Granted, this is not optimally fast, but it seems good enough, and is probably nearly as fast as the old protocol anyhow. emitSerializedOutput for ProgressMeter is not yet implemented. It needs to somehow start or update a progress meter. There may need to be a new message that allocates a progress meter, and then have ProgressMeter update it. This commit was sponsored by Ethan Aubin	2020-12-03 16:21:20 -04:00
Joey Hess	e7f42e2ec7	when serializing messages, include json objects This is done always, it's up to the comsumer to decide if it wants to output the json objects or the messages. Messages.JSON.finalize changed to not need a JSONOptions. As far as I can see, this does not change its behavior, since addErrorMessage appends to any list that's already there. This commit was sponsored by Ethan Aubin.	2020-12-03 14:47:04 -04:00
Joey Hess	5a41e46bd4	start on serializing Messages Json objects not yet handled, and some other special cases, but this is the bulk of the messages. For progress meters, POSIXTime does not have a Read instance (or a suitable Show instance), so had to switch to using a Double for progress meters. This commit was sponsored by Ethan Aubin on Patreon.	2020-12-03 13:03:03 -04:00
Joey Hess	aafae46bcb	WIP for https://git-annex.branchable.com/bugs/Buggy_external_special_remote_stalls_after_7245a9e/	2020-11-17 17:31:08 -04:00
Joey Hess	9b0dde834e	convert getFileSize to RawFilePath Lots of nice wins from this in avoiding unncessary work, and I think nothing got slower. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-11-05 11:32:57 -04:00
Joey Hess	fcf5d11c63	add "input" field to json output The use case of this field is mostly to support -J combined with --json. When that is implemented, a user will be able to look at the field to determine which of the requests they have sent it corresponds to. The field typically has a single value in its list, but in some cases mutliple values (eg 2 command-line params) are combined together and the list will have more. Note that json parsing was already non-strict, so old git-annex metadata --json --batch can be fed json produced by the new git-annex and will not stumble over the new field.	2020-09-15 16:22:44 -04:00
Joey Hess	f85ca7dc80	fix all remaining -Wincomplete-uni-patterns warnings A couple of these were probably actual bugs in edge cases. Most of the changes I'm fine with. The fact that aeson's object returns sometihng that we know will be an Object, but the type checker does not know is kind of annoying.	2020-04-15 13:55:08 -04:00
Joey Hess	6f90bb7738	handle git-credential prompt in -J mode If git-credential has it cached and does not prompt, this will unfortunately result in a brief flicker, as the displayed console regions are hidden while running it and then re-displayed. Better than a corrupted display. Actually, I tried it and don't see a visible flicker, so probably only over a slow ssh will it be apparent.	2020-01-22 16:42:15 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agoncy of making it build), but still very unmergable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certianly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of serveral haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would make backporting new versions of git-annex to Debian 9 (stretch) which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	42c386fc47	add: Display progress meter when hashing files. * add: Display progress meter when hashing files. * add: Support --json-progress option.	2019-06-25 13:12:47 -04:00
Joey Hess	26c54d6ea3	make metered more generic Allow it to be used when the Key is not known.	2019-06-25 12:33:36 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	19372e47ea	Fix build without concurrent-output. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-12-03 12:33:00 -04:00
Joey Hess	872af2b2f1	avoid using concurrent-output at all when --quiet or --json Of course, it wasn't used much in those modes, because normal output is avoided. But it was still initialized and used in a few places, including a call to hideRegionsWhile.	2018-11-15 14:26:40 -04:00
Joey Hess	38d691a10f	removed the old Android app Running git-annex linux builds in termux seems to work well enough that the only reason to keep the Android app would be to support Android 4-5, which the old Android app supported, and which I don't know if the termux method works on (although I see no reason why it would not). According to [1], Android 4-5 remains on around 29% of devices, down from 51% one year ago. [1] https://www.statista.com/statistics/271774/share-of-android-platforms-on-mobile-devices-with-android-os/ This is a rather large commit, but mostly very straightfoward removal of android ifdefs and patches and associated cruft. Also, removed support for building with very old ghc < 8.0.1, and with yesod < 1.4.3, and without concurrent-output, which were only being used by the cross build. Some documentation specific to the Android app (screenshots etc) needs to be updated still. This commit was sponsored by Brett Eisenberg on Patreon.	2018-10-13 01:41:11 -04:00
Joey Hess	89e1a05a8f	Fix mangling of --json output of utf-8 characters when not running in a utf-8 locale As long as all code imports Utility.Aeson rather than Data.Aeson, and no Strings that may contain utf-8 characters are used for eg, object keys via T.pack, this is guaranteed to fix the problem everywhere that git-annex generates json. It's kind of annoying to need to wrap ToJSON with a ToJSON', especially since every data type that has a ToJSON instance has to be ported over. However, that only took 50 lines of code, which is worth it to ensure full coverage. I initially tried an alternative approach of a newtype FileEncoded, which had to be used everywhere a String was fed into aeson, and chasing down all the sites would have been far too hard. Did consider creating an intentionally overlapping instance ToJSON String, and letting ghc fail to build anything that passed in a String, but am not sure that wouldn't pollute some library that git-annex depends on that happens to use ToJSON String internally. This commit was supported by the NSF-funded DataLad project.	2018-04-16 16:21:21 -04:00
Joey Hess	bd45129c27	always poll file This is now only used when downloading an url, and polling is always needed when using curl, no matter how the output is configured.	2018-04-06 23:09:19 -04:00
Joey Hess	d8702c5259	dial down update frequency for progress display to 0.2s 0.1s seemed a bit too jumpy. But kept 0.1s for --json-progress in case some consumer depends on it. BTW, the way rsync handles it is one progress update after every chunk the rolling checksum algo identifies. So it's not a fixed delay. Weird.	2018-03-13 18:32:18 -04:00
Joey Hess	e16b069331	use total size from DATA Noticed that getting a key whose size is not known resulted in a progress display that didn't include the percent complete. Fixed for P2P by making the size sent with DATA be used to update the meter's total size. In order for rateLimitMeterUpdate to also learn the total size, had to make it be passed the Meter, and some other reorg in Utility.Metered was also done so that --json-progress can construct a Meter to pass to rateLimitMeterUpdate. When the fallback rsync is done, the progress display still doesn't include the percent complete. Only way to fix that seems to be to let rsync display its output again, but that would conflict with git-annex's own progress meter, which is also being displayed. This commit was sponsored by Henrik Riomar on Patreon.	2018-03-12 21:46:58 -04:00
Joey Hess	cb05ef06bf	fix lost metering for fallback rsyncs `08814327ff` accidentially got rid of it, when it removed commandMetered.	2018-03-12 18:22:48 -04:00
Joey Hess	08814327ff	use P2P protocol for checkpresent, retrieve, and store Note that, due to not using rsync to transfer files to ssh remotes any longer, permissions and other file metadata of annexed files will no longer be preserved when copying them to ssh remotes. Other remotes never supported preserving that information, so this is not considered a regression. Added NEWS item about this. Another significant side effect of this is that, even when rsync is run to retrieve a file, its progress display will no longer be shown, and instead the native git-annex progress display will appear. It would be possible to use the rsync process display when rsync is used (old git-annex-shell and also retrieval from a local repository), but it would have complicated the code unncessarily, and been inconsistent behavior. (I'd been thinking for a while about eliminating the rsync progress display, since it's got some annoying verbosities, including display of the key and the "(xfr#1, to-chk=0/1)" bit and was already somewhat inconsistent.) retrieveKeyFileCheap still uses rsync, since that ensures that it gets the actual file content from the remote. Using the P2P protocol would use the local content, as long as the local and remote size are the same. This commit was sponsored by John Pellman on Patreon.	2018-03-09 13:25:16 -04:00
Joey Hess	ff134618f0	split msg into lines msg is what would be output to stderr, so has some layout and formatting, which is perhaps not ideal, but let's at least avoid it containing line breaks.	2018-02-19 15:55:00 -04:00
Joey Hess	0292af8520	fix build	2018-02-19 15:39:52 -04:00
Joey Hess	39b59c341f	send stderr to json when --json-error-messages enabled	2018-02-19 15:28:38 -04:00
Joey Hess	63ff670cc5	always include error-messages field when --json-error-messages Always include error-messages field, even if empty, to make the json be self-documenting. This was a design requirement for --json-error-messages. This commit was supported by the NSF-funded DataLad project.	2018-02-19 15:07:00 -04:00
Joey Hess	fa65f1d240	fix --json-progress --json to be same as --json --json-progress Fix behavior of --json-progress followed by --json, in which the latter option disabled the former. This commit was supported by the NSF-funded DataLad project.	2018-02-19 14:12:15 -04:00
Joey Hess	7e454ee341	--json: multi-line notes --json: When there are multiple lines of notes about a file, make the note field multiline, rather than the old behavior of only including the last line. Using newlines in the note is perhaps not ideal, but upgrading it to an array in this case would be an annoying inconsistency to need to deal with. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-02-16 13:27:17 -04:00
Joey Hess	7d9f0e0fbe	Added INFO to external special remote protocol. It's left up to the special remote to detect when git-annex is new enough to support the message; an old git-annex will blow up. This commit was supported by the NSF-funded DataLad project.	2018-02-06 13:03:55 -04:00
Joey Hess	f5edb16729	Display progress meter when uploading a key without size information Getting the size by statting the content file. This commit was supported by the NSF-funded DataLad project.	2017-11-14 16:40:49 -04:00
Joey Hess	174ce991f9	typo	2017-05-19 10:57:59 -04:00
Joey Hess	37060c326a	fix build failure w/o concurrent-output Also removed warning on old version, since the autobuilder highlights warnings as errors and this is too minor for that.	2017-05-19 10:25:40 -04:00
Joey Hess	3ec0be7e1a	avoid warning when the concurrent-output flag is disabled	2017-05-16 21:06:34 -04:00
Joey Hess	1d45e47e3f	clear regions before ssh prompt When built with concurrent-output 1.9, ssh password prompts will no longer interfere with the -J display. To avoid flicker, only done when ssh actually does need to prompt; ssh is first run in batch mode and if that succeeds the connection is up and no need to clear regions. This commit was supported by the NSF-funded DataLad project.	2017-05-16 15:50:11 -04:00
Joey Hess	a1730cd6af	adeiu, MissingH Removed dependency on MissingH, instead depending on the split library. After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing. Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size. This commit was sponsored by Shane-o on Patreon.	2017-05-16 01:03:52 -04:00

1 2 3

110 commits