This method avoids breaking test_readonly. Just check if the dest file
exists, and avoid CoW probing when it does, so when CoW probing fails,
it can resume where the previous non-CoW copy left off.
If CoW has been probed already to work, delete the dest file
since a CoW copy will presumably work. It seems like it would be almost
as good to just skip CoW copying in this case too, but consider that the
dest file might have started to be copied from some other remote, not
using CoW, but CoW has been probed to work to copy from the current
place.
Sponsored-by: Dartmouth College's Datalad project
commit 63d508e885 broke test_readonly.
When a local git remote is readonly, tryCopyCoW run to copy a file
from it failed at withOtherTmp.
Sponsored-by: Dartmouth College's Datalad project
onAddUnlocked suggested that it had to add the file locked, but it
doesn't, it only queues the add of the file, and it's up to the
committer what to do -- which is currently indeed to always add the file
locked.
Disabling git-annex branch update for this command is
ok, because it does not use any information from the branch,
but only logs the location when it adds a key.
Sponsored-by: Dartmouth College's Datalad project
This avoids starting one process when only the other one is needed.
Eg in git-annex smudge --clean, this reduces the total number of
cat-file processes that are started from 4 to 2.
The only performance penalty is that when both are needed, it has to do
twice as much work to maintain the two Maps. But both are very small,
consisting of 1 or 2 items, so that work is negligible.
Sponsored-by: Dartmouth College's Datalad project
I first saw this getting with -J2 over ssh, but later saw it also
without the -J2. It was resuming, and the calulated unboundDelay was
many minutes. The first update of the meter jumped to some large value,
because of the resuming, and so it thought the BW was super fast.
Avoid by waiting until the second meter update.
Might be a good idea to also guard for the delay being many seconds
and avoid waiting. But how many? If BW is legitimately super fast, and a
remote happens to read more than a 32kb or so chunk at a time, it could
in theory download megabytes or gigabytes of data before the first meter
update. It would actually be appropriate then to delay for a long time,
if the desired BW was low. Could make up some numbers that are sane now,
but tech may improve.
(BTW, pleased to see bwlimit does work with -J. I had worried that
it might not, if the meter update happened in a different thread than
the downloading, but it's done in the same thread.)
Sponsored-by: Brett Eisenberg on Patreon
New method is much better. Avoids unrestrained transfer at the beginning
(except for the first block. Keeps right at or a few kb/s below the
configured limit, with very little varation in the actual reported bandwidth.
Removed the /s part of the config as it's not needed.
Ready to merge.
Sponsored-by: Luke Shumaker on Patreon
RemoteGitConfig parsing looks for annex.bwlimit when a remote
does not have a per-remote config for it, so no need for a separate
gobal config.
Sponsored-by: Svenne Krap on Patreon
RemoteGitConfig parsing looks for annex.stalldetection when a remote
does not have a per-remote config for it, so no need for a separate
gobal config.
Sponsored-by: Noam Kremen on Patreon
Bug fix: Git configs such as annex.verify were incorrectly overriding
per-remote git configs such as remote.name.annex-verify.
This dates all the way back to 2013,
commit 8a5b397ac4,
where hlint apparently somehow confused me into parsing in the wrong
order. Before that it was correct.
Amazing noone has noticed until now.
Sponsored-by: Kevin Mueller on Patreon
Probably this fixes a reversion, but I don't know what version broke it.
This does use withOtherTmp for a temp file that could be quite large.
Though albeit a reflink copy that will not actually take up any space
as long as the file it was copied from still exists. So if the copy cow
succeeds but git-annex is interrupted just before that temp file gets
renamed into the usual .git/annex/tmp/ location, there is a risk that
the other temp directory ends up cluttered with a larger temp file than
later. It will eventually be cleaned up, and the changes of this being
a problem are small, so this seems like an acceptable thing to do.
Sponsored-by: Shae Erisson on Patreon
Added annex.bwlimit and remote.name.annex-bwlimit config that works for git
remotes and many but not all special remotes.
This nearly works, at least for a git remote on the same disk. With it set
to 100kb/1s, the meter displays an actual bandwidth of 128 kb/s, with
occasional spikes to 160 kb/s. So it needs to delay just a bit longer...
I'm unsure why.
However, at the beginning a lot of data flows before it determines the
right bandwidth limit. A granularity of less than 1s would probably improve
that.
And, I don't know yet if it makes sense to have it be 100ks/1s rather than
100kb/s. Is there a situation where the user would want a larger
granularity? Does granulatity need to be configurable at all? I only used that
format for the config really in order to reuse an existing parser.
This can't support for external special remotes, or for ones that
themselves shell out to an external command. (Well, it could, but it
would involve pausing and resuming the child process tree, which seems
very hard to implement and very strange besides.) There could also be some
built-in special remotes that it still doesn't work for, due to them not
having a progress meter whose displays blocks the bandwidth using thread.
But I don't think there are actually any that run a separate thread for
downloads than the thread that displays the progress meter.
Sponsored-by: Graham Spencer on Patreon