Benchmarking this with 1000 small files being copied, the time reduced from
15.98s to 14.64s -- an 8% improvement in the non-data-transfer overhead of
git-annex copy.
I think both of these are all that's affected, but I went ahead and fixed
all the remotes that set their config to M.empty to instead store the
actual config. Who knows what will expect it to be actually present in
future, the Remote instance of getGpgEncParams came to..
Currently only implemented for local git remotes. May try to add support
to git-annex-shell for ssh remotes later. Could concevably also be
supported by some special remote, although that seems unlikely.
Cronner user this when available, and when not falls back to
fsck --fast --from remote
git annex fsck --from does not itself use this interface.
To do so, I would need to pass --fast and all other options that influence
fsck on to the git annex fsck that it runs inside the remote. And that
seems like a lot of work for a result that would be no better than
cd remote; git annex fsck
This may need to be revisited if git-annex-shell gets support, since it
may be the case that the user cannot ssh to the server to run git-annex
fsck there, but can run git-annex-shell there.
This commit was sponsored by Damien Diederen.
addurl: Improve message when adding url with wrong size to existing file.
Before the message suggested the url didn't exist.
Fixed handling of URL keys that have no recorded size. Before, if the key
has no size, the url also had to not declare any size, which was unlikely
and wrong, or it was taken to not exist. This probably would mostly affect
keys that were added to the annex with addurl --relaxed.
recvkey was told it was receiving a HMAC key from a direct mode repo,
and that confused it into rejecting the transfer, since it has no way to
verify a key using that backend, since there is no HMAC backend.
I considered making recvkey skip verification in the case of an unknown
backend. However, that could lead to bad results; a key can legitimately be
in the annex with a backend that the remote git-annex-shell doesn't know
about. Better to keep it rejecting if it cannot verify.
Instead, made the gcrypt special remote not set the direct mode flag when
sending (and receiving) files.
Also, added some recvkey messages when its checks fail, since otherwise
all that is shown is a confusing error message from rsync when the remote
git-annex-shell exits nonzero.
Overridable with --user-agent option.
Not yet done for S3 or WebDAV due to limitations of libraries used --
nether allows a user-agent header to be specified.
This commit sponsored by Michael Zehrer.
Now can tell if a repo uses gcrypt or not, and whether it's decryptable
with the current gpg keys.
This closes the hole that undecryptable gcrypt repos could have before been
combined into the repo in encrypted mode.
To support this, a core.gcrypt-id is stored by git-annex inside the git
config of a local gcrypt repository, when setting it up.
That is compared with the remote's cached gcrypt-id. When different, a
drive has been changed. git-annex then looks up the remote config for
the uuid mapped from the core.gcrypt-id, and tweaks the configuration
appropriately. When there is no known config for the uuid, it will refuse to
use the remote.
This is a git-remote-gcrypt encrypted special remote. Only sending files
in to the remote works, and only for local repositories.
Most of the work so far has involved making initremote work. A particular
problem is that remote setup in this case needs to generate its own uuid,
derivied from the gcrypt-id. That required some larger changes in the code
to support.
For ssh remotes, this will probably just reuse Remote.Rsync's code, so
should be easy enough. And for downloading from a web remote, I will need
to factor out the part of Remote.Git that does that.
One particular thing that will need work is supporting hot-swapping a local
gcrypt remote. I think it needs to store the gcrypt-id in the git config of the
local remote, so that it can check it every time, and compare with the
cached annex-uuid for the remote. If there is a mismatch, it can change
both the cached annex-uuid and the gcrypt-id. That should work, and I laid
some groundwork for it by already reading the remote's config when it's
local. (Also needed for other reasons.)
This commit was sponsored by Daniel Callahan.
I thought at first this was a Windows specific problem, but it's not;
this affects checking any non-bare repository exported via http. Which is
a potentially important use case!
The actual bug was the case where Right False was returned by the first url
short-curcuited later checks. But the whole method used felt like code
I'd no longer write, and the use of undefined was particularly disgusting.
So I rewrote it.
Also added an action display.
This commit was sponsored by Eric Hanchrow. Thanks!
annexLocations uses OS-native directory separators, but for an url,
it needs to use / even on Windows.
This is an ugly workaround. Could parameterize a lot of stuff in
annexLocations to fix it better. I suspect this is probably the only place
it's needed though.
Made fromDirect check that a file in the tree has good content (and is not
a broken symlink either) before copying it to another file that has the
same key.
Made replaceFile clean up the temp file if the action that creates it, or
the file replacement action fails.
Most remotes have meters in their implementations of retrieveKeyFile
already. Simply hooking these up to the transfer log makes that information
available. Easy peasy.
This is particularly valuable information for encrypted remotes, which
otherwise bypass the assistant's polling of temp files, and so don't have
good progress bars yet.
Still some work to do here (see progressbars.mdwn changes), but this
is entirely an improvement from the lack of progress bars for encrypted
downloads.
* since this is a crippled filesystem anyway, git-annex doesn't use
symlinks on it
* so there's no reason to use the mixed case hash directories that we're
stuck using to avoid breaking everyone's symlinks to the content
* so we can do what is already done for all bare repos, and make non-bare
repos on crippled filesystems use the all-lower case hash directories
* which are, happily, all 3 letters long, so they cannot conflict with
mixed case hash directories
* so I was able to 100% fix this and even resuming `git annex add` in the
test case will recover and it will all just work.
There was confusion in different parts of the progress bar code about
whether an update contained the total number of bytes transferred, or the
number of bytes transferred since the last update. One way this bug
showed up was progress bars that seemed to stick at zero for a long time.
In order to fix it comprehensively, I add a new BytesProcessed data type,
that is explicitly a total quantity of bytes, not a delta.
Note that this doesn't necessarily fix every problem with progress bars.
Particularly, buffering can now cause progress bars to seem to run ahead
of transfers, reaching 100% when data is still being uploaded.