Only fsck and reinject and the test suite used the Backend, and they can
look it up as needed from the Key. This simplifies the code and also speeds
it up.
There is a small behavior change here. Before, all commands would warn when
acting on an annexed file with an unknown backend. Now, only fsck and
reinject show that warning.
For sync, saves 1 ssh connection per remote. For remotedaemon, the same
ssh connection that is already open to run git-annex-shell notifychanges
is reused to pull from the remote.
Only potential problem is that this also enables connection caching
when the assistant syncs with a ssh remote. Including the sync it does
when a network connection has just come up. In that case, cached ssh
connections are likely to be stale, and so using them would hang.
Until I'm sure such problems have been dealt with, this commit needs to
stay on the remotecontrol branch, and not be merged to master.
This commit was sponsored by Alexandre Dupas.
Motivation: Hook scripts for nautilus or other file managers
need to provide the user with feedback that a file is being downloaded.
This commit was sponsored by THM Schoemaker.
Note that this is a nearly entirely free feature. The data was already
stored in the metadata log in an easily accessible way, and already was
parsed to a time when parsing the log. The generation of the metadata
fields may even be done lazily, although probably not entirely (the map
has to be evaulated to when queried).
For example "standard or (include=otherdir/*)" or even "not standard"
Note that the implementation avoids any potential for loops (if a
standard preferred content expression itself mentioned standard).
This commit was sponsored by Jochen Bartl.
Old ssh did not check the hostname passed to -O stop, so I had used "any".
But now ssh does check it! I think this happened as part of the client-side
hostname canonicalization changes in 6.5p1, but have not verified that
introduced the problem.
The symptom was that it would try to dns lookup "any", which often caused a
bit of a delay at shutdown. And the old ssh connection kept running, so
it would do it over and over again.
Fixed by using localhost, which hopefully reliably resolves to some address
that ssh will accept.. Also nukeFile the socket after ssh has been asked to
shutdown, just in case.
unused: In direct mode, files that are deleted from the work tree are no longer incorrectly detected as unused.
Direct mode `git annex info` slows down a bit due to more stringent
checking, but not by a lot.
This is a new feature, it was not handled before, since it's a bit of an
edge case. However, it can be handled exactly the same as a file/dir
conflict, just leave the non-annexed item alone.
While implementing this, the core resolveMerge' function got a lot simpler
and clearer. Note especially that where before there was an asymetric call to
stagefromdirectmergedir, now graftin is called symmetrically in both cases.
And, in order to add that `graftin us`, the current branch needed to be
known (if there is no current branch, there cannot be a merge conflict).
This led to some cleanups of how autoMergeFrom behaved when there is no
current branch.
This commit was sponsored by Philippe Gauthier.
Added test cases for both ways this can happen, with a conflict involving a
file, or a directory.
Cleaned up resolveMerge to not touch the work tree in direct mode, which
turned out to be the only way to handle things.. And makes it much nicer.
Still need to run test suite on windows.
In the case of a conflicted merge where the remote adds a directory, and we
have a file (which is checked in), resolveMerge' will create the link,
and so the fix for 1192d98721 looked at that,
thought it was an unannexed file (it's not in the oldref), and preserved
it.
This is a hacky fix. It would be better for resolveMerge' to not update the
work tree, at least in direct mode, and only stage the changes, which
mergeDirectCleanUp could then move into tree. I want to make that change,
but this is not the time to do it.
Using the extract(1) program to do the heavy lifting.
Decided to make git-annex run pre-commit-annex when committing. Since
git-annex pre-commit also runs it, it'll be run when git commit is run too,
via the pre-commit hook. This basically gives back the pre-commit hook
that git-annex took away. The implementation avoids repeatedly looking
for the hook script when the assistant is running and committing
repeatedly; only checks if the hook is available once.
To make the script simpler, made git-annex metadata -s field?=value
only set a field when it's not already got a value.
This commit was sponsored by bak.
Note that negated globs are not supported. Would have complicated the code
to add them, without changing the data type serialization in a
non-backwards-compatable way.
This commit was sponsored by Denver Gingerich.
This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO
of temp object files is too annoying (and if you don't want to keep
partially transferred objects across reboots).
.git/annex/misctmp must be on the same filesystem as the git work tree,
since files are moved to there in a way that will not work cross-device,
as well as symlinked into there.
I first wanted to put the tmp objects in .git/annex/objects/tmp, but
that would pose transition problems on upgrade when partially transferred
objects existed.
git annex info does not currently show the size of .git/annex/misctemp,
since it should stay small. It would also be ok to make something clean it
out, periodically.
Performance impact: When adding a large tree of new files, this needs
to do some git cat-file queries to check if any of the files already
existed and might need a metadata copy. I tried a benchmark in a copy
of my sound repository (so there was already a significant git tree
to check against.
Adding 10000 small files, with a cold cache:
before: 1m48.539s
after: 1m52.791s
So, impact is 0.0004 seconds per file added. Which seems acceptable, so did
not add some kind of configuration to enable/disable this.
This commit was sponsored by Lisa Feilen.
When constructing views, metadata is available about the location of the
file in the view's reference branch. Allows incorporating parts of the
directory hierarchy in a view.
For example `git annex view tag=* podcasts/=*` makes a view in the form
tag/showname.
Performance impact: I benchmarked git annex view tag=* in the conference
proceedings repo to take 6.459s before this change, and 6.544s after.
FWIW, I considered making the syntax for this be podcasts/*, which might
be easier for the user to learn. However, I think it's not as good:
* The user has to then juggle two different syntaxes, and podcasts/* will
be expanded by the shell so they also need to quote it, while podcasts/=*
is unlikely to be expanded by the shell.
* It would allow for things like podcasts/*/* and *.mp3 which do not
map well into views.
This commit was sponsored by Aurélien Pinceaux.