unannex, uninit: When an annexed file is modified, don't overwrite the
modified version with an older version from the annex
This commit was sponsored by Mark Reidenbach on Patreon.
When a git remote is configured with an absolute path, use that path,
rather than making it relative. If it's configured with a relative path,
use that.
Git.Construct.fromPath changed to preserve the path as-is,
rather than making it absolute. And Annex.new changed to not
convert the path to relative. Instead, Git.CurrentRepo.get
generates a relative path.
A few things that used fromAbsPath unncessarily were changed in passing to
use fromPath instead. I'm seeing fromAbsPath as a security check,
while before it was being used in some cases when the path was
known absolute already. It may be that fromAbsPath is not really needed,
but only git-annex-shell uses it now, and I'm not 100% sure that there's
not some input that would cause a relative path to be used, opening a
security hole, without the security check. So left it as-is.
Test suite passes and strace shows the configured remote url is used
unchanged in the path into it. I can't be 100% sure there's not some code
somewhere that takes an absolute path to the repo and converts it to
relative and uses it, but it seems pretty unlikely that the code paths used
for a git remote would call such code. One place I know of is gitAnnexLink,
but I'm pretty sure that git remotes never deal with annex symlinks. If
that did get called, it generates a path relative to cwd, which would have
been wrong before this change as well, when operating on a remote.
I suspect this is a bug in cabal sdist, because with
Includes: Utility/libkqueue.h
the file is not included, but putting it in extra-files does
get it into the tarball.
Fix an oddity in matching options and preferred content expressions such as
"foo (bar or baz)", which was incorrectly handled as if it were "(foo or
bar) and baz)" rather than the intended "foo and (bar or baz)"
Seemed like a change to consume should be able to handle this case
better, but I was having trouble writing it that way, so instead added
a separate pass that inserts the implicit ands explicitly. Also added
several test cases to make sure versions with and without explicit ands
generate the same.
Missed this when implementing it because of the default case catching
the new constructor. So, removed that default case to make sure
future types of adjusted branches don't make the same mistake.
Complicated by git-annex addurl --fast which adds the file whose content
is not present, so it needs to stay unlocked when on such a branch.
This commit was sponsored by Brock Spratlen on Patreon.
Fixed that, and made parserLsTree accept the space as well as tab.
Fixes a reversion that made import of a tree from a special remote result in
a merge that deleted files that were not preferred content of that special
remote.
Avoids the smudge --clean filter failing because URL keys do not support
genKey. Instead the modified content will be added using the default
backend.
This commit was sponsored by Jochen Bartl on Patreon.
Don't accept the cid of the temp file that the content has just been
written to as something we will accept if another file has that same
content. There's no reason to, and on FAT, due to mtime resolution,
the test suite hit just such a case.
This fixes a reversion from 73df633a62
which removed inode from the ContentIdentifier.
Seems that dropDrive on windows only drops eg c:/ but not a leading /
while on linux, it does drop a leading / (which is what it considers
to be equivilant to a drive letter. I had been relying on it to drop
both. So need to drop leading directory separators.
Also, if the quickcheck generated input is eg "c:c:c:c:foo",
dropDrive will only drop the first one, leaving a path that's
still not relative. So instead of using dropDrive, just remove the
colons from the path.
This is probably a reversion, but not sure what caused it. By the time
Annex.Init runs fixupUnusualReposAfterInit, another git-annex process has
at least sometimes already done the necessary fixups. (Eg, one run
indirectly by a git command.) But since the Repo is cached, it doesn't
realize and does them again. So, avoid crashing when git config --unset
fails.
This commit was sponsored by Jack Hill on Patreon.
Directory special remotes with importtree=yes now avoid unncessary overhead
when inodes of files have changed, as happens whenever a FAT filesystem
gets remounted.
A few unusual edge cases of modifications won't be detected and
imported. I think they're unusual enough not to be a concern. It would
be possible to add a config setting that controls whether to compare
inodes too, but does not seem worth bothering the user about currently.
I chose to continue to use the InodeCache serialization, just with the
inode zeroed. This way, if I later change my mind or make it
configurable, can parse it back to an InodeCache and operate on it. The
overhead of storing a 0 in the content identifier log seems worth it.
There is a one-time cost to this change; all directory special remotes
with importtree=yes will re-hash all files once, and will update the
content identifier logs with zeroed inodes.
This commit was sponsored by Brett Eisenberg on Patreon.
Including the non-standard URI form that git-remote-gcrypt uses for rsync.
Eg, "ook://foo:bar" cannot be parsed because "bar" is not a valid port
number. But git could have a remote with that, it would try to run
git-remote-ook to handle it. So, git-annex has to allow for such things,
rather than crashing.
This commit was sponsored by Luke Shumaker on Patreon.
It was just slapping on a path separator to the front of the path to
make it absolute, but on windows, a path like "//foo/bar" actually
has a network "drive" of "//foo" and so that broke the test case.
Since "a:foo" is a somehow relative path on windows
(who knows how), drop any drive from the input. But dropDrive also drops
any leading path separator, making the input path relative. So now
it should be safe to slapp on a leading path separator.
This was not a good test, it broke the requirement that
relPathDirToFileAbs take absolute paths. And it failed when the two
input paths were eg, the same but differently normalized.
Replaced with some tests of the real basics of that function.
It got broken in several ways by the streaming seeking optimisations
around version 8.20201007.
Moved time limit checking out of the matcher, which was a hack in the
first place. So everywhere that uses Limit.getMatcher needs to check
time limit. Well, almost everywhere. Command.Info uses it, but it does
not make sense to time limit getting info. And Command.MultiCast uses it
just to build up a list of files that then get passed to a command, so
it would never have hit the timeout in a useful way.
This implementation is a little more expensive when at time limit than
necessary, since it continues seeking only to discard everything after the
time limit. I did try making it close the file handles to force a faster
shutdown, but that didn't work and hung. Could certianly be improved
somehow, but seeking is probably not the expensive bit when a time limit
is hit, so this seems acceptable for now.
For reasons explained in the bug report.
Implemented using a persistent migration, which works fine. It may add a
little startup overhead when a remote is enabled that uses this, but
probably un-noticable.
On the next major version, it would be fine to delete this database,
and regenerate it from the git-annex branch information. Then this
change could be reverted.
Did nothing about adding back the data that got dropped from the db
due to the bug. Only the borg special remote was probably affected,
and it's not been released yet. rm -rf .git/annex/cidsdb does work.
And vice-versa, but it's better to use '/' for portability.
Notably, standardPreferredContent contains "archive/*" and that might not
match if the filename ends up coming in with the slashes the other way
around.
I do think this was a reversion, but I have not tracked back to what
version. While involving the remote config, it's not the same class of
problems that I kept having to chase down for a while after the remote
config parser reworking.
MatchingKey is not the thing to use when matching on actual worktreee
files.
Fix reversion in 8.20201116 that made include= and exclude= in
preferred/required content expressions match a path relative to the current
directory, rather than the path from the top of the repository.
Avoid spurious "verification of content failed" message when downloading
content from a ssh or tor remote fails due to the remote no longer having a
copy of the content.
The P2P protocol already handled this case by sending DATA 0, followed by
VALID. But VALID was not really right, because the data is not the
requested data. So, send DATA 0, followed by INVALID. Old versions of
git-annex handle INVALID the same as VALID in this case. Now new versions
avoid displaying an incorrect message.
It would be better for the P2P protocol to have a different way to indicate
this, like perhaps sending INVALID without DATA. But that would be a
breaking change and need a new protocol verison. Since INVALID already is
part of the protocol and already needs to be handled, using it for this
special case too seems ok, and avoids the complication of another protocol
version.
This commit was sponsored by Jochen Bartl on Patreon.
This is an edge case, which happened to be triggered by the P2P protocol
seeing DATA 0. When reading 0 bytes, getting an empty string does
not mean the handle has reached EOF.
I verified there was in fact a bug, where get of an empty file followed
by another file would get the empty file and then fail
with "handle is closed". This fixes it.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
Reversion introduced in version 8.20201007, one release after the 1st
release with the extension.
Surprisingly, hClose can hang if another thread is reading from the
handle. This is because it uses takeMVar.
The use of cancel here does mean that, if receiveMessageAddonProcess
or Remote.External.AsyncExtension.receiveloop allocated some resource in
a non-async-exception safe way, they might not get a chance to clean it up.
They do not appear to, and anyway, this only happens when git-annex is
shutting down, so any recource that did leak would not be a problem.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.