init: When --version=5 is passed on a crippled filesystem, use a v5 direct
mode repo as requested, rather than upgrading to v7 adjusted unlocked.
Fixed test suite on crippled filesystems, making it request --version=5
to test direct mode.
Deleting directories is one of the great unsolved problems of CS, thanks to
abominations like NFS lock files and Windows and races with other processes
cleaning up after themselves in the background. The gpg test harness
sometimes failed to delete its temp directory on NFS. Avoid the problem
class by not deleting it at all, and putting it inside the tmp repo being
tested. The test suite's more robust (and/or nonsensical) workarounds for
deleting its test dir will thus be used, hopefully avoiding the problem
until an OS finds a new way to violate POSIX and the laws of nature.
Note that this means that the .gnupg directory will be on whatever
filesystem the test suite is being run on, which may be a lesser quality
filesystem than gpg is really expecting. Gpg does not seem to need to
write sockets etc to there so this seems ok. The only known problem is
that if the filesystem forces a directory mode like 777, gpg will warn
about unsafe home directory perms, but it still works.
This fixes a bug with the numcopies counting when using sync --content.
It did not always pass the local repo uuid to handleDropsFrom, and so the
numcopies counting was off by one, and unwanted local content would only be
dropped when there were numcopies+1 remote copies.
Also, support dropping local content that has reached an
exporttree remote that is not untrusted (currently only S3 remotes
with versioning).
* Fix bug upgrading from direct mode to v7: when files in the repository
were already committed as v7 unlocked files elsewhere, and the
content was present in the direct mode repository, the annexed files
got their full content checked into git.
* Fix bug that caused v7 unlocked files in a direct mode repository
to get locked when committing.
This commit was sponsored by Nick Piper on Patreon.
When a file was already unlocked, but the annex object was present, the
upgrade process populated the unlocked file, but neglected to update the
index.
This commit was sponsored by Jochen Bartl on Patreon.
webdav: When initializing, avoid trying to make a directory at the top of
the webdav server, which could never accomplish anything and failed on
nextcloud servers. (Reversion introduced in version 6.20170925.)
This commit was sponsored by mo on patreon.
No deprecation warning at run time, just one on the man page.
One thing findref remains able to do that find cannot is to run in a bare
repo. Find was made to refuse to run in a bare repo because it seemed
confusing for it to not list any files ever in that situation. It would be
better for find --branch to work in a bare repo but not without --branch
but I don't currently have a way to do that.
Probably a better solution would be to make git-annex in a bare repo
default to --branch master or something like that instead of --all.
This commit was sponsored by Denis Dzyubenko on Patreon.
* findref: Support file matching options: --include, --exclude,
--want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin
* Commands supporting --branch now apply file matching options --include,
--exclude, --want-get, --want-drop to filenames from the branch.
Previously, combining --branch with those would fail to match anything.
* add, import, findref: Support --time-limit.
This commit was sponsored by Jake Vosloo on Patreon.
When public access is used for the remote, it complained that the user
needed to set creds to use it, which was just wrong.
When creds were being used, it fell back from trying to use the version ID
to just accessing the key in the bucket, which was ok for non-export
remotes, but wrong for buckets.
In both cases, display a hopefully useful warning.
This should only come up when an existing S3 remote has been exported
to, and then later versioning was enabled.
Note that it would perhaps be possible to fall back from trying to use
retrieveKeyFile when it fails and instead use retrieveKeyFileFromExport,
which may work when S3 version ID is missing. But there are problems
with that approach; how to tell when retrieveKeyFile has failed due to this
rather than a network problem etc? Anyway, that approach would only work
until the file in the export got overwritten, and then it would no
longer be accessible. And with versioning enabled, the user wants old
versions of objects to remain accessible, so it seems better to warn
about the problem as soon as possible, so they can go back and add S3
version IDs.
This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Note that it does not prevent storing p2p access tokens or multicast
encryption keys, since those are not cached; the previous commit
established the distinction.
How well this works depends on how often getRemoteCredPair is called and
how expensive it is. In some cases setting this will result in an annoying
number of gpg password prompts and/or slowdowns due to reading creds
from the git-annex branch and decrypting, which could be improved by calling
getRemoteCredPair less often.
This commit was sponsored by Ilya Shlyakhter on Patreon.
dropunused: When an unused object file has gotten modified, eg due to
annex.thin being set, don't silently skip it, but display a warning and let
--force drop it.
This commit was sponsored by Ethan Aubin.
info: When used with an exporttree remote, includes an "exportedtree" info,
which is the tree last exported to the remote. During an export conflict,
multiple values will be listed.
This commit was sponsored by John Pellman on Patreon.
* init: When a crippled filesystem causes an adjusted unlocked branch to
be used, set repo version to 7, which it neglected to do before.
* init: When on a crippled filesystem, and the git version is too old
to use an adjusted unlocked branch, fall back to using direct mode.
This commit was sponsored by Ilya Shlyakhter on Patreon.
Seems that youtube-dl --get-filename on a playlist lists all the filenames
for the playlist, which can take quite some time. The code already only
took the first name, so --no-playlist can speed it up a lot.
This commit was sponsored by Brett Eisenberg on Patreon.
And added stack-lts-9.9.yaml to support old versions of stack.
The i386 ancient autobuilder needs stack-lts-9.9.yaml; the OSX autobuilder
may also use it for a while, and it's needed to build on eg debian stable.
That didn't actually happen, newer lts like that one are not supported
by the version of stack in Debian stable, used for the i386-ancient
autobuild, and generally I want git-annex to be buildable on stable
releases of linux distros etc. So stack.yaml is going to be stuck on old
versions for some time until some years after stack stops breaking backwards
compatability.
When a command is operating on multiple files and there's an error with
one, try harder to continue to the rest. (As was already done for many
types of errors including IO errors.)
This handles cases like lockContentForRemoval throwing an exception when
the content is already locked. Just because a drop of one file fails, does
not mean it shouldn't go on to try to drop other files.
I looked over uses of `giveup` in Command/*; there are too many to check
them all extensively, but none stood out as being problems that should let
one commandAction stop running other commandActions. Worst case, something
bad will happen and rather than stopping right away with an error,
git-annex will display multiple errors as it fails over and over on each
file. I don't think I ever really intended `error`/`giveup` to stop other
commandActions; this was a relic of old confusion over haskell exception
handling.
Test suite passes.
This commit was sponsored by Ethan Aubin.
* drop -J: Avoid processing the same key twice at the same time when
multiple annexes files use it.
This prevents a drop of a key conflicting with another drop of the same
key.
This commit was sponsored by Brock Spratlen on Patreon.
export, sync --content: Avoid unnecessarily trying to upload files to an
exporttree remote that already contains the files.
When the export was origianly made in one repo and now git-annex is
running in a different repo, the export database is not yet populated with
information about the exportLocation of files. So, it was trying to upload
the files to the export, even when it already contained them.
sync --content would first download the content from the export, and then
re-upload the content back.
And this also led to "not available" failures for each file that was not
locally present yet.
Fix: Just use checkPresentExport before uploading; if it succeeds update
the database.
This is a surprising oversight, it's possible it fixes a reversion because
I would have thought I'd have noticed this problem when originally
developing exporttree remotes.
This commit was sponsored by Jochen Bartl on Patreon.
When an export conflict prevents accessing a special remote, be clearer
about what the problem is and how to resolve it.
This commit was sponsored by Trenton Cronholm on Patreon.
Don't much like that there's no way to distinguish between having the whole
content and having an old version of the file that's bigger, but of course
resuming a http transfer can always yield the wrong result if the file on
the http server is changing, and git-annex will detect that when it
verifies the downloaded content.
This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Fix bash completion of "git annex" to propertly handle files with spaces
and other problem characters. (Completion of "git-annex" already did.)
This commit was sponsored by Jake Vosloo on Patreon.
Finishes the start made in 983c9d5a53, by
handling the case where `transfer` fails for some other reason, and so the
ReadContent callback does not get run. I don't know of a case where
`transfer` does fail other than the locking dealt with in that commit, but
it's good to have a guarantee.
StoreContent and StoreContentTo had a similar problem.
Things like `getViaTmp` may decide not to run the transfer action.
And `transfer` could certianly fail, if another transfer of the same
object was in progress. (Or a different object when annex.pidlock is set.)
If the transfer action was not run, the content of the object would
not all get consumed, and so would get interpreted as protocol commands,
which would not go well.
My approach to fixing all of these things is to set a TVar only
once all the data in the transfer is known to have been read/written.
This way the internals of `transfer`, `getViaTmp` etc don't matter.
So in ReadContent, it checks if the transfer completed.
If not, as long as it didn't throw an exception, send empty and Invalid
data to the callback. On an exception the state of the protocol is unknown
so it has to raise ProtoFailureException and close the connection,
same as before.
In StoreContent, if the transfer did not complete
some portion of the DATA has been read, so the protocol is in an unknown
state and it has to close the conection as well.
(The ProtoFailureMessage used here matches the one in Annex.Transfer, which
is the most likely reason. Not ideal to duplicate it..)
StoreContent did not ever close the protocol connection before. So this is
a protocol change, but only in an exceptional circumstance, and it's not
going to break anything, because clients already need to deal with the
connection breaking at any point.
The way this new behavior looks (here origin has annex.pidlock = true so will
only accept one upload to it at a time):
git annex copy --to origin -J2
copy x (to origin...) ok
copy y (to origin...)
Lost connection (fd:25: hGetChar: end of file)
This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Fix hang when transferring the same objects to two different clients at the
same time. (Or when annex.pidlock is used, two different objects to the
same or different clients.)
Could also potentially occur if a client was downloading an object and
somehow lost connection but that git-annex-shell was still running and
holding the transfer lock.
This does not guarantee that, if `transfer` fails for some other reason,
a DATA response will be made.
This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Not the first time this kind of test suite breakage has happened..
It would be good to avoid somehow it looking up from .t and finding a git
repo. But just running the test suite from time to time outside of
git-annex would also let me notice these before the distribution packagers
do.
This commit was sponsored by mo on Patreon.
That can leave other imported files not checked into git, because the git
command queue is not flushed when git-annex errors out. And since it only
happens once git-annex has concluded a feed is broken, it's an intermittent
bug, worst kind. Been seeing it for a while, only tracked down today.
Instead, by returning False, git-annex importfeed will cleanly shutdown and
still exit nonzero.
This commit was sponsored by Denis Dzyubenko on Patreon.
When readContent got Nothing from prepSendAnnex, it did not run its
callback, and the callback is what sends the DATA reply.
sendContent checks with contentSize that the object file is present, but
that doesn't really guarantee that prepSendAnnex won't return Nothing.
So, it was possible for a P2P protocol GET to not receive a response,
and appear to hang. When what it's really doing is waiting for the next
protocol command.
This seems most likely to happen when the annex is in direct mode, and the
file being requested has been modified. It could also happen in an indirect
mode repository if genInodeCache somehow failed. Perhaps due to a race
with a drop of the content file.
Fixed by making readContent behave the way its spec said it should,
and run the callback with L.empty in this case.
Note that, it's finee for readContent to send any amount of data
to the callback, including L.empty. sendBytes deals with that
by making sure it sends exactly the specified number of bytes,
aborting the protocol if it's too short. So, when L.empty is sent,
the protocol will end up aborting.
This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Cache high-resolution mtimes for improved detection of modified files in v7
(and direct mode).
Including on Windows.
With back-compat support so old low-res mtimes won't break anything, and
so the new information also won't break old versions of git-annex.
Removed undocumented special case in handling of a CHECKURL-MULTI response
with only a single file listed. Rather than ignoring the url that was in
the response, use it. This allows external special remotes that want to
provide some better url to do so, although I don't entirely agree with
using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an
undocumented special case that throws data away is just not a good idea.
This could in theory break some external special remote program that relied
on the current behavior, but its seems unlikely that it would because such
a program must already handle the multiple url case, unless it only ever
provides a single url response to CHECKURL-MULTI.
Make addurl --file work with a single item CHECKURL-MULTI response.
It already did for external special remotes due to the special case,
but now it also will for builtin ones like the BitTorrent special remote.
This commit was sponsored by Ilya Shlyakhter on Patron.
This is safe, because while the annex object ends up executable,
there were already at least two other cases where it ended up executable:
1. git add an an executable file
2. chmod +x of a a non-executable worktree file that was hard linked to the
annex object
After copy/hard link, it always fixes up the permissions to match the mode
of the worktree file, so when an executable annex object gets hard linked
to a non-executable worktree file, its execute bit gets removed.
Commit b7c8bf5274 already *said* it would do
this; I suspect the line of code I've removed was included in that commit
accidentially.
Also improves annex.thin documentation.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
This makes --version=6 still work, despite v6 not being in
supportedVersions. Which is useful for scripts that use it.
I didn't document it on the man page, because it's indistinguishable
from an automatic upgrade after initting as v6.