Avoid a warning message when renameExport is not supported, and just
fallback to deleting with a subsequent re-upload. Especially needed for
importtree remotes, where renameExport needs to be disabled.
This changes the external special remote protocol, but in a
backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older
version of git-annex will cause it to make renameExport return False.
This works, and tested syncing both gets changes from a special remote
and sends changes to it, keeping it fully in sync nicely!
But have not tried it with a subdir configured.
This is not super efficient; it would be better to lock the database
once and build up a queue of changes and flush once.
But, storeExportWithContentIdentifier is likely going to be the really
expensive part, so let's do the simple thing and only optimise later if
needed.
untested
This won't be super slow, but it does need to diff two likely large
trees, and since the git-annex branch rarely sits still, it will most
likely be run at the beginning of every import.
A possible speed improvement would be to only run this when the database
did not contain a ContentIdentifier. But that would only speed up
imports when there is no new version of a file on the special remote,
at most renames of existing files being imported.
A better speed improvement would be to record something in the git-annex
branch that indicates when an import has been run, and only do the diff
if the git-annex branch has record of a newer import than we've seen
before. Then, it would only run when there is in fact new
ContentIdentifier information available from a remote. Certianly doable,
but didn't want to complicate things yet.
Note that I tried an evil remote that lists ImportLocations with
../../../ in them and indeed this resulted in git blowing up and the
import failing, and not writing outside the repo.
git-annex: thread blocked indefinitely in an STM transaction
failed
git-annex: sqlite query crashed
CallStack (from HasCallStack):
error, called at ./Database/Handle.hs:98:42 in main:Database.Handle
failed
This needs further investigation.
Had to add two more API calls to override export APIs that are not safe
for use in combination with import.
It's unfortunate that removeExportDirectory is documented to be allowed
to remove non-empty directories. I'm not entirely sure why it's that
way, my best guess is it was intended to make it easy to implement with
just rm -rf.
For now, it's only allowed when exporttree=yes is also set.
That simplified the implementation, but could later be changed if
there's a remote that makes sense to be an import but not an export.
However, it may work just as well to make a remote be readonly to
prevent export to it while still allowing import.
This does not avoid all possible races, but it does avoid all likely
ones, and is demonstratably better than git's own handling of races
where files get modified at the same time as it's updating the working
tree.
The main thing this won't detect are not unlikely races where part
of a file gets changed while it's being copied and then the file is
restored to its original condition before the modification check.
No, it's more likely that the limitations of checking inode, size,
and mtime won't detect certian modifications, involving eg mmapped
files.
The branch is only updated once the export is 100% complete. This way,
if an export is started but interrupted and so the remote does not yet
contain some of the files, an import will make a commit on the old
branch, and so won't delete the missing files.
Made some api changes.
listImportableContents needs to provide the size
of the data, so the downloader can check disk free space.
retrieveExportWithContentIdentifier is passed the filepath to write to
Use temporary "CID" key during download of a ContentIdentifier from a
remote, so withTmp can be used and then move the content to the real key
once it's known.
Added graftTree but it's buggy.
Should use graftTree in Annex.Branch.graftTreeish; it will be faster
than the current implementation there.
Started Annex.Import, but untested and it doesn't yet handle tree
grafting.
Avoid performing repository fixups for submodules and git-worktrees
when there's a .noannex file that will prevent git-annex from being
used in the repository.
This change is ok as long as the .noannex file is really going to prevent
git-annex from being used. But, init --force could override the file.
Which would result in the repo being initialized without the fixups
having run.
To avoid that situation decided to change init, to not let --force be used
to override a .noannex file. Instead the user can just delete the file.
* fromkey: Added --json.
* fromkey --batch output changed to support using it with --json.
The old output was not parseable for any useful information, so
this is not expected to break anything.
If the worktree file already exists, and is annexed and uses the same
key, avoid failing, nothing needs to be done.
Had to add lookupFileNotHidden to handle the case where an adjust --hide-missing
is in use, and the worktree file was hidden due to the object content
being missing. lookupFile would return the key of the hidden file,
but it makes sense that after fromkey succeeds, the worktree must
contain the file it was supposed to set up.
* Switch to using .git/annex/othertmp for tmp files other than partial
downloads, and make stale files left in that directory when git-annex
is interrupted be cleaned up promptly by subsequent git-annex processes.
* The .git/annex/misctmp directory is no longer used and git-annex will
delete anything lingering in there after it's 1 week old.
Also, in Annex.Ingest, made the filename it uses in the tmp dir be
prefixed with "ingest-" to avoid potentially using a filename used by
some other code.
* findref: Support file matching options: --include, --exclude,
--want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin
* Commands supporting --branch now apply file matching options --include,
--exclude, --want-get, --want-drop to filenames from the branch.
Previously, combining --branch with those would fail to match anything.
* add, import, findref: Support --time-limit.
This commit was sponsored by Jake Vosloo on Patreon.
Note that it does not prevent storing p2p access tokens or multicast
encryption keys, since those are not cached; the previous commit
established the distinction.
How well this works depends on how often getRemoteCredPair is called and
how expensive it is. In some cases setting this will result in an annoying
number of gpg password prompts and/or slowdowns due to reading creds
from the git-annex branch and decrypting, which could be improved by calling
getRemoteCredPair less often.
This commit was sponsored by Ilya Shlyakhter on Patreon.
info: When used with an exporttree remote, includes an "exportedtree" info,
which is the tree last exported to the remote. During an export conflict,
multiple values will be listed.
This commit was sponsored by John Pellman on Patreon.
Cache high-resolution mtimes for improved detection of modified files in v7
(and direct mode).
Including on Windows.
With back-compat support so old low-res mtimes won't break anything, and
so the new information also won't break old versions of git-annex.
I've seen intermittent failures of the test suite with v6 for a long time,
it seems to have possibly gotten worse with the changes around v7. Or just
being unlucky; all tests failed today.
Seen on amd64 and i386 builders, repeatedly but intermittently:
unused: FAIL (4.86s)
Test.hs:928:
git diff did not show changes to unlocked file
And I think other such failures, all involving v7/v6 mode tests.
I managed to reproduce the unused failure with --keep-failures,
and inside the repo, git diff was indeed not showing any changes for
the modified unlocked file.
The two stats will be the same other than mtime; the old and new files have
the same size and inode, since the test case writes to the file and then
overwrites it.
Indeed, notice the identical timestamps:
builder@orca:~/gitbuilder/build/.t/tmprepo335$ echo 1 > foo; stat foo; echo 2 > foo; stat foo
File: foo
Size: 2 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 3546179 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ builder) Gid: ( 1000/ builder)
Access: 2018-10-29 22:14:10.894942036 +0000
Modify: 2018-10-29 22:14:10.894942036 +0000
Change: 2018-10-29 22:14:10.894942036 +0000
Birth: -
File: foo
Size: 2 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 3546179 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ builder) Gid: ( 1000/ builder)
Access: 2018-10-29 22:14:10.894942036 +0000
Modify: 2018-10-29 22:14:10.898942036 +0000
Change: 2018-10-29 22:14:10.898942036 +0000
Birth: -
I'm seeing this in Linux VMs; it doesn't happen on my laptop. I've also
not experienced the intermittent test suite failures on my laptop.
So, I hope that this small delay will avoid the problem.
Update: I didn't, indeed I then reproduced the same failure on my
laptop, so it must be due to something else. But keeping this change anyway
since not needing to worry about lowish-resolution mtime in the test suite seems
worthwhile.
Removed undocumented special case in handling of a CHECKURL-MULTI response
with only a single file listed. Rather than ignoring the url that was in
the response, use it. This allows external special remotes that want to
provide some better url to do so, although I don't entirely agree with
using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an
undocumented special case that throws data away is just not a good idea.
This could in theory break some external special remote program that relied
on the current behavior, but its seems unlikely that it would because such
a program must already handle the multiple url case, unless it only ever
provides a single url response to CHECKURL-MULTI.
Make addurl --file work with a single item CHECKURL-MULTI response.
It already did for external special remotes due to the special case,
but now it also will for builtin ones like the BitTorrent special remote.
This commit was sponsored by Ilya Shlyakhter on Patron.