Allows using new special remote messages when git-annex supports them,
and avoiding using them when git-annex is too old. The new INFO is one
such message.
There's also the possibility, currently unused, for the special remote's
reply to include some kind of extensions of its own.
Merging this is blocked by https://github.com/datalad/datalad/issues/2124
since it seems it will break datalad. I checked all the other special
remotes and they will be ok.
This commit was supported by the NSF-funded DataLad project.
It's left up to the special remote to detect when git-annex is new enough
to support the message; an old git-annex will blow up.
This commit was supported by the NSF-funded DataLad project.
Not yet called by Command.Export.
WebDAV needs this to clean up empty collections. Also, example.sh turned
out to not be cleaning up directories when removing content
from them, so it made sense for it to use this.
Remote.Directory did not need it, and since its cleanup method for empty
directories is more efficient than what Command.Export will need to do
to find empty directories, it uses Nothing so that extra work can be
avoided.
This commit was sponsored by Thom May on Patreon.
The export database has writes made to it and then expects to read back
the same data immediately. But, the way that Database.Handle does
writes, in order to support multiple writers, makes that not work, due
to caching issues. This resulted in export re-uploading files it had
already successfully renamed into place.
Fixed by allowing databases to be opened in MultiWriter or SingleWriter
mode. The export database only needs to support a single writer; it does
not make sense for multiple exports to run at the same time to the same
special remote.
All other databases still use MultiWriter mode. And by inspection,
nothing else in git-annex seems to be relying on being able to
immediately query for changes that were just written to the database.
This commit was supported by the NSF-funded DataLad project.
This is seriously super hairy. It has to handle interrupted exports,
which may be resumed with the same or a different tree. It also has to
recover from export conflicts, which could cause the wrong content
to be renamed to a file.
I think this works, or is close to working. See the update to the design
for how it works.
This is definitely not optimal, in that it does more renames than are
necessary. It would probably be worth finding the keys that are really
renamed and only renaming those. But let's get the "simple" approach to
work first..
This commit was supported by the NSF-funded DataLad project.
Removed uncorrect UniqueKey key in db schema; a key can appear multiple
times with different files.
The database has to be flushed after each removal. But when adding files
to the export, lots of changes are able to be queued up w/o flushing.
So it's still fairly efficient.
If large removals of files from exports are too slow, an alternative
would be to make two passes over the diff, one pass queueing deletions
from the database, then a flush and the a second pass updating the
location log. But that would use more memory, and need to look up
exportKey twice per removed file, so I've avoided such optimisation yet.
This commit was supported by the NSF-funded DataLad project.
* Only export to remotes that were initialized to support it.
* Prevent storing key/value on export remotes.
* Prevent enabling exporttree=yes and encryption in the same remote.
SetupStage Enable was changed to take the old RemoteConfig.
This allowed only setting exporttree when initially setting up a
remote, and not configuring it later after stuff might already be stored
in the remote.
Went with =yes rather than =true for consistency with other parts of
git-annex. Changed docs accordingly.
This commit was supported by the NSF-funded DataLad project.
This will allow disabling exports for remotes that are not configured to
allow them. Also, exportSupported will be useful for the external
special remote to probe.
This commit was supported by the NSF-funded DataLad project
So it will be available later and elsewhere, even after GC.
I first though to use git update-index to do this, but feeding it a line
with a tree object seems to always cause it to generate a git subtree
merge. So, fell back to using the Git.Tree interface to maniupulate the
trees, and not involving the git-annex branch index file at all.
This commit was sponsored by Andreas Karlsson.
This avoids needing to deal with the complexity of partially transferred
files in the export. We'd not be able to resume uploading to such a file
anyway, so just avoid them.
The implementation in Remote.Directory is not completely ideal, because
it could leave the temp file hanging around in the export directory.
This only happens if it's killed with -9, or there's a power failure;
normally viaTmp cleans up after itself, even when interrupted. I could
not see a better way to do it though, since the export directory might
be the root of a filesystem.
Also some design thoughts on resuming, which depend on storeExport being
atomic.
This commit was sponsored by Fernando Jimenez on Partreon.
Make a pass over the whole exported tree, and upload anything that has
not yet reached the export. Update location log when exporting.
Note that the synthesized keys for non-annexed files are stored in the
location log too.
Some cases involving files in the tree with the same content are not
handled correctly yet.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
External special remotes will refuse to operate on keys with spaces in
their names. That has never worked correctly due to the design of the
external special remote protocol. Display an error message suggesting
migration.
Not super happy with this, but it's a pragmatic solution. Better than
complicating the external special remote interface and all external special
remotes.
Note that I only made it use SafeKey in Request, not Response. git-annex
does not construct a Response, so that would not add any safety. And
presumably, if git-annex avoids feeding any such keys to an external
special remote, it will never have a reason to make a Response using such a
key. If it did, it would result in a protocol error anyway.
There's still a Serializeable instance for Key; it's used by P2P.Protocol.
There, the Key is always in the final position, so it's ok if it contains
spaces.
Note that the protocol documentation has been fixed to say that the File
may contain spaces. One way that can happen, even though the Key can't,
is when using direct mode, and the work tree filename contains spaces.
When sending such a file to the external special remote the worktree
filename is used.
This commit was sponsored by Thom May on Patreon.
This makes the direct mode to v6 upgrade able to be performed in one clone
of a repository without affecting other clones, which can continue using v5
and direct mode.
Only reverse adjust the changes in the commit, which means that adjustments
do not need to be generally cleanly reversable.
For example, an adjustment can unlock all locked files, but does not need
to worry about files that were originally unlocked when reversing, because
it will only ever be run on files that have been changed. So, it's ok
if it locks all files when reversed, or even leaves all files as-is when
reversed.
There's a race here, but entering an adjusted branch for the first time is
not something to do when a commit is being made at the same time. Although,
may want to prevent the assistant from committing while entering the
adjusted branch.
Useful for things like ipfs that don't use regular urls.
An external special remote can add a regular url to a key, and then
git-annex get will download it from the web. But for ipfs, we want to
instead tell git-annex that the uri uses OtherDownloader. Before this
change, the external special remote protocol lacked a way to do that.
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.
Done as a background task while listening to episode 2 of the Type Theory
podcast.
I tend to prefer moving toward explicit exception handling, not away from
it, but in this case, I think there are good reasons to let checkPresent
throw exceptions:
1. They can all be caught in one place (Remote.hasKey), and we know
every possible exception is caught there now, which we didn't before.
2. It simplified the code of the Remotes. I think it makes sense for
Remotes to be able to be implemented without needing to worry about
catching exceptions inside them. (Mostly.)
3. Types.StoreRetrieve.Preparer can only work on things that return a
Bool, which all the other relevant remote methods already did.
I do not see a good way to generalize that type; my previous attempts
failed miserably.
bup already splits files and does rolling deltas, so there is no reason to
use chunking here.
The new API made it easier to add progress support for storeKey, so that's
done. Unfortunately, bup-split still outputs its own progress with -q,
so a little ugly, but not too bad.
Made dropping remove the branch for an object, for two reasons:
1. The new API calls removeKey to roll back a storeKey when the content
changed unexpectedly.
2. So that testremote will be happy.
Also, fixed a bug that caused a crash when removing the branch for an
object in rollback.
Slightly tricky as they are not normal UUIDBased logs, but are instead maps
from (uuid, chunksize) to chunkcount.
This commit was sponsored by Frank Thomas.
Added new fields for chunk number, and chunk size. These will not appear
in normal keys ever, but will be used for chunked data stored on special
remotes.
This commit was sponsored by Jouni K Seppanen.
When setting up a remote on a ssh server, prompt for a password inside the
webapp, rather than relying on ssh's own password prompting in the terminal
the webapp was started from, or ssh-askpass.
Avoids double prompting for the ssh password (and triple-prompting on
windows for rsync.net), since the entered password is cached for 10 minutes
and this cached password is reused when setting up the repository, after
the initial probe.
When the user has an existing ssh key set up, they can choose to use it,
rather than entering a password. The webapp used to probe for this case
automatically, so this is a little harder, but it's an advanced user thing.
Note that this commit is known to break enabling existing rsync
repositories. It hs not been tested with gcrypt repositories. It's not been
successfully tested yet on Windows.
This commit was sponsored by Ralph Mayer.
To use, set GIT_ANNEX_SSHASKPASS to point to a fifo or regular file
(FIFO is better, avoids touching disk or multiple readers) that contains
the password. Then set SSH_ASKPASS=git-annex, and when ssh runs it, it will
tell ssh the password.
This is not yet used..
This is a better approach to finding both when NM has lost a network
connection, and when a new network connection is made by NM.
Tested with network-manager 0.9.8.8.
This commit was sponsored by Cedric Staub.
* Remote system might be available, and connection get lost. Should
reconnect, but needs to avoid bad behavior (ie, constant reconnect
attempts.) Use exponential backoff.
* Detect if old system had a too old git-annex-shell, and show the user
a nice message in the webapp. Required parsing error messages, so perhaps
this code shoudl be removed once enough time has passed..
* Switch the protocol to using remote URI's, rather than remote names.
Names change. Also avoids issues with serialization of names containing
whitespace.
This is nearly ready for merge into master now. I'd still like to make the ssh
transport smarter about reusing ssh connection caching during git pull.
This commit was sponsored by Jim Paris.
a ssh remote, and pulls.
XMPP is no longer needed in this configuration!
Requires the remote server have git-annex-shell with notifychanges support.
(untested)
This commit was sponsored by Geog Wechslberger.
So far, handling connecting to git-annex-shell notifychanges, and
pulling immediately when a change is pushed to a remote.
A little bit buggy (crashes after the first pull), but it already works!
This commit was sponsored by Mark Sheppard.
This will be used by the remote-daemon to quickly tell when changes have
been pushed from some other repository into a ssh remote.
Adjusted the remote-daemon protocol to communicate changed shas, rather
than git branch refs. This way, it can easily check if a sha is new.
This commit was sponsored by Carlos Trijueque Albarran.