Commit graph

2753 commits

Author SHA1 Message Date
Joey Hess
9f4ffe65e9
implement removeExportDirectory
Not yet called by Command.Export.

WebDAV needs this to clean up empty collections. Also, example.sh turned
out to not be cleaning up directories when removing content
from them, so it made sense for it to use this.

Remote.Directory did not need it, and since its cleanup method for empty
directories is more efficient than what Command.Export will need to do
to find empty directories, it uses Nothing so that extra work can be
avoided.
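
As an illustration of the directory cleanup described above, here is a minimal Python sketch, not git-annex's Haskell code. The function name `remove_exported_file` and the "prune upward until a non-empty directory or the export root" strategy are assumptions made for this example.

```python
import os


def remove_exported_file(export_root: str, relpath: str) -> None:
    """Delete one exported file, then prune parent directories left empty.

    Hypothetical helper: the name and the pruning strategy are illustrative.
    """
    root = os.path.abspath(export_root)
    target = os.path.join(root, relpath)
    os.remove(target)
    parent = os.path.dirname(target)
    # Walk upward, stopping at the export root, which is never removed.
    while parent != root and parent.startswith(root + os.sep):
        try:
            os.rmdir(parent)  # succeeds only if the directory is empty
        except OSError:
            break  # directory still has content (or cannot be removed)
        parent = os.path.dirname(parent)
```

A remote that cannot cheaply prune on removal, such as one that must list collections over the network, would presumably need a separate directory-removal call instead.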

This commit was sponsored by Thom May on Patreon.
2017-09-15 13:18:21 -04:00
Joey Hess
a1b195d84c
External special remote protocol extended to support export.
Also updated example.sh to support export.

This commit was supported by the NSF-funded DataLad project.
2017-09-08 14:24:05 -04:00
Joey Hess
45d30820ac
document new stuff for external special remotes
Got rid of RENAMEEXPORT-UNSUPPORTED, no reason not to use
RENAMEEXPORT-FAILURE for that.

This commit was supported by the NSF-funded DataLad project.
2017-09-07 12:59:35 -04:00
Joey Hess
6ab14710fc
fix consistency bug reading from export database
The export database has writes made to it and then expects to read back
the same data immediately. But, the way that Database.Handle does
writes, in order to support multiple writers, makes that not work, due
to caching issues. This resulted in export re-uploading files it had
already successfully renamed into place.

Fixed by allowing databases to be opened in MultiWriter or SingleWriter
mode. The export database only needs to support a single writer; it does
not make sense for multiple exports to run at the same time to the same
special remote.

All other databases still use MultiWriter mode. And by inspection,
nothing else in git-annex seems to be relying on being able to
immediately query for changes that were just written to the database.

This commit was supported by the NSF-funded DataLad project.
2017-09-06 17:19:07 -04:00
Joey Hess
cae3704a44
export file renaming
This is seriously super hairy. It has to handle interrupted exports,
which may be resumed with the same or a different tree. It also has to
recover from export conflicts, which could cause the wrong content
to be renamed to a file.

I think this works, or is close to working. See the update to the design
for how it works.

This is definitely not optimal, in that it does more renames than are
necessary. It would probably be worth finding the keys that are really
renamed and only renaming those. But let's get the "simple" approach to
work first.

This commit was supported by the NSF-funded DataLad project.
2017-09-06 15:44:10 -04:00
Joey Hess
1ec3a9eb05
thoughts on handling renames efficiently
This gets complicated, but I think this design will work!

This commit was supported by the NSF-funded DataLad project.
2017-09-06 13:04:09 -04:00
Joey Hess
4da763439b
use export db to correctly handle duplicate files
Removed incorrect UniqueKey key in db schema; a key can appear multiple
times with different files.

The database has to be flushed after each removal. But when adding files
to the export, lots of changes are able to be queued up w/o flushing.
So it's still fairly efficient.

If large removals of files from exports are too slow, an alternative
would be to make two passes over the diff, one pass queueing deletions
from the database, then a flush and then a second pass updating the
location log. But that would use more memory, and need to look up
exportKey twice per removed file, so I've avoided that optimisation for now.
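
A toy model of the key-to-files mapping described above, using Python's `sqlite3` rather than git-annex's own database layer. The table name, the column names, and the idea that the flush exists so the next query can tell whether a key is still exported under another file are all assumptions made for this sketch.

```python
import sqlite3


def open_export_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the export database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    # No uniqueness constraint on key: one key may be exported as several files.
    db.execute("CREATE TABLE exported (key TEXT NOT NULL, file TEXT NOT NULL)")
    return db


def add_exported(db: sqlite3.Connection, key: str, file: str) -> None:
    """Queue an addition; many of these can be batched before one commit."""
    db.execute("INSERT INTO exported (key, file) VALUES (?, ?)", (key, file))


def remove_exported(db: sqlite3.Connection, key: str, file: str) -> bool:
    """Remove one file; return True if the key is no longer exported anywhere."""
    db.execute("DELETE FROM exported WHERE key = ? AND file = ?", (key, file))
    # The commit stands in for the per-removal flush. With a single sqlite3
    # connection the query below would see the delete even without it.
    db.commit()
    row = db.execute(
        "SELECT COUNT(*) FROM exported WHERE key = ?", (key,)
    ).fetchone()
    return row[0] == 0
```

In this model, a caller would update the location log only when `remove_exported` returns True.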

This commit was supported by the NSF-funded DataLad project.
2017-09-04 14:39:32 -04:00
Joey Hess
28e2cad849
implement exporttree=yes configuration
* Only export to remotes that were initialized to support it.
* Prevent storing key/value on export remotes.
* Prevent enabling exporttree=yes and encryption in the same remote.

SetupStage Enable was changed to take the old RemoteConfig.
This allows exporttree to be set only when initially setting up a
remote, and not configured later, after stuff might already be stored
in the remote.

Went with =yes rather than =true for consistency with other parts of
git-annex. Changed docs accordingly.
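
The rules listed above can be sketched as a small validation function. This is a Python illustration under stated assumptions, not the actual Haskell setup code: `check_exporttree`, `ConfigError`, the dictionary representation of a remote's configuration, and treating `encryption=none` as "no encryption" are all choices made for the example.

```python
from typing import Dict, Optional


class ConfigError(Exception):
    """Raised when a requested remote configuration is not allowed."""


def check_exporttree(
    params: Dict[str, str], old: Optional[Dict[str, str]]
) -> Dict[str, str]:
    """Validate new parameters against the old config (None when initializing).

    Returns the effective configuration if it is acceptable.
    """
    if old is not None and "exporttree" in params:
        # Enabling an existing remote: exporttree may not change after init.
        if params["exporttree"] != old.get("exporttree", "no"):
            raise ConfigError("exporttree can only be set when initializing")
    effective = dict(old or {})
    effective.update(params)
    exporting = effective.get("exporttree", "no") == "yes"
    if exporting and effective.get("encryption", "none") != "none":
        raise ConfigError("exporttree=yes cannot be combined with encryption")
    return effective
```

Passing the old configuration into the check is what makes the "only at initialization" rule enforceable, which matches the reason given for changing SetupStage Enable.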

This commit was supported by the NSF-funded DataLad project.
2017-09-04 13:09:38 -04:00
Joey Hess
a4328b49d2
refactor ExportActions
This will allow disabling exports for remotes that are not configured to
allow them. Also, exportSupported will be useful for the external
special remote to probe.

This commit was supported by the NSF-funded DataLad project.
2017-09-01 13:05:09 -04:00
Joey Hess
5483ea90ec
graft exported tree into git-annex branch
So it will be available later and elsewhere, even after GC.

I first thought to use git update-index to do this, but feeding it a line
with a tree object seems to always cause it to generate a git subtree
merge. So, fell back to using the Git.Tree interface to manipulate the
trees, and not involving the git-annex branch index file at all.

This commit was sponsored by Andreas Karlsson.
2017-08-31 18:06:49 -04:00
Joey Hess
bb08b1abd2
make storeExport atomic
This avoids needing to deal with the complexity of partially transferred
files in the export. We'd not be able to resume uploading to such a file
anyway, so just avoid them.

The implementation in Remote.Directory is not completely ideal, because
it could leave the temp file hanging around in the export directory.
This only happens if it's killed with -9, or there's a power failure;
normally viaTmp cleans up after itself, even when interrupted. I could
not see a better way to do it though, since the export directory might
be the root of a filesystem.
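
The temp-file-then-rename pattern behind this can be sketched in Python as follows. This is an illustration of the general technique, not the viaTmp implementation; the function name `store_export_atomic`, the `.tmp-` prefix, and placing the temp file next to the destination are assumptions made for the example.

```python
import os
import shutil
import tempfile


def store_export_atomic(src: str, export_dir: str, relpath: str) -> None:
    """Copy src into the export so the destination never appears half-written."""
    dest = os.path.join(export_dir, relpath)
    dest_dir = os.path.dirname(dest)
    os.makedirs(dest_dir, exist_ok=True)
    # The temp file lives beside the destination so the final rename stays
    # on one filesystem, even if the export directory is a filesystem root.
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        os.replace(tmp, dest)  # atomic on POSIX within a single filesystem
    except BaseException:
        # Ordinary errors and interrupts clean up here. A kill -9 or a power
        # failure never reaches this handler, so the temp file can remain.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```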

Also some design thoughts on resuming, which depend on storeExport being
atomic.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-08-31 14:24:32 -04:00
Joey Hess
7c7af82578
resuming exports
Make a pass over the whole exported tree, and upload anything that has
not yet reached the export. Update location log when exporting.

Note that the synthesized keys for non-annexed files are stored in the
location log too.

Some cases involving files in the tree with the same content are not
handled correctly yet.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2017-08-31 13:33:50 -04:00
Joey Hess
bdec46ac13
a few tweaks to the design 2017-08-30 13:14:05 -04:00
Joey Hess
74aa4c503b
devblog 2017-08-29 17:26:42 -04:00
Joey Hess
6ae9d8fe49
simplify
Key is needed to use in reply
2017-08-28 15:37:34 -04:00
Joey Hess
ed5d8ee9ea
update proposed external special remote protocol 2017-08-28 15:34:26 -04:00
Joey Hess
792e582a60
fix link 2017-08-28 15:07:23 -04:00
Joey Hess
92ec2d13b5
formatting 2017-08-28 15:07:19 -04:00
Joey Hess
8cad03d7ca
typo 2017-08-28 15:04:25 -04:00
Joey Hess
dafafad115
external: nice error message for keys with spaces in their name
External special remotes will refuse to operate on keys with spaces in
their names. That has never worked correctly due to the design of the
external special remote protocol. Display an error message suggesting
migration.

Not super happy with this, but it's a pragmatic solution. Better than
complicating the external special remote interface and all external special
remotes.

Note that I only made it use SafeKey in Request, not Response. git-annex
does not construct a Response, so that would not add any safety. And
presumably, if git-annex avoids feeding any such keys to an external
special remote, it will never have a reason to make a Response using such a
key. If it did, it would result in a protocol error anyway.

There's still a Serializeable instance for Key; it's used by P2P.Protocol.
There, the Key is always in the final position, so it's ok if it contains
spaces.

Note that the protocol documentation has been fixed to say that the File
may contain spaces. One way that can happen, even though the Key can't,
is when using direct mode, and the work tree filename contains spaces.
When sending such a file to the external special remote the worktree
filename is used.
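
To see why a space is harmless in the final field but not in the key, consider this small Python sketch of parsing a space-separated request line. The `TRANSFER <direction> <key> <file>` shape and the function name `parse_transfer` are assumptions for illustration; `KEY1` is a placeholder, not a real key.

```python
from typing import Tuple


def parse_transfer(line: str) -> Tuple[str, str, str]:
    """Split a request line into (direction, key, file).

    Only the first three spaces act as separators, so the last field keeps
    any spaces it contains.
    """
    parts = line.split(" ", 3)
    if len(parts) != 4 or parts[0] != "TRANSFER":
        raise ValueError("not a TRANSFER line: %r" % line)
    _, direction, key, file = parts
    return direction, key, file
```

With this parser, `TRANSFER STORE KEY1 my file.txt` yields the file name `my file.txt` intact, while a key containing a space would be cut at that space and its remainder would silently become part of the file name. Refusing such keys up front avoids that misparse.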

This commit was sponsored by Thom May on Patreon.
2017-08-17 16:18:34 -04:00
Joey Hess
73d04d5565
responses, bug I noticed 2017-08-15 14:42:22 -04:00
xloem
ff6f9e203e Added a comment: Git History 2017-08-10 00:25:27 +00:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
d5db7b4289 removed 2017-07-17 21:01:39 +00:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
c58b508018 Added a comment: Google Drive and Archive.org 2017-07-17 21:00:40 +00:00
yarikoptic
56f92354c8 Added a comment: export "each revision" -- thinking about quiltdata 2017-07-14 20:10:42 +00:00
yarikoptic
d090b8114e Added a comment: comments on protocol 2017-07-12 22:09:55 +00:00
yarikoptic
a858f8f8e9 Added a comment: regarding setting a URL by custom special remote 2017-07-12 22:04:39 +00:00
yarikoptic
0bba9f084f Added a comment: side-note about WebDAV&DeltaV 2017-07-12 21:54:50 +00:00
Joey Hess
b294c00e16
comment 2017-07-12 14:19:26 -04:00
yarikoptic
f4481ee748 Added a comment: special remotes with versioning support 2017-07-12 17:30:33 +00:00
Joey Hess
5342c5064b
comment 2017-07-12 12:56:13 -04:00
Joey Hess
ecad2da8c5
Merge branch 'master' of ssh://git-annex.branchable.com 2017-07-12 12:44:21 -04:00
Joey Hess
aa7cc67a3d
protocol design 2017-07-12 12:43:46 -04:00
yarikoptic
215e1420f2 Added a comment: does it really need to be a new command ("export") or could be the same old "copy"? 2017-07-11 22:14:39 +00:00
yarikoptic
37a6bef639 Added a comment: couldn't STATE be used for KEY -> FILENAME(s) mapping? 2017-07-11 22:05:49 +00:00
yarikoptic
a75aa38ca2 Added a comment: note that some remotes could support files versioning "natively" 2017-07-11 21:59:49 +00:00
Joey Hess
905b1108b7
improve 2017-07-11 16:31:30 -04:00
Joey Hess
adbd0ff068
add design 2017-07-11 11:32:35 -04:00
Edward Betts
0750913136
correct spelling mistakes 2017-02-12 17:30:23 -04:00
Joey Hess
76d525c4d5
update links to wormhole issues 2016-12-16 15:03:20 -04:00
Joey Hess
a12eac060c
updates 2016-12-13 14:35:58 -04:00
Joey Hess
7c245b2180
update 2016-12-07 12:48:24 -04:00
Joey Hess
60f4b1cf36
PAKE 2016-12-06 16:55:53 -04:00
Joey Hess
67c1e87f05
local lan detection 2016-11-14 18:37:56 -04:00
Joey Hess
a7fd200440
updated design
more details on using tor and pairing
2016-11-14 12:10:09 -04:00
2a01:cb04:422:dd00:75bc:9129:cb49:31be
39a7045a81 poll vote (OpenStack SWIFT) 2016-10-04 11:57:43 +00:00
Joey Hess
3416cd8148
remove incorrect bit about multiple concurrent transfers, and improve description of protocol flow 2016-09-26 19:19:32 -04:00
zack
58f2276dc6 Added a comment: adjusted branch to "focus" on a specific subtree 2016-08-22 14:19:57 +00:00
anarcat
105f07090b magic wormhole seems like a nice alternative for arbitrary data sharing here 2016-07-26 01:33:15 +00:00
Joey Hess
0fdbf639dc
followup; open bug 2016-07-19 15:04:41 -04:00