todo work

Joey Hess 2023-06-23 13:47:01 -04:00
parent 2f3a275b58
commit 867a833624
3 changed files with 36 additions and 7 deletions


@ -5,14 +5,15 @@
content="""
`git-annex import --from remote` has recently been sped up a lot,
and the plan is to [[todo/remove_legacy_import_directory_interface]]
in favor of it.
in favor of it or reimplement the legacy interface on top of it.
I think this would work as a faster alternative to --clean-duplicates,
using a directory special remote:
Using `git-annex import --from remote --fast`, when there's a huge file in
the directory remote, will hash it, but only once. On subsequent runs it
will recognise the file it has seen before.
So all that's needed to emulate --clean-duplicates is a way to do this:
git-annex import --from remote --fast
git-annex move --from remote --copies 2
When there's a huge file in the directory remote, it will hash it, but only
once. On subsequent runs it will recognise the file it has seen before.
The `move` step doesn't work currently, but see [[drop_from_export_remote]].
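
The "hash it, but only once" behaviour can be pictured with a small cache. This is a minimal sketch, assuming a file is recognised by its path, size and modification time; that choice of identifier and every name below (`file_identity`, `cached_hash`, `hash_calls`) are assumptions made for illustration, not git-annex's actual implementation.

```python
import hashlib
import os
import tempfile

hash_calls = 0  # counts how often a file is really hashed


def file_identity(path):
    # Assumed identifier: path, size and mtime (hypothetical choice).
    st = os.stat(path)
    return (path, st.st_size, st.st_mtime_ns)


def cached_hash(path, cache):
    # Hash the file only if this identity has not been seen before.
    global hash_calls
    ident = file_identity(path)
    if ident not in cache:
        hash_calls += 1
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        cache[ident] = h.hexdigest()
    return cache[ident]


with tempfile.TemporaryDirectory() as d:
    big = os.path.join(d, "huge")
    with open(big, "wb") as f:
        f.write(b"x" * 4096)
    cache = {}
    first = cached_hash(big, cache)   # first run: hashes the file
    second = cached_hash(big, cache)  # later run: served from the cache
```

The second call returns the stored digest without reading the file again, which is the property that makes repeated `--fast` imports cheap in this model.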
"""]]


@ -0,0 +1,21 @@
It should be possible for this to work:
joey@darkstar:~/tmp/bench3/a>git annex move x --from d
move x (from d...)
dropping content from an export is not supported; use `git annex export` to export a tree that lacks the files you want to remove
failed
git-annex could just alter the tree exported to the remote to remove the file.
It might be a little slow to do that for a lot of files, and it would create
some unattached tree objects that would linger until gc.
A simple optimisation would be to remove the file (with removeExport) without updating
the export tree for each individual file. Keep a log of removed files, and at the end of
the command, or at some future point where the export tree is used, update the export
tree once to remove all the logged files from it.
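
The deferred update described above could look roughly like this. It is an illustrative Python model, not git-annex code (which is written in Haskell): the `ExportRemote` class, its methods, and the set of paths standing in for the export tree are all hypothetical. Only the name `removeExport` comes from the text.

```python
class ExportRemote:
    # Toy model of an export remote; every name here is hypothetical.

    def __init__(self, files):
        self.files = set(files)                # files stored on the remote
        self.exported_tree = frozenset(files)  # stand-in for the export tree
        self.removed_log = []                  # removals not yet in the tree
        self.tree_updates = 0                  # how often the tree was rewritten

    def remove_export(self, path):
        # Remove the file from the remote right away, but only log the
        # change instead of rewriting the export tree for a single file.
        self.files.discard(path)
        self.removed_log.append(path)

    def flush(self):
        # At the end of the command (or whenever the export tree is next
        # needed), apply all logged removals in one tree update.
        if not self.removed_log:
            return
        self.exported_tree = self.exported_tree - set(self.removed_log)
        self.removed_log.clear()
        self.tree_updates += 1


remote = ExportRemote(["x", "y", "z"])
for path in ["x", "y"]:
    remote.remove_export(path)
remote.flush()
```

In this model, removing two files costs one tree rewrite rather than two, which is the saving the optimisation is after when many files are moved at once.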
This seems like a prerequisite for [[remove_legacy_import_directory_interface]] because
`git-annex import --deduplicate` and `--clean-duplicates` need to remove individual files
from the remote.
[[!confirmed]]


@ -33,6 +33,13 @@ Another likely pain point is ad-hoc importing of individual files or
files matched by wildcard. The new interface is much more about importing
whole trees, perhaps configured by preferred content settings
> This is now addressed; `--fast` import from directory special remotes
> followed by `git-annex get` of the files that are wanted. --[[Joey]]
Another pain point is that to remove files from an export,
the user has to create trees that lack the files they want to remove.
[[drop_from_export_remote]] will resolve that.
One approach would be to make the old interface be implemented using the
new interface, and paper over the cracks, e.g. by setting up a directory
special remote automatically.
@ -46,4 +53,4 @@ interface.
--[[Joey]]
[[!tag needsthought]]
[[!tag confirmed]]