Merge branch 'borg'

Joey Hess 2020-12-22 16:19:32 -04:00
commit 310d3c3823
GPG key ID: DB12DB0FF05F8F38
43 changed files with 782 additions and 105 deletions

@ -167,7 +167,9 @@ support a request, it can reply with `UNSUPPORTED-REQUEST`.
this can be used to list those versions. It opens a new
block of responses. This can be repeated any number of times
(indicating a branching history), and histories can also
be nested multiple levels deep.
This should only be used when the remote supports using
"TRANSFER RECEIVE Key" to retrieve historical versions of files.
* `END`
Indicates the end of a block of responses.
* `LOCATION Name`
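
The nesting described above (a reply opens a block, `END` closes it, and blocks can repeat or nest) can be sketched as a small parser. This is only an illustrative sketch, not code from git-annex: the opener keyword `HISTORY` is an assumption, since the hunk does not show the name of that reply, and the `ITEM ...` lines used when calling it are hypothetical placeholders for whatever other replies appear inside a block.

```python
from typing import Iterable, List, Union

# A parsed block is a list whose items are either plain reply lines
# or nested blocks (lists) for each history that was opened.
Node = Union[str, List["Node"]]


def parse_blocks(lines: Iterable[str],
                 opener: str = "HISTORY",  # assumed keyword, not shown in the hunk
                 closer: str = "END") -> List[Node]:
    """Turn a flat stream of reply lines into nested lists.

    The outermost block is the reply to the request itself, so the
    END that closes it finishes parsing.
    """
    root: List[Node] = []
    stack: List[List[Node]] = [root]
    for raw in lines:
        line = raw.strip()
        if line == opener:
            # Open a new block inside the current one.
            child: List[Node] = []
            stack[-1].append(child)
            stack.append(child)
        elif line == closer:
            if len(stack) == 1:
                return root  # END of the outermost block
            stack.pop()
        else:
            stack[-1].append(line)
    raise ValueError("reply stream ended before the final END")
```

Two sibling blocks at the same level would represent a branching history, and a block inside a block a deeper level of history.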

@ -0,0 +1,17 @@
Finally got started on the borg special remote idea. A prerequisite for
that is support for remotes that can be imported from, but not exported to. So I
actually started by allowing importtree=yes to be set without
exporttree=yes. A lot of code assumed that was not allowed,
so it took a while to chase down everything. Finished most of that yesterday.
Today I added a `thirdPartyPopulated` type of remote,
which `git-annex sync` can "pull" from by using the existing import
interface to list files on it, and determine which of them are annex object
files. I have not started on the actual borg remote at all, but this should
complete the groundwork for it.
(I also finished up annex.stalldetection earlier this week.)
---
This work was sponsored by Jake Vosloo [on Patreon](https://patreon.com/joeyh).

@ -1541,6 +1541,12 @@ Remotes are configured using these settings in `.git/config`.
the location of the bup repository to use. Normally this is automatically
set up by `git annex initremote`, but you can change it if needed.
* `remote.<name>.annex-borgrepo`
Used by borg special remotes, this configures
the location of the borg repository to use. Normally this is automatically
set up by `git annex initremote`, but you can change it if needed.
* `remote.<name>.annex-ddarrepo`
Used by ddar special remotes, this configures

@ -25,6 +25,7 @@ the git history is not stored in them.
* [[webdav]]
* [[git]]
* [[httpalso]]
* [[borg]]
* [[xmpp]]
The above special remotes are built into git-annex, and can be used

@ -0,0 +1,32 @@
This special remote type accesses annexed files stored in a
[borg](https://www.borgbackup.org/) repository.
Unlike most special remotes, git-annex cannot be used to store annexed
files in this special remote. You store files by using borg as usual, to
back up the git-annex repository. Then `git-annex sync` will learn about
the annexed files that are stored in the borg repository.
## configuration
These parameters can be passed to `git annex initremote` to configure the
remote:
* `borgrepo` - The location of a borg repository, e.g. a path, or
`user@host:path` for ssh access.
* `scan` - The path, within the borg repository, to scan for
annex object files. This can be the path to a git-annex repository,
or perhaps a non-encrypted special remote, or a path that contains
several repositories.
Information about all annex objects in the path will be
added to the git-annex branch when syncing with the borg repository.
So, it's best to avoid a path that contains object files for unrelated
git-annex repositories.
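
To illustrate why the choice of `scan` path matters, here is a rough sketch of picking annex object files out of an archive listing. It is a simplified heuristic and not git-annex's actual implementation: it assumes object files are stored as `.../KEY/KEY` (as under `.git/annex/objects`), and both the key pattern and the example paths used with it are assumptions made for the illustration.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Rough pattern for a git-annex key such as SHA256E-s5--abc.txt
# (backend name, optional numeric fields, "--", then the key name).
# This is an approximation for illustration only.
KEY_RE = re.compile(r"^[A-Z0-9]+(?:-[smSC]\d+)*--.+$")


def annex_keys_under(paths: Iterable[str], scan: str) -> Dict[str, List[str]]:
    """Map each key found under `scan` to the paths that hold it."""
    scan_path = PurePosixPath(scan)
    found: Dict[str, List[str]] = {}
    for p in paths:
        path = PurePosixPath(p)
        try:
            path.relative_to(scan_path)
        except ValueError:
            continue  # outside the scanned path
        # An object file lives in a directory named after its key.
        if path.name == path.parent.name and KEY_RE.match(path.name):
            found.setdefault(path.name, []).append(str(path))
    return found
```

With a listing that contains two repositories, scanning the parent directory of both picks up keys from each of them, while scanning only one repository's path leaves the unrelated keys out.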
## setup example
    # borg init --encryption=keyfile /path/to/borgrepo
    # git annex initremote borg type=borg borgrepo=/path/to/borgrepo scan=`pwd`
    # borg create /path/to/borgrepo `pwd`::{now}
    # git annex sync borg

@ -0,0 +1,55 @@
importtree=yes remotes are untrusted, because something is modifying that
remote other than git-annex, and it could change a file at any time, so
git-annex can't rely on the file being there. However, it's possible the user
has a policy of not letting files on the remote be modified. It may even be
that some remotes use storage that avoids such problems. So, there should be
some way to override the default trust level for such remotes.
Currently:
    joey@darkstar:/tmp/y8>git annex semitrust borg
    semitrust borg
    This remote's trust level is overridden to untrusted.
The borg special remote is an example of a remote where the user can easily
decide never to delete old archives from it, and so would want git-annex
to trust it.
Below are some docs I wrote for the borg special remote page; they should be
moved there when this gets fixed. --[[Joey]]
## trust levels, borg delete and borg prune
git-annex will by default treat the borg special remote as untrusted, so
will not trust it to continue to contain a [[copy|copies]] of any annexed
file. This is necessary because you could run `borg delete` or `borg prune`
and remove the copy from the borg repository. If you choose to set the
trust level of the borg repository to a higher level, you need to avoid
using such commands with that borg repository.
Consider this example:
    git-annex add annexedfile
    borg create /path/to/borgrepo `pwd`::foo
    git-annex sync borg
    git-annex semitrust borg
    git-annex drop annexedfile
Now the only copy of annexedfile is in the borg repository.
    borg create /path/to/borgrepo `pwd`::bar
    borg delete /path/to/borgrepo::foo
    git-annex sync borg
    git-annex whereis annexedfile
Now no copies of annexedfile remain, because the "foo" archive
in the borg repository was the only one to contain it, and it was deleted.
So either keep the borg special remote as untrusted, and use such borg
commands to delete old archives as needed, or avoid using `borg delete`
and `borg prune`, and then the remote can safely be made semitrusted or
trusted.
Also, if you do choose to delete old archives, make sure to never reuse
that archive name for a new archive. git-annex may think it's the same
archive it saw before, and not notice the change.
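
The two hazards above can be seen in a toy model. This is a sketch under stated assumptions, not how git-annex is implemented: `BorgRemoteModel` and its methods are hypothetical names, archives are reduced to sets of keys, and the model assumes that a sync only scans archives whose names it has not seen before.

```python
from typing import Dict, Iterable, List, Set


class BorgRemoteModel:
    """Toy model of a borg repository and what a sync has recorded."""

    def __init__(self) -> None:
        self.archives: Dict[str, Set[str]] = {}  # real contents of the borg repo
        self.known: Dict[str, Set[str]] = {}     # contents recorded by sync

    def create(self, name: str, keys: Iterable[str]) -> None:
        """Model `borg create`: store the keys present at backup time."""
        self.archives[name] = set(keys)

    def delete(self, name: str) -> None:
        """Model `borg delete` of one archive."""
        del self.archives[name]

    def sync(self) -> None:
        """Model a sync: forget vanished archives, scan new names only."""
        for name in list(self.known):
            if name not in self.archives:
                del self.known[name]
        for name, keys in self.archives.items():
            if name not in self.known:  # assumption: seen names are not rescanned
                self.known[name] = set(keys)

    def whereis(self, key: str) -> List[str]:
        """Archives that the recorded information says contain the key."""
        return sorted(n for n, keys in self.known.items() if key in keys)
```

In this model, deleting the only archive that held a key leaves `whereis` empty after the next sync, and deleting an archive then reusing its name before syncing leaves a stale record that still claims the key is present.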

@ -0,0 +1,5 @@
The tree generated by git-annex sync with a borg remote
does not seem to get grafted into the git-annex branch, so
would be subject to being lost to GC.
Is this a general problem affecting importtree too?

@ -0,0 +1,2 @@
The subject says it all, really: sync does not yet try to get content
from remotes that are thirdPartyPopulated.