external backends implemented
parent ea63d1dfe3
commit 049807dbba
6 changed files with 27 additions and 4 deletions
@@ -1,5 +1,8 @@
git-annex (8.20200720.2) UNRELEASED; urgency=medium

* Added support for external backend programs. So if you want a hash
that git-annex doesn't support, or something stranger, you can write a
small program to implement it.
* Fix a lock file descriptor leak that could occur when running commands
like git-annex add with -J. Bug was introduced as part of a different FD
leak fix in version 6.20160318.
@@ -2,4 +2,7 @@ Would it be hard to support MD5E keys that omit the -sSIZE part, the way this is

Another (and more generally useful) solution would be [[todo/alternate_keys_for_same_content/]]. Then can start with a URL-based key but then attach an MD5 to it as metadata, and have the key treated as a checksum-containing key, without needing to migrate the contents to a new key.

[[!tag moreinfo]]
> Closing, because [[external_backends]] is implemented, so you should be
> able to roll your own backend for your use case here. Assuming you can't
> just use regular MD5E and omit the file size field, which will work too.
> --[[Joey]]
@@ -4,5 +4,6 @@ It would be good if one could define custom external [[backends]], the way one c

Thoughts?

[[!tag needsthought]]
[[!tag projects/datalad]]

> fully implemented. [[done]] --[[Joey]]
@@ -1,3 +1,4 @@
Would it be hard to add a variantion to checksumming [[backends]], that would change how the checksum is computed: instead of computing it on the whole file, it would first be computed on file chunks of given size, and then the final checksum computed on the concatenation of the chunk checksums? You'd add a new [[key field|internals/key_format]], say cNNNNN, specifying the chunking size (the last chunk might be shorter). Then (1) for large files, checksum computation could be parallelized (there could be a config option specifying the default chunk size for newly added files); (2) I often have large files on a remote, for which I have md5 for each chunk, but not for the full file; this would enable me to register the location of these fies with git-annex without downloading them, while still using a checksum-based key.

[[!tag needsthought]]
> Closing, because [[external_backends]] is implemented, so you should be
> able to roll your own backend for your use case here. --[[Joey]]
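
The chunked checksum proposed in the hunk above can be sketched as follows. This is only an illustration of the idea, not git-annex code and not the external backend protocol. The function names `chunked_md5` and `combine_chunk_md5s` are hypothetical, and the sketch assumes the "concatenation of the chunk checksums" means the raw 16-byte MD5 digests rather than their hex strings, which the proposal does not specify.

```python
import hashlib


def combine_chunk_md5s(chunk_hex_digests):
    """Hash the concatenation of per-chunk MD5 digests.

    Assumption: the raw digest bytes are concatenated, not the hex text.
    """
    outer = hashlib.md5()
    for hex_digest in chunk_hex_digests:
        outer.update(bytes.fromhex(hex_digest))
    return outer.hexdigest()


def chunked_md5(data: bytes, chunk_size: int) -> str:
    """Compute the checksum-of-chunk-checksums for in-memory data.

    The last chunk may be shorter than chunk_size, as in the proposal.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    per_chunk = [
        hashlib.md5(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]
    return combine_chunk_md5s(per_chunk)
```

Because `combine_chunk_md5s` only needs the per-chunk digests, it covers use case (2) from the proposal: a final checksum can be derived from chunk MD5s that are already known, without downloading the file. The per-chunk hashes are independent of each other, which is what would allow use case (1), parallel computation.
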
@@ -12,4 +12,9 @@ This enables attaching metadata not to file contents, but to the file itself; or
deduplication. This loss may be acceptable. The loss can be mitigated for local repo and non-special remotes: after storing an object with e.g. MD5 d41d8cd98f00b204e9800998ecf8427e under .git/annex/objects, check if there is a symlink .git/annex/contenthash/d41d8cd98f00b204e9800998ecf8427e ; if not, make this a symlink to the object just stored; if yes,
erase the object just stored, and hardlink the symlink's target instead.

[[!tag unlikely moreinfo]]
> Closing since [[external_backends]] is implemented, and you could do this
> using it. Whether that's a good idea, I'm fairly doubtful about. Be sure
> to read "considerations for generating keys" in
> <https://git-annex.branchable.com/design/external_backend_protocol/#index7h2>
>
> [[done]] --[[Joey]]
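
The symlink-and-hardlink mitigation described in the hunk above could look roughly like this. It is a minimal sketch of the steps as the proposal words them, not something git-annex does. The function name `dedup_object` and its parameters are hypothetical, the caller is assumed to have already computed the MD5 of the stored object, and treating a dangling symlink as absent is an added assumption.

```python
import os


def dedup_object(obj_path: str, md5_hex: str, contenthash_dir: str) -> bool:
    """Deduplicate a just-stored object against a content-hash index.

    Returns True if obj_path was replaced by a hardlink to an existing
    object with the same MD5, False if it became the indexed copy.
    """
    os.makedirs(contenthash_dir, exist_ok=True)
    link = os.path.join(contenthash_dir, md5_hex)

    # No usable index entry yet: point the symlink at this object.
    if not os.path.exists(link):
        if os.path.islink(link):
            os.remove(link)  # assumption: a dangling symlink is replaced
        os.symlink(os.path.abspath(obj_path), link)
        return False

    target = os.path.realpath(link)
    if os.path.samefile(target, obj_path):
        return False  # already the indexed copy (or already hardlinked)

    # Same content is already stored: drop the new copy and hardlink
    # the existing one in its place. Not atomic; a real implementation
    # would need to link to a temporary name and rename.
    os.remove(obj_path)
    os.link(target, obj_path)
    return True
```

Hardlinks only work within one filesystem, which fits the proposal's restriction to the local repo and non-special remotes.
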
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2020-07-29T21:22:42Z"
content="""
[[external_backends]] is now implemented, so you can write a program that
makes keys use some other, shorter hash encoding.

I don't know if that's really sufficient to close this.
"""]]