Merge branch 'master' of ssh://git-annex.branchable.com
commit
6c8d53e11f
7 changed files with 159 additions and 0 deletions
|
@ -0,0 +1 @@
|
|||
A simple typo is causing this: https://github.com/dotlambda/git-annex/commit/a46eadab78efef81825302fc0de7963a46fc3b52
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="toh_corpora"
|
||||
avatar="http://cdn.libravatar.org/avatar/c4265d106fd775ab35231ea3f9696cb0"
|
||||
subject="comment 9"
|
||||
date="2018-11-27T14:10:53Z"
|
||||
content="""
|
||||
I have had a lot of success with some scripts I have written that externally assist git-annex in moving large files. Perhaps they can be at some point integrated into git-annex. I understand that this probably isn't a priority for Joey.
|
||||
|
||||
One thing that has made a huge difference for me is asynchronous verification of content. I have a turbo-copy script which simply copies keys from one repo to another externally, and then triggers an fsck in the target repo after it has finished copying everything. In many cases, this makes a huge improvement in transfer speed. I might have it start an fsck in the target as soon as each file is copied externally.
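
A minimal sketch of the copy-first, verify-later idea described above (this is not the actual turbo-copy script). It assumes a hypothetical layout in which each file is named after the SHA256 hex digest of its content; `turbo_copy` and `sha256_of` are illustrative names, and the final hash pass only stands in for running an fsck in the target repo.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    # Hash the file in chunks so large files are never loaded into memory whole.
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b''):
            digest.update(chunk)
    return digest.hexdigest()


def turbo_copy(source_dir, target_dir):
    # Phase 1: copy every file without verifying anything.
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(Path(source_dir).iterdir()):
        if item.is_file():
            copied.append(Path(shutil.copy2(item, target / item.name)))
    # Phase 2: deferred verification, standing in for a later fsck.
    # Assumption: each file is named after the SHA256 hex digest of its content.
    return [path.name for path in copied if sha256_of(path) != path.name]
```

The function returns the names of files that failed verification, so an empty list means every copy checked out.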
|
||||
|
||||
I have also been experimenting with using make to externally manage git-annex pipelines, where it makes more sense to do more simultaneous copies for smaller files, for certain backends.
|
||||
|
||||
One thing that would be amazingly helpful is if there were a way for a backend to inform git-annex ahead of time what it intends to do. For me, when I encrypt and upload files between 40M and 400M to Google Drive, git-annex spends about 50% of its time encrypting and 50% uploading. It would substantially improve performance if git-annex could get to work on the next file while the current one is uploading. I don't know if it is possible for a backend to do this. I think that it would have to falsely claim success to get git-annex to give it the next file.
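
The overlap described above can be sketched as a two-stage pipeline. This is only an illustration of the idea, not something git-annex or its remote interface offers: `pipelined_transfer`, `encrypt` and `upload` are hypothetical names, and the two callables stand in for the real encryption and upload steps.

```python
import queue
import threading


def pipelined_transfer(files, encrypt, upload, max_pending=1):
    # files is an iterable of (name, data) pairs.
    # Bounded queue: encryption runs at most max_pending files ahead of the upload.
    pending = queue.Queue(maxsize=max_pending)
    uploaded = []
    errors = []

    def uploader():
        while True:
            item = pending.get()
            if item is None:  # sentinel: nothing left to upload
                return
            if errors:
                continue  # keep draining so the producer never blocks
            name, blob = item
            try:
                upload(name, blob)
                uploaded.append(name)
            except Exception as exc:
                errors.append(exc)

    worker = threading.Thread(target=uploader)
    worker.start()
    try:
        for name, data in files:
            # Encrypt the next file while the previous one is still uploading.
            pending.put((name, encrypt(data)))
    finally:
        pending.put(None)
        worker.join()
    if errors:
        raise errors[0]
    return uploaded
```

With encryption and upload each taking roughly half of the time, as in the case above, this kind of overlap could approach a twofold speedup, but only if the two steps do not compete for the same resource.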
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="toh_corpora"
|
||||
avatar="http://cdn.libravatar.org/avatar/c4265d106fd775ab35231ea3f9696cb0"
|
||||
subject="comment 5"
|
||||
date="2018-11-27T13:59:36Z"
|
||||
content="""
|
||||
It is not an exaggeration to say that Git Annex has improved my life. I have used it to organize massive collections of files with great efficiency. I have also been using it to organize and process almost 40 years of video content.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="andrew"
|
||||
avatar="http://cdn.libravatar.org/avatar/acc0ece1eedf07dd9631e7d7d343c435"
|
||||
subject="comment 4"
|
||||
date="2018-11-26T15:42:36Z"
|
||||
content="""
|
||||
After thinking about this for a few days, I am thinking that the existing `git-annex export` functionality can work well. I think I will have users specify a local directory in their annex called something like `public-share`, along with a single public exporttree remote to use with that local share. Whenever the user clicks `Share` on a single file (or a folder, or a multi-selection of files and folders), I'll just create a new sub-directory in `public-share` called something like `public-share/CURRENT_DATETIME/` and place all of the new files to share in there. Then I'll do an export like: `git-annex export master:public-share --to=public-tree-remote`. This takes advantage of the existing export functionality and has the added benefit of giving the user a local record of all files that are currently publicly shared, which seems pretty useful.
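
The workflow above can be sketched roughly as follows. The helper names `stage_share` and `export_command` and the timestamp format are assumptions made for illustration; only the `git-annex export` invocation itself comes from the description above. The sketch handles plain files only and does not run any command.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def stage_share(annex_root, paths, now=None):
    # Create public-share/CURRENT_DATETIME/ and copy the selected files into it.
    # Assumption: now, if given, is a UTC datetime.
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime('%Y-%m-%dT%H-%M-%SZ')
    share_dir = Path(annex_root) / 'public-share' / stamp
    share_dir.mkdir(parents=True)
    for path in paths:
        shutil.copy2(path, share_dir / Path(path).name)
    return share_dir


def export_command(branch='master', remote='public-tree-remote'):
    # The export invocation quoted above, as an argv list for subprocess.run.
    # The staged files must be added and committed to the branch first,
    # because export works from the committed tree.
    return ['git-annex', 'export', f'{branch}:public-share', f'--to={remote}']
```

Keeping the command as an argv list avoids shell quoting problems if the branch or remote name ever contains unusual characters.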
|
||||
"""]]
|
|
@ -0,0 +1,65 @@
|
|||
[[!comment format=mdwn
|
||||
username="anarcat"
|
||||
avatar="http://cdn.libravatar.org/avatar/4ad594c1e13211c1ad9edb81ce5110b7"
|
||||
subject="progress?"
|
||||
date="2018-11-27T06:47:26Z"
|
||||
content="""
|
||||
How's that remote going, RonnyPfannschmidt? :) I can't tell from the [homepage](https://github.com/RonnyPfannschmidt/git-annex-borg/) but from the source code, it looks like initremote is supported so far, but not much else...
|
||||
|
||||
From what I remember, borg supports storing arbitrary blobs with the `borg debug-put-obj` command, and retrieving them with `borg debug-get-obj`. Here's an example of how this could work:
|
||||
|
||||
    [1145]anarcat@angela:test$ sha256sum /etc/motd
    a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 /etc/motd
    [1146]anarcat@angela:test$ borg init -e none repo
    [1147]anarcat@angela:test$ borg debug-put-obj repo /etc/motd
    object a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 put.
    [1148]anarcat@angela:test$ borg debug-get-obj repo a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 tmp
    object a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 fetched.
    [1149]anarcat@angela:test$ sha256sum tmp
    a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 tmp
|
||||
|
||||
This assumes the underlying blob ID in borg is a SHA256 hash, but that
seems like a fair assumption to make. Naturally, this could cause
problems with git-annex, which supports multiple hashing algorithms
thanks to the multiple [[backends]] support. But maybe the remote can just
work around this by refusing to store keys from non-matching backends.
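
Refusing non-matching backends could look roughly like this. It is a sketch that assumes the usual git-annex key layout `BACKEND-sSIZE--HASH`, and, as above, that the borg object ID is the SHA256 of the content; `borg_object_id` is a hypothetical helper name, not part of any existing remote.

```python
import re


def borg_object_id(key):
    # Split BACKEND-sSIZE--HASH into its head (backend and fields) and its name.
    head, separator, name = key.partition('--')
    backend = head.split('-', 1)[0]
    # Only plain SHA256 keys map onto the assumed borg object IDs.
    if not separator or backend != 'SHA256':
        raise ValueError(f'refusing key with non-matching backend: {key}')
    if not re.fullmatch(r'[0-9a-f]{64}', name):
        raise ValueError(f'not a SHA256 hex digest: {name}')
    return name
```

Keys from the extension-preserving `SHA256E` backend are rejected as well, since their name part carries a file extension after the digest.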
|
||||
|
||||
That is, if borg actually worked that way. Unfortunately, while the
|
||||
above actually works, the resulting repository is not quite right:
|
||||
|
||||
    $ borg debug dump-repo-objs .
    Dumping 000000_0000000000000000000000000000000000000000000000000000000000000000.obj
    Data integrity error: Chunk a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921: Invalid encryption envelope
|
||||
|
||||
So borg does not like the repository at all... I'm not sure why, but
|
||||
it sure looks like borg \"objects\" are not as transparent as I
|
||||
hoped and that this low-level interface will not be suitable for
|
||||
git-annex.
|
||||
|
||||
The higher level interface is \"archives\", which have (more or less) a
|
||||
CRUD interface (without the U, really) through the
|
||||
\"create/list/extract/prune\" interface. It's far from what we need:
|
||||
items are deduplicated across archives, which means it is impossible to
|
||||
reliably delete a key unless we walk (and modify!) the entire archive list, which is
|
||||
slow and impractical. But it *could* definitely be used to add keys to
|
||||
a repository, using:
|
||||
|
||||
    $ time borg create --stdin-name SHA256-a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 .::'{utcnow}' - < /etc/motd
    1.30user 0.10system 0:01.62elapsed 86%CPU (0avgtext+0avgdata 81464maxresident)k
    72inputs+1496outputs (0major+31135minor)pagefaults 0swaps
|
||||
|
||||
As you can see, however, that is *slow* (although arguably not slower
|
||||
than `debug-put-obj` which is surprising).
|
||||
|
||||
But even worse, that blob is now hidden behind that archive - you'd
|
||||
need to list all archives (which is also expensive) to find it.
|
||||
|
||||
So I hit a dead end, and I'm curious to hear how you were planning to
|
||||
implement this, Ronny. :) Presumably there should be a way to generate
|
||||
an object compatible with `debug-put-obj`, but that interface seems
|
||||
very brittle and has all sorts of warnings all around it... And on the
|
||||
other hand, the archive interface is clunky and slow... I wish there
|
||||
was a better way, and suspect it might be worth talking with upstream
|
||||
(which I'm not anymore) to see if there's a better way to work this
|
||||
problem. -- [[anarcat]]
|
||||
"""]]
|
|
@ -0,0 +1,55 @@
|
|||
[[!comment format=mdwn
|
||||
username="anarcat"
|
||||
avatar="http://cdn.libravatar.org/avatar/4ad594c1e13211c1ad9edb81ce5110b7"
|
||||
subject="restic"
|
||||
date="2018-11-27T07:13:29Z"
|
||||
content="""
|
||||
And for what it's worth, borg's main rival, restic, handles this much better and faster:
|
||||
|
||||
    [1331]anarcat@angela:test$ RESTIC_PASSWORD=test restic init -r repo4
    created restic repository 2c75411732 at repo4

    Please note that knowledge of your password is required to access
    the repository. Losing your password means that your data is
    irrecoverably lost.
    [1334]anarcat@angela:test1$ RESTIC_PASSWORD=test time restic -r repo4 backup --stdin --stdin-filename SHA256-a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 < /etc/motd
    repository 2c754117 opened successfully, password is correct
    created new cache in /home/anarcat/.cache/restic

    Files: 1 new, 0 changed, 0 unmodified
    Dirs: 0 new, 0 changed, 0 unmodified
    Added to the repo: 656 B

    processed 1 files, 0 B in 0:00
    snapshot 87c0db00 saved
    0.55user 0.04system 0:00.80elapsed 73%CPU (0avgtext+0avgdata 48384maxresident)k
    0inputs+88outputs (0major+9665minor)pagefaults 0swaps
    [1337]anarcat@angela:test$ RESTIC_PASSWORD=test time restic -r repo4 backup --stdin --stdin-filename SHA256-a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921 < /etc/motd
    repository 2c754117 opened successfully, password is correct

    Files: 0 new, 1 changed, 0 unmodified
    Dirs: 0 new, 0 changed, 0 unmodified
    Added to the repo: 370 B

    processed 1 files, 0 B in 0:00
    snapshot 5b3af830 saved
    0.55user 0.04system 0:00.80elapsed 73%CPU (0avgtext+0avgdata 48568maxresident)k
    0inputs+64outputs (0major+9691minor)pagefaults 0swaps
    [1348]anarcat@angela:test$ RESTIC_PASSWORD=test time restic -r repo4 backup --stdin --stdin-filename SHA256-533128ceb96cb2a6d8039453c3ecf202586c0e001dce312ecbd6a7a356b201dc < ~/folipon.jpg
    repository 2c754117 opened successfully, password is correct

    Files: 1 new, 0 changed, 0 unmodified
    Dirs: 0 new, 0 changed, 0 unmodified
    Added to the repo: 372 B

    processed 1 files, 0 B in 0:00
    snapshot 18879aa4 saved
    0.54user 0.03system 0:00.78elapsed 73%CPU (0avgtext+0avgdata 48504maxresident)k
    0inputs+64outputs (0major+9700minor)pagefaults 0swaps
    [1349]anarcat@angela:test$ RESTIC_PASSWORD=test time restic -r repo4 dump latest SHA256-533128ceb96cb2a6d8039453c3ecf202586c0e001dce312ecbd6a7a356b201dc | sha256sum -
    0.50user 0.02system 0:00.73elapsed 72%CPU (0avgtext+0avgdata 47848maxresident)k
    0inputs+8outputs (0major+9513minor)pagefaults 0swaps
    533128ceb96cb2a6d8039453c3ecf202586c0e001dce312ecbd6a7a356b201dc -
|
||||
|
||||
Of course it doesn't validate those checksums, and might freak out with the number of snapshots we would create, but it's a much better start than borg. ;)
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="techcustomersupport"
|
||||
avatar="http://cdn.libravatar.org/avatar/150632946f1b1da29beadfb6e4ce513b"
|
||||
subject="adding a remote"
|
||||
date="2018-11-26T11:43:08Z"
|
||||
content="""
|
||||
To add a new remote, use the git remote add command on the terminal, in the directory your repository is stored at. The git remote add command takes two arguments: A remote name, for example, origin. For more information visit - <a href=\"https://www.applesupportphonenumbers.com/blog/fix-mac-error-code-36/\">https://www.applesupportphonenumbers.com/blog/fix-mac-error-code-36/</a>
|
||||
"""]]
|