Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2020-12-28 17:05:00 -04:00
commit 4262ba3c44
4 changed files with 76 additions and 9 deletions


@ -0,0 +1,9 @@
For some file types, e.g. images and sound, I would also like to add a locality-sensitive hash (LSH) and be able to find duplicates that way. E.g. that would allow me to find duplicate images where the image metadata was changed, or maybe the quality was changed, etc.
Ideally, I would want to add such metadata to Git-Annex. I'm not sure if this should be a backend ([[backends]]) or kept separate from it (as there will be collisions, by design).
I also want fast lookups for a specific hash. I'm not sure whether the Git-Annex metadata allows for that. A Git-Annex backend naturally has this feature, but then how would it handle collisions? I would want this LSH backend to just link to a list of matching files, i.e. symlinks to the real files (via the SHA backend or so).
Is anything like that already supported?
If not, would it be possible to add such support to Git-Annex? How?
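
To make the idea concrete, here is a minimal sketch of the bucket layout I have in mind, assuming bash and GNU coreutils. It is not an existing Git-Annex feature. The function names `lsh_index_add` and `lsh_index_lookup` and the index directory are hypothetical, and the LSH value itself is assumed to come from some external tool and is simply passed in as a string.

```shell
#!/bin/bash
# Hypothetical sketch, NOT a git-annex feature.
# Each bucket is a directory named after an LSH value. It holds one symlink
# per matching file, named after the SHA-256 of the file's content, so that
# colliding LSH values simply share a bucket.

# Usage: lsh_index_add <index_dir> <lsh_value> <file>
lsh_index_add() {
    local index_dir=$1 lsh=$2 file=$3
    local sha target
    sha=$(sha256sum -- "$file" | cut -d' ' -f1)   # content hash used as link name
    target=$(readlink -f -- "$file")              # absolute path to the real file
    mkdir -p -- "$index_dir/$lsh"
    ln -sfn -- "$target" "$index_dir/$lsh/$sha"
}

# Usage: lsh_index_lookup <index_dir> <lsh_value>
# Prints the real path of every file in the bucket, one per line.
lsh_index_lookup() {
    local index_dir=$1 lsh=$2 link
    [ -d "$index_dir/$lsh" ] || return 0
    for link in "$index_dir/$lsh"/*; do
        if [ -L "$link" ]; then
            readlink -- "$link"
        fi
    done
    return 0
}
```

A lookup is then a single directory listing, and a bucket with more than one entry is a set of candidate duplicates.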


@ -0,0 +1,17 @@
I have not used git-annex before but I wonder whether this is for me.
I have some big existing directories, e.g. with family pictures (several 100 GBs), e.g. in `~/Pictures`. I already have multiple copies of this pictures directory on multiple media (other (remote) servers, hard drives, some DVDs, etc).
I wonder about the recommended workflow now. In the documentation, it is explained how to create a new Git Annex Repo, where I would copy over the data. But I don't want to copy over the data. I want to keep them in `~/Pictures`, and also make use of other existing copies (I cannot even modify some of them anymore, such as my readonly DVDs).
I thought that Git-Annex would just help me keep track of multiple copies. How would I import such a directory?
I read briefly about Git Worktree, and I wonder whether that is supposed to be for this use case?
Or maybe this should be a bare repo?
Or should I create the new Git Annex Repo directly in `~/Pictures`? I.e. I would do `cd ~/Pictures; git init; git annex init`? How would I now add the other copies of `Pictures`? How would I deal with readonly copies of `Pictures` like DVDs?
Also, I don't just want to store the pictures but also other stuff (e.g. `~/Music`). I'm not sure if I should create separate repos for that, or whether it makes more sense to keep them all in one big repo?
I read [[how it works]] and [[workflow]] but they do not really answer my questions.


@ -0,0 +1,50 @@
On some occasions `annex.adjustedbranchrefresh` is ignored when `git annex sync` is run in a branch created with `adjust --unlock-present`.
If `annex.adjustedbranchrefresh` is set to 1, one would expect git-annex to automatically adjust the branch once a file has been `git annex add`-ed or the repository is `git annex sync`-ed. However this does not happen and a manual `git annex adjust --unlock-present` is required.
Is this a bug or am I misunderstanding how `annex.adjustedbranchrefresh` is supposed to work?
The following script reproduces this bug.
```
#!/bin/bash
set -eux
rm -Rvf /tmp/an-repo.git && mkdir /tmp/an-repo.git && cd /tmp/an-repo.git
git init --bare
n=$(date +%s) ; mkdir /tmp/ga-$n && cd /tmp/ga-$n
git clone --no-local --no-hardlinks /tmp/an-repo.git
cd an-repo/
git config user.email "email@example.com" ; git config user.name "Name Name"
git config annex.thin true
git config annex.adjustedbranchrefresh 1
git config remote.origin.annex-ignore true
# 8.20201117 is the version in the standalone tarball of 8.20201127
~/Applications/git-annex/8.20201117-ga314537cd/runshell bash -c '
git annex init foobar
echo "aaaa" > a && echo "bbbb" > b
git annex add a b
git annex sync
git annex adjust --unlock-present
git annex sync
echo "cccc" > c && echo "dddd" > d
git annex add c d
echo "## before sync"
stat -c "%n: %F" a b c d
git annex sync
echo "## after sync"
stat -c "%n: %F" a b c d # should show four regular files, but shows two files and two symlinks
git annex sync --content;
echo "## after sync --content"
stat -c "%n: %F" a b c d # ibid
'
```
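
The expectation stated in the script's comments (all four files should be regular files after the sync) can also be checked mechanically. This is a minimal sketch, assuming bash. The helper name `assert_unlocked` is hypothetical and not part of git-annex.

```shell
#!/bin/bash
# Hypothetical helper, not part of git-annex.
# Succeeds only if every argument is a regular file and not a symlink.
# Prints each offending path to stderr.
assert_unlocked() {
    local f status=0
    for f in "$@"; do
        if [ -L "$f" ] || [ ! -f "$f" ]; then
            echo "not a regular file: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

To use it in the reproduction above, the function would have to be defined inside the `bash -c` string, and `assert_unlocked a b c d` would go after each `git annex sync`.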


@ -1,9 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 1"
date="2020-11-20T22:16:34Z"
content="""
If I move a symlink up one directory (thus breaking it), git-annex fix \"symlink-name\" doesn't do anything. The only way I could find to fix it was to do a git annex add $symlink-name. What am I doing wrong?
"""]]