Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2020-05-19 13:01:08 -04:00
commit f46006a9c1
GPG key ID: DB12DB0FF05F8F38
5 changed files with 168 additions and 0 deletions

@ -0,0 +1,69 @@
### Please describe the problem.
After unlocking a large file on the Synology NAS, every subsequent git command (git status, git diff, ...) fails with "Cannot handle files this big", and so do the git-annex commands that depend on them (git annex status, git annex sync, etc.).
Unfortunately, I lack the technical understanding of how the pointer files are hidden from git, although I have seen the smudge/clean filters.
How does it work?
After a file is unlocked, it is physically part of the working tree and also part of the git history, so git status/diff will naturally try to index and check that file without running the annex filter, which then results in this error message, right?
So how, technically, are these unlocked pointer files hidden so that git does not index and check them?
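For readers following the question above, the general mechanism can be sketched with plain git. This only illustrates how a clean filter lets git store a small stand-in instead of a file's content. The `demo` filter name, the `pointer:` format, and the temporary repository are all made up for this sketch and are not git-annex's actual configuration.

```shell
#!/bin/sh
# Illustration only: a made-up "demo" clean filter, not git-annex's real one.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# The clean filter receives the file content on stdin whenever git stages
# the file; whatever it prints is what git actually stores.
git config filter.demo.clean 'printf "pointer:"; git hash-object --stdin'

# Route every file through the filter via the repository-local attributes file.
mkdir -p .git/info
echo '* filter=demo' > .git/info/attributes

printf 'pretend this is a very large file\n' > output
git add output

# The index holds the short pointer line; the working tree keeps the content.
git cat-file -p :output
cat output
```

In a real repository, the filter that applies to a given file can be checked with `git check-attr filter -- output`.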
### What steps will reproduce the problem?
$ git init
Initialized empty Git repository in /volume1/homes/admin/git-annex/test3/.git/
$ git annex init --version 8
init (scanning for unlocked files...)
ok
(recording state in git...)
$ ls -lah
total 12K
drwxr-xr-x 3 admin users 4.0K May 16 17:00 .
drwxr-xr-x 8 admin users 4.0K May 16 17:00 ..
drwxr-xr-x 9 admin users 4.0K May 16 17:01 .git
-rw-r--r-- 1 admin users 20G May 16 17:00 output
$ git annex add output
add output
$ git annex sync
$ git annex unlock output
$ git annex status
fatal: Cannot handle files this big
git-annex: git status failed
$ git status
fatal: Cannot handle files this big
$ git diff
fatal: Cannot handle files this big
Why does git even try to load this file?
### What version of git-annex are you using? On what operating system?
Synology NAS
git-annex version: 8.20200331-g111b747be
build flags: Assistant Webapp Pairing S3 WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.1.0 ghc-8.6.5 http-client-0.5.14 persistent-sqlite-2.9.3 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external
operating system: linux arm
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
git version 2.26.1
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Unlocking files (with thin set to true) seems to be the perfect solution for me - I just have to understand what's going on under the hood ;)

@ -0,0 +1,57 @@
### Please describe the problem.
With a newer version of git-annex (8.20200226, compared with 6.20170101), the .git/annex/creds directory is not created after `git annex enableremote s3`.
With the older version, 6.20170101, the creds are embedded all right.
[[!format sh """
git annex info s3:
remote: s3
type: S3
creds: embedded in git repository (not encrypted)
"""]]
### What steps will reproduce the problem?
This is a bit tricky, as the remote was set up long ago, with who knows which git-annex version. Given an S3 remote with embedcreds=yes,
run `git annex enableremote s3` with versions 8.20200226 and 6.20170101:
[[!format sh """
# version 8.20200226
$ git annex enableremote s3
$ ls .git/annex/creds
ls: cannot access '.git/annex/creds': No such file or directory
# version 6.20170101
$ git annex enableremote s3
$ ls ../release-archive__/.git/annex/creds
<s3-UUID>
$ git annex info s3:
remote: s3
type: S3
creds: embedded in git repository (not encrypted)
"""]]
### What version of git-annex are you using? On what operating system?
8.20200226 and 6.20170101
Linux, Ubuntu 20.04
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="jeanpmbox-456@7222359de8d1f37a7cf25a519e8faf90a9517b50"
nickname="jeanpmbox-456"
avatar="http://cdn.libravatar.org/avatar/164eb4254c5f83d95d3e0b810ff7aab9"
subject="comment 4"
date="2020-05-15T21:21:29Z"
content="""
Is there an option for git-annex to recognize hard links inside a repository?
I have a repository where I want a file to appear in several places, but when I add this file to git-annex I can't preserve the hard link structure.
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="jeanpmbox-456@7222359de8d1f37a7cf25a519e8faf90a9517b50"
nickname="jeanpmbox-456"
avatar="http://cdn.libravatar.org/avatar/164eb4254c5f83d95d3e0b810ff7aab9"
subject="comment 3"
date="2020-05-15T21:14:53Z"
content="""
It came when I installed git-annex; I did not ask for it, nor did I tick a box to have this script loaded at startup. I installed git-annex in a different directory, but the entry that was added to the start menu did not take that into account and pointed to the default installation directory.
"""]]

@ -0,0 +1,22 @@
I have a large annex (~200k files) on a server with a thin checkout on my laptop.
The scenario I am trying to achieve is to have, by default:
- All files always present on the server
- Only files added locally/pulled manually on the laptop
It seems this can be achieved by setting preferred content to `present` on the laptop and required content to `*` on the server, and then regularly calling:
```
$ git annex sync --content
```
However, this is very slow compared to `--no-content`, which takes seconds; it seems to iterate over the whole repository to check for presence.
I'm not too familiar with git-annex's internals, but it seems finding a sparse set of present files is already implemented efficiently:
```
$ time git annex find --in=here
```
takes 5 seconds to complete on the laptop.
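
The setup described above can be sketched as follows, run from the laptop. This is a sketch under assumptions: `server` is a hypothetical name for the server's remote, and `anything` is used as the match-all expression, which is one reading of "required `*`".

```shell
# Sketch only; "server" is a hypothetical name for the server's remote.

# Laptop: want only the files that are already present here.
git annex wanted here present

# Server: require every file.
git annex required server anything

# Then sync content as usual.
git annex sync --content
```

Both settings are recorded in the git-annex branch, so they reach the server with the next sync.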