Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2022-01-19 11:51:11 -04:00
commit fb11ffe594
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
7 changed files with 408 additions and 0 deletions

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 2"
date="2022-01-13T18:10:22Z"
content="""
I found an even better workaround, see [[tips/using_nested_git_repositories/]]
"""]]

@ -0,0 +1,292 @@
### Please describe the problem.
On Windows (not tested on Unix), if an annex repository is cloned from its .git directory rather than from the workdir, `git annex get` fails in the clone, whereas it succeeds in a clone made from the workdir.
### What steps will reproduce the problem?
[[!format sh """
shaddy@COMPUTER-W10 U:\Temp
> git annex version
git-annex version: 8.20211118-g2e2d35869
build flags: Assistant Webapp Pairing TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.29 DAV-1.3.4 feed-1.3.2.0 ghc-8.10.7 http-client-0.7.9 persistent-sqlite-2.13.0.3 torrent-10000.1.1 uuid-1.3.15 yesod-1.6.1.2
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: mingw32 x86_64
supported repository versions: 8
upgrade supported from repository versions: 2 3 4 5 6 7
shaddy@COMPUTER-W10 U:\Temp
> dir
Volume in drive U has no label.
Volume Serial Number is D684-6493
Directory of U:\Temp
13/01/2022 01:42 PM <DIR> .
13/01/2022 01:42 PM <DIR> ..
10/04/2020 07:36 PM 14,572,000 vc_redist.x64.exe
1 File(s) 14,572,000 bytes
2 Dir(s) 21,238,497,280 bytes free
shaddy@COMPUTER-W10 U:\Temp
> git init --separate-git-dir separategitmaster.git separategitmaster
Initialized empty Git repository in U:/Temp/separategitmaster.git/
shaddy@COMPUTER-W10 U:\Temp
> cd separategitmaster
shaddy@COMPUTER-W10 U:\Temp\separategitmaster
> git annex init master
init master
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/master(unlocked)'
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separategitmaster
> copy ..\vc_redist.x64.exe .
1 file(s) copied.
shaddy@COMPUTER-W10 U:\Temp\separategitmaster
> git annex add vc_redist.x64.exe
add vc_redist.x64.exe
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separategitmaster
> git commit -m vc_redist.x64.exe
[adjusted/master(unlocked) a32d7ba] vc_redist.x64.exe
1 file changed, 1 insertion(+)
create mode 100755 vc_redist.x64.exe
shaddy@COMPUTER-W10 U:\Temp\separategitmaster
> cd ..
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> REM note, cloning from workdir, not .git
shaddy@COMPUTER-W10 U:\Temp
> git clone separategitmaster separategitslave
Cloning into 'separategitslave'...
done.
shaddy@COMPUTER-W10 U:\Temp
> cd separategitslave
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> dir
Volume in drive U has no label.
Volume Serial Number is D684-6493
Directory of U:\Temp\separategitslave
13/01/2022 01:45 PM <DIR> .
13/01/2022 01:45 PM <DIR> ..
13/01/2022 01:45 PM 108 vc_redist.x64.exe
1 File(s) 108 bytes
2 Dir(s) 21,209,219,072 bytes free
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> git config user.name "Shaddy Baddah"
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> git annex init slave
init slave
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> git annex sync
commit
On branch adjusted/master(unlocked)
Your branch is up to date with 'origin/adjusted/master(unlocked)'.
nothing to commit, working tree clean
ok
pull origin
ok
push origin
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 2 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 718 bytes | 359.00 KiB/s, done.
Total 6 (delta 0), reused 0 (delta 0), pack-reused 0
To U:/Temp/separategitmaster
* [new branch] master -> synced/master
* [new branch] git-annex -> synced/git-annex
ok
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> git annex get vc_redist.x64.exe
get vc_redist.x64.exe (from origin...)
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> REM All ok. But now lets clone from the .git dir
shaddy@COMPUTER-W10 U:\Temp\separategitslave
> cd ..
shaddy@COMPUTER-W10 U:\Temp
> git init --separate-git-dir separatetake2gitmaster.git separatetake2gitmaster
Initialized empty Git repository in U:/Temp/separatetake2gitmaster.git/
shaddy@COMPUTER-W10 U:\Temp
> cd separatetake2gitmaster
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> git annex init master
init master
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/master(unlocked)'
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> copy ..\vc_redist.x64.exe .
1 file(s) copied.
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> git annex add vc_redist.x64.exe
add vc_redist.x64.exe
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> git commit -m vc_redist.x64.exe
git-annex.exe: .\vc_redist.x64.exe: DeleteFile "\\\\?\\U:\\Temp\\separatetake2gitmaster\\vc_redist.x64.exe": permission denied (The process cannot access the file because it is being used by another process.)
error: external filter 'git-annex smudge --clean -- %f' failed 1
error: external filter 'git-annex smudge --clean -- %f' failed
[adjusted/master(unlocked) 50bb6c7] vc_redist.x64.exe
1 file changed, 1 insertion(+)
create mode 100755 vc_redist.x64.exe
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> REM unfortunate Windows gotcha, but I believe it doesnt effect the scenario
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> dir
Volume in drive U has no label.
Volume Serial Number is D684-6493
Directory of U:\Temp\separatetake2gitmaster
13/01/2022 01:47 PM <DIR> .
13/01/2022 01:47 PM <DIR> ..
10/04/2020 07:36 PM 14,572,000 vc_redist.x64.exe
1 File(s) 14,572,000 bytes
2 Dir(s) 21,149,949,952 bytes free
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> cd ..\
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitmaster
> REM cloning from the .git this time
shaddy@COMPUTER-W10 U:\Temp
> git clone separatetake2gitmaster.git separatetake2gitslave
Cloning into 'separatetake2gitslave'...
done.
shaddy@COMPUTER-W10 U:\Temp
> cd separatetake2gitslave
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> git annex init slave
init slave
Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
ok
(recording state in git...)
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> dir
Volume in drive U has no label.
Volume Serial Number is D684-6493
Directory of U:\Temp\separatetake2gitslave
13/01/2022 01:48 PM <DIR> .
13/01/2022 01:48 PM <DIR> ..
13/01/2022 01:48 PM 108 vc_redist.x64.exe
1 File(s) 108 bytes
2 Dir(s) 21,149,868,032 bytes free
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> git annex sync
commit
On branch adjusted/master(unlocked)
Your branch is up to date with 'origin/adjusted/master(unlocked)'.
nothing to commit, working tree clean
ok
pull origin
ok
push origin
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 2 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (6/6), 719 bytes | 719.00 KiB/s, done.
Total 6 (delta 0), reused 0 (delta 0), pack-reused 0
To U:/Temp/separatetake2gitmaster.git
* [new branch] master -> synced/master
* [new branch] git-annex -> synced/git-annex
ok
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> git annex get vc_redist.x64.exe
get vc_redist.x64.exe
Unable to access these remotes: origin
No other repository is known to contain the file.
failed
get: 1 failed
shaddy@COMPUTER-W10 U:\Temp\separatetake2gitslave
> REM this is a confusing situation and repeatable
# End of transcript or log.
"""]]
### What version of git-annex are you using? On what operating system?
shaddy@COMPUTER-W10 U:\Temp
> git annex version
git-annex version: 8.20211118-g2e2d35869
### Please provide any additional information below.
[[!format sh """
Note, in my example, I'm using a .git directory that has been "separated" from its workdir.
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes. Git annex is very good. I acknowledge that the way I use it can be unconventional.

@ -0,0 +1,74 @@
Hi, if anyone can suggest some steps to take to diagnose a sudden performance problem in my git-annex repo, I'd appreciate it.
I have been using git annex for several years on this same repo and performance has never been a problem.
I recently did an OS update of all my packages (Arch Linux), which updated git annex, and now performance is really bad whenever I run `git status` or `git commit` in the repo (performance is fine in git repos without git annex). For an example of how bad the performance is, I first noticed the problem when trying to make a commit, and I waited 20 minutes before cancelling with Ctrl+C. The update took me from package version "8.20210310-16" to "8.20210803-63". After noticing the problem I updated again to package version "8.20210803-81", since I was hoping there might be a fix in the most recent version, but that didn't resolve it. Those versions might be specific to Arch Linux, so if that's the case and anybody needs the true versions let me know and I can try to figure out which git-annex commit they are at.
To try to get a sense of what is going on, I started by running `GIT_TRACE_PERFORMANCE=1 git commit`. One command that git runs seems to be followed by a non-trivial delay before the next command starts, and it's:
18:14:31.968248 trace.c:487 performance: 247.576906378 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch-check=%(objectname) %(objecttype) %(objectsize)'
The time difference between this command and the next is over 1 second, and it seems that `git commit` is running this type of command many times. I posted a sample of the full output below.
When searching the web for that git command, I came across this conversation: https://git-annex.branchable.com/todo/speed_up_git_annex_sync_--content_--all/
Could that be related to my issue?
What should I do next to figure out what the problem is?
I posted the output of `git annex version` and `git annex info` below in case that is helpful.
Thanks!
----------
`git annex version`:
git-annex version: 8.20210803-g9cae7c5bb
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.29 DAV-1.3.4 feed-1.3.2.0 ghc-9.0.2 http-client-0.7.9 persistent-sqlite-2.13.0.3 torrent-10000.1.1 uuid-1.3.15 yesod-1.6.1.2
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
`git annex info` (omitting repository info):
available local disk space: 146.18 gigabytes (+1 megabyte reserved)
local annex keys: 2120
local annex size: 2.59 gigabytes
annexed files in working tree: 1945
size of annexed files in working tree: 2.55 gigabytes
bloom filter size: 32 mebibytes (0.4% full)
backend usage:
SHA256E: 1945
sample of the output of `GIT_TRACE_PERFORMANCE=1 git commit`:
18:14:31.949865 trace.c:487 performance: 0.031945775 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch-check=%(objectname) %(objecttype) %(objectsize)' --buffer
18:14:31.950372 trace.c:487 performance: 0.031692879 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch=%(objectname) %(objecttype) %(objectsize)' --buffer
18:14:31.963858 trace.c:487 performance: 0.007070777 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs hash-object --stdin-paths --no-filters
18:14:31.967006 trace.c:487 performance: 247.578879398 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
18:14:31.968248 trace.c:487 performance: 247.576906378 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch-check=%(objectname) %(objecttype) %(objectsize)'
18:14:33.217226 trace.c:487 performance: 0.000427304 s: git command: git config --null --list
18:14:33.244772 read-cache.c:2398 performance: 0.011644939 s: read cache .git/index
18:14:33.245219 read-cache.c:2398 performance: 0.001136355 s: read cache .git/index
18:14:33.263160 trace.c:487 performance: 0.001024382 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs show-ref --hash refs/annex/last-index
18:14:33.274091 read-cache.c:2398 performance: 0.001037856 s: read cache .git/index
18:14:33.284793 unpack-trees.c:1802 performance: 0.000024725 s: traverse_trees
18:14:33.284872 unpack-trees.c:418 performance: 0.000001050 s: check_updates
18:14:33.284917 unpack-trees.c:1892 performance: 0.000349269 s: unpack_trees
18:14:33.289148 diff-lib.c:629 performance: 0.004585392 s: diff-index
18:14:33.289220 trace.c:487 performance: 0.017392572 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs -c filter.annex.smudge= -c filter.annex.clean= -c diff.external= diff d62ecb841c32ff0d62df4f8ca7cb0567616cf415 --staged --raw -z --no-abbrev -G/annex/objects/ --no-renames --ignore-submodules=all --no-ext-diff
18:14:33.294517 trace.c:487 performance: 0.025457813 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch-check=%(objectname) %(objecttype) %(objectsize)' --buffer
18:14:33.297215 trace.c:487 performance: 0.024423869 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch=%(objectname) %(objecttype) %(objectsize)' --buffer
18:14:33.303299 trace.c:487 performance: 0.000746262 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs hash-object --stdin-paths --no-filters
18:14:33.305448 trace.c:487 performance: 0.076031969 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
18:14:33.308580 trace.c:487 performance: 0.075975445 s: git command: git --git-dir=.git --work-tree=. --literal-pathspecs cat-file '--batch-check=%(objectname) %(objecttype) %(objectsize)'
18:14:34.537129 trace.c:487 performance: 0.000424618 s: git command: git config --null --list
18:14:34.570720 read-cache.c:2398 performance: 0.008743858 s: read cache .git/index
18:14:34.573279 read-cache.c:2398 performance: 0.001075431 s: read cache .git/index
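
When a trace like the sample above gets long, it can help to rank the entries by the elapsed time git reports for each one. The following is only a sketch: `slowest_trace_entries` is a hypothetical helper name (not a git or git-annex command), and it assumes every relevant line has the `performance: <seconds> s:` shape shown in the sample.

```shell
#!/bin/sh
# Sketch: list the slowest entries from a GIT_TRACE_PERFORMANCE log.
# slowest_trace_entries is a hypothetical helper, not part of git or git-annex.
# Usage: slowest_trace_entries TRACE_FILE [COUNT]
slowest_trace_entries() {
    trace_file=$1
    count=${2:-10}
    # Prefix each line with the number that follows "performance:",
    # sort on that number (largest first), then drop the prefix again.
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "performance:") { print $(i + 1) "\t" $0; break }
    }' "$trace_file" | sort -rn | head -n "$count" | cut -f2-
}
```

To use it, the trace can be written to a file by setting the variable to an absolute path, for example `GIT_TRACE_PERFORMANCE=/tmp/git-perf.log git status` (the path is just an example), followed by `slowest_trace_entries /tmp/git-perf.log 5`.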

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="ainohzoa"
avatar="http://cdn.libravatar.org/avatar/0d6a2dbd95f6c4f410cc41d32beaebe9"
subject="Now it's working normally again"
date="2022-01-14T21:10:13Z"
content="""
After running a `git status` today (the first time running it was slow as usual) the problem seems to have gone away. Perhaps it just needed to rebuild some kind of cache or something. I believe that I had waited for a `git status` command to completely finish before, so maybe it takes several tries to resolve this performance problem? Or possibly I'm misremembering and I never actually waited for the `git status` command to fully finish. I'm baffled by this, but happy that it's back to normal. I'll post an update if the problem comes back.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 6"
date="2022-01-13T18:19:31Z"
content="""
The main problem is that the git developers stubbornly refuse to support nested git repos, so it's impossible for git-annex to support them. However, I found a good workaround, see [[tips/using_nested_git_repositories/]].
"""]]

@ -0,0 +1,7 @@
Git does not support nested git repositories, and so neither does git-annex. However, here is a good workaround that I found:
Rename the `.git` directory of the nested repo to `dotgit` (or similar), `git annex add` it and then create a symbolic link from `.git` to `dotgit`. It's important that the link is created only after the nested repo has been `git annex add`'ed. Also, the link needs to be created manually on each clone. Finally you'll need to hide the `dotgit` directory from the nested repo itself by adding `/dotgit` to `dotgit/info/exclude`.
    mv nested/.git nested/dotgit; echo "/dotgit" >>nested/dotgit/info/exclude
    git annex add nested; git commit -m "add nested"
    cd nested; ln -s dotgit .git # needs to be done on every clone
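
Since the link has to be recreated on every clone, the last step can be scripted. This is a minimal sketch, not part of git-annex: `relink_nested_repos` is a hypothetical helper name, and it assumes every nested repo follows the `dotgit` naming used above. The nested repo is presumably only usable once the annexed contents of `dotgit` are present in the clone.

```shell
#!/bin/sh
# Sketch: recreate the .git -> dotgit symlinks in a fresh clone.
# relink_nested_repos is a hypothetical helper, not a git-annex command.
# Usage: relink_nested_repos [ROOT]   (ROOT defaults to the current directory)
relink_nested_repos() {
    root=${1:-.}
    # Skip anything inside real .git directories; report each dotgit
    # directory once without descending into it.
    find "$root" -name .git -prune -o -type d -name dotgit -prune -print |
    while IFS= read -r dotgit_dir; do
        parent=$(dirname "$dotgit_dir")
        # Only create the link if nothing named .git exists there yet.
        if [ ! -e "$parent/.git" ] && [ ! -L "$parent/.git" ]; then
            ln -s dotgit "$parent/.git"
            echo "linked $parent/.git -> dotgit"
        fi
    done
}
```

Run from the top of the outer repository as `relink_nested_repos .`; running it again is harmless, because existing links are left alone.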

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="https://christian.amsuess.com/chrysn"
nickname="chrysn"
avatar="http://christian.amsuess.com/avatar/c6c0d57d63ac88f3541522c4b21198c3c7169a665a2f2d733b4f78670322ffdc"
subject="nested git repositories are git submodules"
date="2022-01-14T13:02:36Z"
content="""
Using nested git repositories is perfectly possible; if they are checked in they are called submodules, otherwise they just sit there unadded.
Apart from some [[odd quirks you never run into in normal operation|bugs/creating dot-git-as-symlink workaround drops worktree configuration from submodules]], submodules also work fine with git-annex.
"""]]