From e37dddbacbbd531fddfc612cad97c1730a22effd Mon Sep 17 00:00:00 2001 From: yarikoptic Date: Fri, 13 May 2022 15:12:29 +0000 Subject: [PATCH 1/4] Added a comment --- .../comment_6_a383c7303f1e942a830d9f730f1c0f00._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_6_a383c7303f1e942a830d9f730f1c0f00._comment diff --git a/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_6_a383c7303f1e942a830d9f730f1c0f00._comment b/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_6_a383c7303f1e942a830d9f730f1c0f00._comment new file mode 100644 index 0000000000..00a24e2cf8 --- /dev/null +++ b/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_6_a383c7303f1e942a830d9f730f1c0f00._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="comment 6" + date="2022-05-13T15:12:29Z" + content=""" +I guess fsck could just lock the entire repo for its duration forbidding any operation? I would love to be able to migrate the layout also on \"older\" versions of repo/annex without upgrading all the way to 10. Meanwhile I think I am doomed to write a little helper to do those renames (once, and hopefully never ever again... 
I might even protect myself by making those top xxx known to me now non-writable at the level of ACL, so the attempt to migrate would lead to an error) +"""]] From 507989130226f7d226f29d691411947d9d281ed6 Mon Sep 17 00:00:00 2001 From: yarikoptic Date: Fri, 13 May 2022 16:51:23 +0000 Subject: [PATCH 2/4] Added a comment: shell helper --- .../comment_7_e2d45ac485456ed9cb86f8e6bd23617c._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_7_e2d45ac485456ed9cb86f8e6bd23617c._comment diff --git a/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_7_e2d45ac485456ed9cb86f8e6bd23617c._comment b/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_7_e2d45ac485456ed9cb86f8e6bd23617c._comment new file mode 100644 index 0000000000..e9eb6decf1 --- /dev/null +++ b/doc/todo/command_to___34__migrate__34___from_adjusted_mode/comment_7_e2d45ac485456ed9cb86f8e6bd23617c._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="shell helper" + date="2022-05-13T16:51:22Z" + content=""" +FWIW made this shell helper to migrate all keys into desired layout: [https://raw.githubusercontent.com/datalad/datalad/maint/tools/convert-git-annex-layout](https://raw.githubusercontent.com/datalad/datalad/maint/tools/convert-git-annex-layout) +"""]] From 6b4a0fa74c02caef3419245e00e4b98942b9ed1d Mon Sep 17 00:00:00 2001 From: wzhd Date: Sat, 14 May 2022 03:10:18 +0000 Subject: [PATCH 3/4] Added a comment: Using fuse --- ...ment_11_e3f82edf70d28f00d30cfd98f189cd86._comment | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 doc/forum/__34__du__34___equivalent_on_an_annex__63__/comment_11_e3f82edf70d28f00d30cfd98f189cd86._comment diff --git a/doc/forum/__34__du__34___equivalent_on_an_annex__63__/comment_11_e3f82edf70d28f00d30cfd98f189cd86._comment 
b/doc/forum/__34__du__34___equivalent_on_an_annex__63__/comment_11_e3f82edf70d28f00d30cfd98f189cd86._comment new file mode 100644 index 0000000000..1c79810089 --- /dev/null +++ b/doc/forum/__34__du__34___equivalent_on_an_annex__63__/comment_11_e3f82edf70d28f00d30cfd98f189cd86._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="wzhd" + avatar="http://cdn.libravatar.org/avatar/1795a91af84f4243a3bf0974bc8d79fe" + subject="Using fuse" + date="2022-05-14T03:10:18Z" + content=""" +Wrote a bare-minimum [fuse fs](https://codeberg.org/wzhd/annexize/) so that du-like utilities such as ncdu, gt5, and gdu can be used. + +It reads each symlink target, tries to get a number after `SHA256E-s`, and pretends it's a regular file with that size. `git-annex add`ed files don't need to be locally available. + +Files can be deleted, but no other operations are implemented. +"""]] From 514f50e5beac2d69b1145aac2e6e2df1bda31028 Mon Sep 17 00:00:00 2001 From: "nick.guenther@e418ed3c763dff37995c2ed5da4232a7c6cee0a9" Date: Sun, 15 May 2022 21:30:50 +0000 Subject: [PATCH 4/4] Added a comment: How to disable lockdown in bare repos? --- ..._caf6d5318703d188a2135737093d8323._comment | 421 ++++++++++++++++++ 1 file changed, 421 insertions(+) create mode 100644 doc/internals/lockdown/comment_3_caf6d5318703d188a2135737093d8323._comment diff --git a/doc/internals/lockdown/comment_3_caf6d5318703d188a2135737093d8323._comment b/doc/internals/lockdown/comment_3_caf6d5318703d188a2135737093d8323._comment new file mode 100644 index 0000000000..7ddbca9a40 --- /dev/null +++ b/doc/internals/lockdown/comment_3_caf6d5318703d188a2135737093d8323._comment @@ -0,0 +1,421 @@ +[[!comment format=mdwn + username="nick.guenther@e418ed3c763dff37995c2ed5da4232a7c6cee0a9" + nickname="nick.guenther" + avatar="http://cdn.libravatar.org/avatar/9e85c6ca61c3f877fef4f91c2bf6e278" + subject="How to disable lockdown in bare repos?" 
+ date="2022-05-15T21:30:50Z" + content=""" +I've set up a project server for my team with annexes in most repos. I'm using [gitolite](https://gitolite.com) with its [git-annex-shell](https://github.com/sitaramc/gitolite/blob/master/src/commands/git-annex-shell) plugin. It's been going well for a year, and my team finds git-annex very useful for managing our large projects, so we have a large debt to you for that :) + +## Problem + +But when my users delete repos, the repos aren't fully deleted because any `annex/objects/*/*/SHA256-*/SHA256-*` file is locked down. + + +### Gitolite + +For example: + +
test-gitea-annex-write + +``` + +test-gitea-annex-write() { +REPO=$1; shift + + +(set -e; cd \"$(mktemp -d)\" + git init + echo '# testing' > README.md && git add README.md && git commit -m \"Initial commit\" + git annex init + dd if=/dev/urandom of=large.bin bs=1M count=16 && git annex add large.bin && git commit -m \"Annex a file\" + + git remote add origin \"$REPO\" + git config annex.jobs 1 + git annex sync --content origin + git annex sync --content origin # a single sync only uploads the branch, not the content, so sync twice +) +} +``` + +
+ +
Create+Upload: test-gitea-annex-write git@data:datasets/test-jank.git + +``` +$ test-gitea-annex-write git@data:datasets/test-jank.git +Initialized empty Git repository in /tmp/tmp.ak87a0yp1e/.git/ +[master (root-commit) 0a7be36] Initial commit + 1 file changed, 1 insertion(+) + create mode 100644 README.md +init ok +(recording state in git...) +16+0 records in +16+0 records out +16777216 bytes (17 MB, 16 MiB) copied, 0.0608327 s, 276 MB/s +add large.bin +ok +(recording state in git...) +[master 4a55ea5] Annex a file + 1 file changed, 1 insertion(+) + create mode 120000 large.bin + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. + + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +On branch master +nothing to commit, working tree clean +commit ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +ok +push origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. 
This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +hint: Using 'master' as the name for the initial branch. This default branch name +hint: is subject to change. To configure the initial branch name to use in all +hint: of your new repositories, which will suppress this warning, call: +hint: +hint: git config --global init.defaultBranch +hint: +hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and +hint: 'development'. The just-created branch can be renamed via this command: +hint: +hint: git branch -m +Initialized empty Git repository in /srv/git/repositories/datasets/test-jank.git/ +ok + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +On branch master +nothing to commit, working tree clean +commit ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +ok +copy large.bin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +(to origin...) ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +ok +(recording state in git...) +push origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +ok +``` + +
+ +
Delete the repo + +``` +$ ssh git@data D unlock datasets/test-jank +'datasets/test-jank' is now unlocked +$ ssh git@data D rm datasets/test-jank +rm: cannot remove 'datasets/test-jank.git/annex/objects/968/4c0/SHA256E-s16777216--9d8ccb3ebe399a8f6801cde009e03a867151ea4e4bc609848abbd29dd335688f.bin/SHA256E-s16777216--9d8ccb3ebe399a8f6801cde009e03a867151ea4e4bc609848abbd29dd335688f.bin': Permission denied +'datasets/test-jank' is now gone! +``` + +
+ +Notice the \"Permission denied\" error -- but gitolite *thinks* its work is done: + +``` +$ ssh git@data info | grep test-jank +$ +``` + +but if I try to recreate the same repo, it fails: + +
test-gitea-annex-write git@data:datasets/test-jank.git + +``` +$ test-gitea-annex-write git@data:datasets/test-jank.git +Initialized empty Git repository in /tmp/tmp.IBSNFKRgRg/.git/ +[master (root-commit) e11344b] Initial commit + 1 file changed, 1 insertion(+) + create mode 100644 README.md +init ok +(recording state in git...) +16+0 records in +16+0 records out +16777216 bytes (17 MB, 16 MiB) copied, 0.0631523 s, 266 MB/s +add large.bin +ok +(recording state in git...) +[master 0cedc5c] Annex a file + 1 file changed, 1 insertion(+) + create mode 120000 large.bin + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: R any datasets/test-jank nguenther DENIED by fallthru +(or you mis-spelled the reponame) +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: R any datasets/test-jank nguenther DENIED by fallthru +(or you mis-spelled the reponame) +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +On branch master +nothing to commit, working tree clean +commit ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +FATAL: R any datasets/test-jank nguenther DENIED by fallthru +(or you mis-spelled the reponame) +fatal: Could not read from remote repository. 
+ +Please make sure you have the correct access rights +and the repository exists. +ok +push origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +FATAL: W any datasets/test-jank nguenther DENIED by fallthru +(or you mis-spelled the reponame) +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +FATAL: W any datasets/test-jank nguenther DENIED by fallthru +(or you mis-spelled the reponame) +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. + + Pushing to origin failed. +failed +sync: 1 failed +``` + +
+ +Because of course if I log in on the server I can see: + +``` +git@data:~/repositories/datasets$ tree test-jank.git/ +test-jank.git/ +└── annex + └── objects + └── 968 + └── 4c0 + └── SHA256E-s16777216--9d8ccb3ebe399a8f6801cde009e03a867151ea4e4bc609848abbd29dd335688f.bin + └── SHA256E-s16777216--9d8ccb3ebe399a8f6801cde009e03a867151ea4e4bc609848abbd29dd335688f.bin + +5 directories, 1 file +``` + + +### Gitea + +I've been [porting `git-annex-shell`](https://github.com/neuropoly/gitea/pull/1/) into [Gitea](https://gitea.io/) as well to get a more familiar UI for my team, and I have discovered the exact same problem there: if I run `test-gitea-annex-write` against my test instance, then delete that repo, Gitea dutifully reports + +> The repository has been deleted. + +but if I then try to recreate it, it balks with + +> Files already exist for this repository. Either adopt them or delete them. + +## Solutions + +There doesn't seem to be much benefit to lockdown in a bare repo: there's no checkout that might corrupt the content. Plus, in my case gitolite/gitea is in the way, which is an extra layer of protection against direct modification. So could **lockdown be turned off**? + +I'd like it best if you detected when you're run in a bare repo and skipped the freezing and thawing steps. But I could also work with a config setting (`git config --global annex.lockdown false`?). + +#### Workaround #1 + +In the meantime, I have found one workaround: I can misuse [`annex.freezecontent-command`](https://git-annex.branchable.com/todo/lockdown_hooks/): + +``` +ssh root@data +su -l git +git config --global annex.freezecontent-command \"chmod -R +w %path\" +``` + +
Example + +``` +$ test-gitea-annex-write git@data:datasets/test-jank3 +Initialized empty Git repository in /tmp/tmp.I9wno7oZXg/.git/ +[master (root-commit) 4f47840] Initial commit + 1 file changed, 1 insertion(+) + create mode 100644 README.md +init ok +(recording state in git...) +16+0 records in +16+0 records out +16388608 bytes (16.2 MB, 16.0 MiB) copied, 0.0313028 s, 268 MB/s +add large.bin +ok +(recording state in git...) +[master 27e4c8c] Annex a file + 1 file changed, 1 insertion(+) + create mode 120000 large.bin + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. + + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + annex.sshcaching is not set to true + + Unable to parse git config from origin +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +On branch master +nothing to commit, working tree clean +commit ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +FATAL: autocreate denied + +fatal: Could not read from remote repository. + +Please make sure you have the correct access rights +and the repository exists. +ok +push origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. 
+ + annex.sshcaching is not set to true +hint: Using 'master' as the name for the initial branch. This default branch name +hint: is subject to change. To configure the initial branch name to use in all +hint: of your new repositories, which will suppress this warning, call: +hint: +hint: git config --global init.defaultBranch +hint: +hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and +hint: 'development'. The just-created branch can be renamed via this command: +hint: +hint: git branch -m +Initialized empty Git repository in /srv/git/repositories/datasets/test-jank3.git/ +ok + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +On branch master +nothing to commit, working tree clean +commit ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +ok +copy large.bin + + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + annex.sshcaching is not set to true +(to origin...) ok +pull origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. + + annex.sshcaching is not set to true +ok +(recording state in git...) +push origin + You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time. 
+ + annex.sshcaching is not set to true +ok +$ ssh git@data D unlock datasets/test-jank3 +'datasets/test-jank3' is now unlocked +$ ssh git@data D rm datasets/test-jank3 +'datasets/test-jank3' is now gone! +``` + +(notice it doesn't give any error this time) + +
+ +But this only works with a relatively new git-annex; I haven't looked up when this went in, but I know 8.20210223, from barely a year ago, doesn't have this feature, while 10.20220322 does. It's also very much a workaround: it immediately undoes the work git-annex does, which will cause unnecessary disk I/O. + +Here's an upgrade to this workaround that limits the effect to bare repos (though the only repos ever created by the remote `git` user should be bare): + +``` +git config --global annex.freezecontent-command 'sh -c '\"'\"'[ \"$(git config core.bare)\" = \"true\" ] && chmod -R +w %path'\"'\" +``` + + +#### Workaround #2 + +The advice above says: + +> (The only bad consequence of this is that `rm -rf .git` doesn't work unless you first run `chmod -R +w .git`) + +so another solution would be to patch gitolite/gitea's `rm` subroutines to be git-annex aware, i.e. to run `chmod -R +w` before doing anything else. + +That looks more feasible for me to do in gitea, where git-annex support is turning out to need a whole bunch of patches scattered across the codebase, but it's a lot less appealing to do in gitolite, where git-annex support is currently contained in one very elegant file. +"""]]
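
Workaround #2 in the last comment boils down to "restore the write bits, then delete". The sketch below shows that idea in plain POSIX shell. The function name `remove_annexed_repo` is hypothetical and is not part of gitolite, Gitea, or git-annex; how it would be wired into either tool's delete path is left open, and it assumes the caller owns the files in the repository.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an existing
# gitolite/Gitea/git-annex command): delete a bare repository whose
# annex objects have been locked down.
remove_annexed_repo() {
    repo=$1

    # Refuse empty or missing paths so we never chmod or rm something unintended.
    [ -n "$repo" ] && [ -d "$repo" ] || return 1

    # Lockdown removes the write bit from each object and from the directory
    # holding it; put it back so the entries can be unlinked.
    chmod -R +w -- "$repo" || return 1

    rm -rf -- "$repo"
}
```

With the paths from the logs above, a call would look like `remove_annexed_repo /srv/git/repositories/datasets/test-jank.git`. Unlike the `annex.freezecontent-command` trick, this leaves lockdown in effect for the lifetime of the repo and only undoes it at deletion time.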