Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2025-01-07 13:21:49 -04:00
commit 8bc88945e9
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
9 changed files with 301 additions and 78 deletions

@ -0,0 +1,45 @@
### Please describe the problem.
The main repository `git://git-annex.branchable.com/` seems to be offline at the moment.
The mirror `git://git.joeyh.name/git-annex` seems to work.
### What steps will reproduce the problem?
A git clone of the main repo gives an error message "fatal: Could not read from remote repository.".
### What version of git-annex are you using? On what operating system?
.
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
C:\Users\jkniiv\Projektit\git-annex.branchable.com> git clone git://git-annex.branchable.com/ git-annex-clone-TEST
Cloning into 'git-annex-clone-TEST'...
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
C:\Users\jkniiv\Projektit\git-annex.branchable.com> ping git-annex.branchable.com
Pinging git-annex.branchable.com [2600:3c03::f03c:91ff:fedf:c0e5] with 32 bytes of data:
Reply from 2600:3c03::f03c:91ff:fedf:c0e5: time=126ms
Reply from 2600:3c03::f03c:91ff:fedf:c0e5: time=116ms
Reply from 2600:3c03::f03c:91ff:fedf:c0e5: time=118ms
Reply from 2600:3c03::f03c:91ff:fedf:c0e5: time=117ms
Ping statistics for 2600:3c03::f03c:91ff:fedf:c0e5:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 116ms, Maximum = 126ms, Average = 119ms
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Sure. It's always rockin'!
[[!meta author=jkniiv]]

@ -0,0 +1,73 @@
### Please describe the problem.
I think `encfs` is supported in general, but there are some minor glitches (I am submitting one about ssh separately) which seem to complicate use of git-annex under it.
### What steps will reproduce the problem?
Running `git annex test` causes a few tests to FAIL. Once `git-annex` operates "nominally" under encfs, we should add testing under `encfs` to the [datalad/git-annex daily tests](https://github.com/datalad/git-annex/blob/master/.github/workflows/build-ubuntu.yaml#L247).
Sample failure:
```
Tests
Repo Tests v10 locked
Init Tests
init: OK (0.97s)
add: OK (2.44s)
addurl: OK (2.65s)
conflict resolution (mixed locked and unlocked file): OK (9.30s)
version: OK (0.99s)
fix: FAIL (0.11s)
./Test/Framework.hs:86:
git clone failed with unexpected exit code (transcript follows)
fatal: hardlink different from source at 'tmprepo4/.git/objects/4b/825dc642cb6eb9a060e54bf8d69288fbee4904'
Use -p '/fix/' to rerun this test only.
```
and overall the failures seem to be specific to the v10 locked tests:
```
~/proj/CON/utils/bin/show-paths -f full-lines -e FAIL < .duct/logs/2025.01.04T19.33.23-680497_stdout
1350 Tests
1351 Repo Tests v10 locked
1358: fix: FAIL (0.11s)
1403 Tests
1404 Repo Tests v10 locked
1411: partial commit: FAIL (0.06s)
1417: reinject: FAIL (0.06s)
1449 Tests
1450 Repo Tests v10 locked
1457: edit (no pre-commit): FAIL (0.02s)
1463: magic: FAIL (0.03s)
```
### What version of git-annex are you using? On what operating system?
```shell
git annex version
git-annex version: 10.20241031
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Servant Benchmark Feeds Testsuite S3 WebDAV
dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.34 DAV-1.3.4 feed-1.3.2.1 ghc-9.6.6 http-client-0.7.17 persistent-sqlite-2.13.3.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL GITBUNDLE GITMANIFEST VURL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external
operating system: linux x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
```
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 2"
date="2025-01-07T01:06:57Z"
content="""
The 2nd demonstration in [add_config_var_preventing_adjusted_branch_mode/#comment-...](https://git-annex.branchable.com/projects/datalad/bugs-done/add_config_var_preventing_adjusted_branch_mode/#comment-d1335f67352cc698862464515363f061) shows another scenario with the undesired effect of failing to freeze/thaw: git-annex just switches to adjusted branch mode. In that case there is likely no harm done yet that would need unwinding, so it would have been better to error out altogether.
"""]]

@ -1,78 +0,0 @@
[[!comment format=mdwn
username="jkniiv"
avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
subject="comment 3"
date="2025-01-06T08:54:05Z"
content="""
> AFAIK, git-remote-annex is not installed on windows. I assume you set up the link to git-annex yourself.
That's right. Knowing that some of the functionality of git-remote-annex was already present I made
a symlink in a directory in my `$env:PATH` with `cmd /c mklink git-remote-annex ..\path-to\git-annex.exe`
(I have developer mode active in Windows settings) and that made git-remote-annex available to me/git-annex.
> This is a puzzling problem to me. I don't know anything about windows readonly attributes. But I don't think git-annex would ever set them.
>
>Indeed, it never freezes content on windows at all. That can be seen in the debug output you posted, where it does say it's \"thawing content\", but never \"freezing content\".
That is puzzling. Could the thawing do the wrong/opposite thing in some cases? Also, remember
that anything that does `chmod u-w` by way of Git Bash's own shell (sh.exe/bash.exe), e.g. via `sh -c 'chmod u-w file'`, will effect
a Windows readonly attribute on that file (thanks to MSYS2 behind the scenes):
```
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> touch ddd
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> ls
Directory: e:\git-annex-tests\test-git-remote-annex\annex-c
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 1.6.2024 23:28 5 a-1
-a--- 1.6.2024 23:28 5 b-2
-a--- 1.6.2024 23:28 7 c-3
-a--- 6.1.2025 10:31 0 ddd
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> sh -c 'chmod u-w ddd'
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> ls
Directory: e:\git-annex-tests\test-git-remote-annex\annex-c
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 1.6.2024 23:28 5 a-1
-a--- 1.6.2024 23:28 5 b-2
-a--- 1.6.2024 23:28 7 c-3
-ar-- 6.1.2025 10:31 0 ddd
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> rm ddd
Remove-Item: You do not have sufficient access rights to perform this operation or the item is hidden, system, or read only.
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> sh -c 'type chmod'
chmod is /usr/bin/chmod
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> sh -c 'type mount'
mount is /usr/bin/mount
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> sh -c mount
C:/scoop/apps/git/2.47.1 on / type ntfs (binary,noacl,auto)
C:/scoop/apps/git/2.47.1/usr/bin on /bin type ntfs (binary,noacl,auto)
C:/Users/jkniiv/AppData/Local/Temp on /tmp type ntfs (binary,noacl,posix=0,usertemp)
C: on /c type ntfs (binary,noacl,posix=0,user,noumount,auto)
D: on /d type ntfs (binary,noacl,posix=0,user,noumount,auto)
E: on /e type ntfs (binary,noacl,posix=0,user,noumount,auto)
```
But I guess git-annex isn't calling out to external `chmod` in these cases.
>
>If you had somehow configured a freeze hook that set the readonly attribute, it would run it on windows. I suppose you would have thought to mention if that was the case though.
Nope. No freeze/thaw hooks set. I'm neither that brave nor cognizant of how to script them in a foolproof manner. :)
```
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git config get annex.freezecontent-command
e:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git config get annex.thawcontent-command
```
>
>Also rather puzzling is that this is a temp object file, and not a .git/annex/objects/ file. So the failure is apparently happening in the middle of downloading the GITBUNDLE object, before it gets moved to that location. But the same code is run at that point as by any download of any git-annex object.
"""]]

@ -0,0 +1,129 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 5"
date="2025-01-07T01:03:24Z"
content="""
Well, `git-annex` calls `init` upon the initial `get` if the repository was not `init`ed before.
In our use case I think the user cloned the repository and then invoked `git annex get` within a container environment which either did not have access to the original `~/.gitconfig` or just lacked those thaw/freeze scripts. The result is the same: git-annex does not care if the configured scripts fail to execute, and plows forward, switching to adjusted branch mode instead of erroring out.
<details>
<summary>Execution where `~/.gitconfig` is not bound at all: git-annex switches since there is no global configuration for thaw/freeze. A global variable preventing the switch to adjusted mode would also have been of no help.</summary>
```
[d31548v@discovery-01 tmp]$ git config annex.thawcontent-command
/dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content %path
[d31548v@discovery-01 tmp]$ singularity exec -B $PWD -c --cleanenv /dartfs/rc/lab/D/DBIC/DBIC/archive/containers/images/nipy/nipy-heudiconv--1.3.2.sing git config annex.thawcontent-command
[d31548v@discovery-01 tmp]$ ls
acl-with-separate-fd-aces facl hello.txt now testdir yohdir
[d31548v@discovery-01 tmp]$ git clone https://github.com/dandisets/000027
Cloning into '000027'...
remote: Enumerating objects: 198, done.
remote: Counting objects: 100% (198/198), done.
remote: Compressing objects: 100% (121/121), done.
remote: Total 198 (delta 79), reused 171 (delta 52), pack-reused 0 (from 0)
Receiving objects: 100% (198/198), 24.36 KiB | 1.06 MiB/s, done.
Resolving deltas: 100% (79/79), done.
[d31548v@discovery-01 tmp]$ singularity exec -B $PWD -c --cleanenv /dartfs/rc/lab/D/DBIC/DBIC/archive/containers/images/nipy/nipy-heudiconv--1.3.2.sing git -C $PWD/000027 annex get sub-RAT123/sub-RAT123.nwb
Filesystem does not allow removing write bit from files.
Detected a crippled filesystem.
Disabling core.symlinks.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/draft(unlocked)'
hint: The '.git/hooks/post-checkout' hook was ignored because it's not set as executable.
hint: You can disable this warning with `git config advice.ignoredHook false`.
Remote origin not usable by git-annex; setting annex-ignore
https://github.com/dandisets/000027/config download failed: Not Found
get sub-RAT123/sub-RAT123.nwb (from web...)
ok
(recording state in git...)
```
</details>
<details>
<summary>And an execution where I do bind `~/.gitconfig` but the scripts themselves are not available. Here, if git-annex had just failed, the user might have had a better chance to understand the issue and would not have ended up in adjusted branch mode.</summary>
```shell
Cloning into '000027'...
remote: Enumerating objects: 198, done.
remote: Counting objects: 100% (198/198), done.
remote: Compressing objects: 100% (121/121), done.
remote: Total 198 (delta 79), reused 171 (delta 52), pack-reused 0 (from 0)
Receiving objects: 100% (198/198), 24.36 KiB | 891.00 KiB/s, done.
Resolving deltas: 100% (79/79), done.
[d31548v@discovery-01 tmp]$ singularity exec -B $PWD -B $HOME/.gitconfig -c --cleanenv /dartfs/rc/lab/D/DBIC/DBIC/archive/containers/images/nipy/nipy-heudiconv--1.3.2.sing git -C $PWD/000027 annex get sub-RAT123/sub-RAT123.nwb
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/freeze-content: not found
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content: not found
Filesystem allows writing to files whose write bit is not set.
Detected a crippled filesystem.
Disabling core.symlinks.
Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
Switched to branch 'adjusted/draft(unlocked)'
hint: The '.git/hooks/post-checkout' hook was ignored because it's not set as executable.
hint: You can disable this warning with `git config advice.ignoredHook false`.
Remote origin not usable by git-annex; setting annex-ignore
https://github.com/dandisets/000027/config download failed: Not Found
get sub-RAT123/sub-RAT123.nwb (from web...)
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content: not found
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/freeze-content: not found
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content: not found
/usr/lib/git-annex.linux/shimmed/sh/sh: 1: /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/freeze-content: not found
ok
(recording state in git...)
```
</details>
So, indeed, a config variable would likely have been of less value than git-annex simply erroring out as soon as the configured scripts failed to execute because they are not available.
*Might be worth a separate issue*: another \"bad\" thing happens when `git annex init` is run with freeze/thaw available, so it does not switch to `adjusted` branch mode, but the user later invokes `annex get` without the scripts being configured: files silently become unprotected and writable.
<details>
<summary>demonstration</summary>
```shell
[d31548v@discovery-01 tmp]$ git clone https://github.com/dandisets/000027
Cloning into '000027'...
remote: Enumerating objects: 198, done.
remote: Counting objects: 100% (198/198), done.
remote: Compressing objects: 100% (121/121), done.
remote: Total 198 (delta 79), reused 171 (delta 52), pack-reused 0 (from 0)
Receiving objects: 100% (198/198), 24.36 KiB | 831.00 KiB/s, done.
Resolving deltas: 100% (79/79), done.
[d31548v@discovery-01 tmp]$ git -C 000027 annex init
init
Remote origin not usable by git-annex; setting annex-ignore
https://github.com/dandisets/000027/config download failed: Not Found
ok
(recording state in git...)
[d31548v@discovery-01 tmp]$ singularity exec -B $PWD -c --cleanenv /dartfs/rc/lab/D/DBIC/DBIC/archive/containers/images/nipy/nipy-heudiconv--1.3.2.sing git -C $PWD/000027 annex get sub-RAT123/sub-RAT123.nwb
get sub-RAT123/sub-RAT123.nwb (from web...)
ok
(recording state in git...)
[d31548v@discovery-01 tmp]$ ls -lL 000027/sub-RAT123/sub-RAT123.nwb
-rw-rw----+ 1 d31548v rc-DBIC 18792 Jan 6 19:54 000027/sub-RAT123/sub-RAT123.nwb
[d31548v@discovery-01 tmp]$ echo 123 >> 000027/sub-RAT123/sub-RAT123.nwb
```
</details>
Hence, overall, I suggest that there should be some repository `.git/config` setting which would be set \"permanently\" upon `git annex init` to reliably enforce consistent use of the same thaw/freeze logic, whether built-in or via a specific tandem of freeze/thaw scripts (potentially related issue: [on relative paths for configs](https://git-annex.branchable.com/todo/specify_freeze__47__thaw_scripts_relative_to_topdir/?updated)).
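
Until something like that exists, a preflight check run before git-annex could approximate the fail-early behavior from the outside. The following is only a sketch under stated assumptions: `hook_available` and `check_annex_hooks` are hypothetical helper names, not part of git-annex, and only the first word of each configured command is checked.

```shell
# Hypothetical helper: succeed only if the program named by the first
# word of a hook command string can be found and executed.
hook_available() {
    prog=${1%% *}
    [ -n "$prog" ] && command -v "$prog" >/dev/null 2>&1
}

# Hypothetical preflight: check the freeze/thaw commands configured for
# the repository in $1 (default: current directory) and fail loudly if
# one of them points at a program that is not available here.
check_annex_hooks() {
    repo=${1:-.}
    for key in annex.freezecontent-command annex.thawcontent-command; do
        # An unset key means there is nothing to check for it.
        cmd=$(git -C "$repo" config --get "$key") || continue
        if ! hook_available "$cmd"; then
            echo "missing program for $key: $cmd" >&2
            return 1
        fi
    done
}
```

Inside a container this could be chained as `check_annex_hooks "$PWD/000027" && git -C "$PWD/000027" annex get sub-RAT123/sub-RAT123.nwb`, so that a missing script stops the run before git-annex gets a chance to switch branches. It does not help when the configuration itself is not visible in the container.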
"""]]

@ -0,0 +1,5 @@
The idea stems from discussions/problems with using freeze/thaw hooks, and in particular the [line of thinking in the comment on specify_freeze__47__thaw_scripts_relative_to_topdir](https://git-annex.branchable.com/todo/specify_freeze__47__thaw_scripts_relative_to_topdir/#comment-c71b25bbd0e3f018e07812965bd6a5b1). ATM `git-annex` analyzes whether a repository needs any special handling (adjusted branch, pidlock, etc.) during `annex init` and otherwise does not bother. It would make sense to let a user similarly
- test e.g. whether custom freeze/thaw scripts are needed (before git-annex even decides to switch to adjusted branch mode) and set up that repo accordingly, so that git-annex proceeds without flipping out into adjusted branch mode
- maybe provide improved/custom pidlock detection (on one of my servers I remember needing to just hardcode the use of pidlock in `~/.gitconfig`, although that was relevant only for some paths)
- similarly do some other testing which could allow or disallow some git-annex decision, e.g. use of an adjusted unlocked branch
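
As an illustration of the first item, such a user-provided test might probe whether the filesystem honors a cleared write bit. This is a minimal sketch, not git-annex's own probe: the function name `probe_write_bit` and the probe file name are made up for the example, and it covers only the case where writes still succeed after `chmod u-w`.

```shell
# Hypothetical probe: report whether directory $1 honors a cleared write bit.
# Prints "honored" if writing to a read-only file fails, "ignored" if the
# write succeeds anyway, or "error" if the probe file cannot be created.
# Note: when run as root the write normally succeeds, so "ignored" is
# reported even on a well-behaved filesystem.
probe_write_bit() {
    dir=${1:-.}
    probe=$(mktemp "$dir/annex-probe.XXXXXX") || { echo error; return 1; }
    chmod u-w "$probe"
    # Write from a subshell so a failed redirection cannot abort the caller.
    if (echo test >> "$probe") 2>/dev/null; then
        result=ignored
    else
        result=honored
    fi
    # Restore the write bit so the probe file can be removed everywhere.
    chmod u+w "$probe"
    rm -f "$probe"
    echo "$result"
}
```

An init-time hook could call `probe_write_bit .git` and configure `annex.freezecontent-command` and `annex.thawcontent-command` only when the answer is `ignored`.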

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 1"
date="2025-01-07T16:09:31Z"
content="""
I even wonder if there could be some easy way to set up an overall hook for `git-annex` invocation within a repo (hence within the repo's `.git/`), e.g. to safeguard invocations of git-annex and prevent use within container environments (we often run into various problems). So it smells to me like developing some kind of `.git/annex/hooks/` support analogous to `.git/hooks`. Then whatever a global `annex init` hook would set up for a repo within its `.git/annex/hooks` would be in effect for subsequent `annex` invocations within containers or natively.
Another desired use case could be to prevent invocation of a git-annex with defects or absent features known to be critical for that deployment (e.g. absent support for freeze/thaw scripts or some other recent feature).
"""]]

@ -0,0 +1,16 @@
[[!comment format=mdwn
username="Doable8234"
avatar="http://cdn.libravatar.org/avatar/b0d5fea745f92c3b8cc8ecc3dafa6278"
subject="comment 6"
date="2025-01-07T02:11:33Z"
content="""
Joey, I recently came across this same use case. There are some intermediate files I store safely in the cloud using git annex, and I want to fetch them.
Doing a `git annex get` and a drop seems like the wrong solution. Why am I unnecessarily adding risk when I know I don't care about whether the file currently exists in my repo? I then have to think about various cases like if I already had the file in my repo or not and be very careful. I can't just do a `git annex get; cat; git annex drop`.
I could use a pull-only-clone of my git annex repo, but that comes with many issues and usage hassles like reconfiguring everything. On top of this, I'd sometimes need to do a `git annex drop --force` in my clones since they may not have access to everything that the main repo does which is even more scary.
Your concerns [here](http://git-annex.branchable.com/todo/git-annex-cat/#comment-8ca717fcdeadb1c2413da1f82d3659c6) make sense to me. However, streaming vs downloading is just an optimization. I'm HAPPY to pay the performance cost which is much better than the safety cost I'm currently facing with my hacky solutions to this problem. All we need (from my meager understanding of git annex internals) is to have the `git annex cat` command download the contents on to a temporary file (in the literal `/tmp` directory) instead of the `annex/objects` directory, and then `cat` that at the end. That's pretty much all I (we?) am asking for.
I do know that you like to do things perfectly and I'm sure there'll be lots of issues with the proposal here that you can see and the rest of us cannot. But that's true of solutions too. Really really hoping you can figure out a solution for this. I'm happy to try and help with the code changes too if that helps. I have never used Haskell before but am very happy to take on that challenge if we can settle on a design.
"""]]

@ -0,0 +1,15 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="may be %dotgit?"
date="2025-01-06T23:38:39Z"
content="""
Original line of my thought was expressed in [this issue on github](https://github.com/dbic/handbook/issues/27).
One of the recent cases which made git-annex \"flip out\" into adjusted branch mode (I have yet to try to reproduce it and follow up on [add_config_var_preventing_adjusted_branch_mode](https://git-annex.branchable.com/projects/datalad/bugs-done/add_config_var_preventing_adjusted_branch_mode/)) happened when a user executed datalad with git-annex inside a singularity container.
To facilitate reproducibility etc., we aim to minimize the effects of outside elements on execution within the container, so we bind mount only the current dataset and transfer only some [git / git-annex settings](https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L88).
We could also check the paths of those scripts and bind mount them too. Also, if relying on PATH, we would somehow need to ensure that PATH inside the container points to them too (it might be overridden by the container's startup script, since after all the outside PATH might have little to do with the inside one; think about running a docker container on OSX).
I think it would have been clean(er) if some initial invocation of the current global git-annex freeze/thaw script, which would potentially determine whether it is needed at all (since some partitions might not need it, some need one kind, others another), instantiated a copy of the specific freeze/thaw script tandem in the given repository. But the inability to specify a relative path hinders that. Maybe, similarly to `%path`, there could be some `%dotgit` or similar variable pointing to the location of the `.git` folder, with our \"freeze/thaw\" installation script populating values like `thawcontent-command = %dotgit/bin-annex/thaw-content %path`? I guess git-annex could also simply treat a leading `./` as a signal of being relative to the `.git/` folder. Such a substitution would need to be done once per repo upon reading that config setting; there is no need to sense whether the script is there or not, since if it is not, it would be better to error out instead of proceeding with the \"default\" behavior (which seems to be \"switch to adjusted branch\").
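
To make the proposed substitution concrete, here is a rough sketch of what expanding such a token could look like. The `%dotgit` token is only the proposal from this comment and the helper name `expand_dotgit` is made up; git-annex does not currently do this.

```shell
# Hypothetical helper: replace every %dotgit token in a hook command
# string ($1) with the path of the .git directory ($2). Other tokens
# such as %path are left untouched for git-annex to fill in later.
expand_dotgit() {
    cmd=$1
    gitdir=$2
    out=
    while :; do
        case $cmd in
            *%dotgit*)
                # Keep the text before the first token, append the path,
                # then continue scanning the text after the token.
                out=$out${cmd%%"%dotgit"*}$gitdir
                cmd=${cmd#*"%dotgit"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$out$cmd"
}
```

For example, `expand_dotgit '%dotgit/bin-annex/thaw-content %path' /repo/.git` prints `/repo/.git/bin-annex/thaw-content %path`. In a real repository the second argument could come from `git rev-parse --absolute-git-dir`.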
"""]]