From a055cb76caf22f9a5d3e4847bf021a28b6d6e7f0 Mon Sep 17 00:00:00 2001 From: "kolam@976e5fa601b60de70b53dad291714218fd749169" Date: Sat, 2 Dec 2023 18:16:00 +0000 Subject: [PATCH 1/8] --- ...__t_access_file_from_secondary_client.mdwn | 156 ++++++++++++++++++ 1 file changed, 156 insertions(+) create mode 100644 doc/forum/Can__39__t_access_file_from_secondary_client.mdwn diff --git a/doc/forum/Can__39__t_access_file_from_secondary_client.mdwn b/doc/forum/Can__39__t_access_file_from_secondary_client.mdwn new file mode 100644 index 0000000000..a7092c2b90 --- /dev/null +++ b/doc/forum/Can__39__t_access_file_from_secondary_client.mdwn @@ -0,0 +1,156 @@ +I'm trying to set up git-annex for syncing two clients using a transfer repository. All of that without the webapp UI. + +Here's the reproducible scenario with a bash script: + +```bash +#!/usr/bin/env bash + +# Just a way to access the script's directory +cd "$(dirname "$0")" +DIR="$(pwd)" + +# Create the 1st client repository +mkdir $DIR/client1 +cd $DIR/client1 +git init && git annex init + +# Create the 2nd client repository +mkdir $DIR/client2 +cd $DIR/client2 +git init && git annex init + +# Create the transfer repository +mkdir $DIR/share +cd $DIR/share +git init && git annex init + +# Set up the remotes and groups for the transfer repository +cd $DIR/share +git remote add client1 $DIR/client1 +git remote add client2 $DIR/client1 +git annex group . transfer +git annex group client1 client +git annex group client2 client +git co -b main + +# Set up the remotes and groups for the 1st client repository. +cd $DIR/client1 +git remote add share $DIR/share +git annex group . client +git annex group share transfer +git co -b main + +# Set up the remotes and groups for the 2nd client repository. +cd $DIR/client2 +git remote add share $DIR/share +git annex group . 
client +git annex group share transfer +git co -b main + +# Run git-annex assistant for each repository +cd $DIR/client1 && git annex assistant +cd $DIR/client2 && git annex assistant +cd $DIR/share && git annex assistant + +# Add a single file to the 1st client. +cd $DIR/client1 +echo "My first file" >> file.txt +``` + +Result: + +client1: I see the auto-commit has been added for file.txt + +share: I get the following daemon logs: + +``` +(scanning...) (started...) +From /home/xxx/git-annex-scenarios/share-between-clients/client1 + * [new branch] git-annex -> client2/git-annex +(merging client2/git-annex into git-annex...) +From /home/xxx/git-annex-scenarios/share-between-clients/client1 + * [new branch] git-annex -> client1/git-annex + +merge: refs/remotes/client2/main - not something we can merge + +merge: refs/remotes/client2/synced/main - not something we can merge + +merge: refs/remotes/client1/main - not something we can merge + +merge: refs/remotes/client1/synced/main - not something we can merge +(merging synced/git-annex into git-annex...) +(recording state in git...) + +``` + +client2: I get the following daemon logs: + +``` +From /home/xxx/git-annex-scenarios/share-between-clients/share + * [new branch] git-annex -> share/git-annex +(merging share/git-annex into git-annex...) +(recording state in git...) + +merge: refs/remotes/share/main - not something we can merge + +merge: refs/remotes/share/synced/main - not something we can merge + +``` + +Then, I thought that maybe I needed to do an initial `git pull` for each repository. So I tried adding to the bash script the following lines: + +```bash +# Need to do this if there are no commits in the 'client2' and 'share' repositories. 
+# Or else, I'll get the following logs: +# +# merge: refs/remotes/share/main - not something we can merge +# merge: refs/remotes/share/synced/main - not something we can merge +sleep 3; +cd $DIR/share +git pull client1 main +sleep 3; +cd $DIR/client2 +git pull share main +``` + +But I'm still getting the same error: + +``` +(scanning...) (started...) +From /home/xxx/git-annex-scenarios/share-between-clients/share + * [new branch] git-annex -> share/git-annex +(merging share/git-annex into git-annex...) +(recording state in git...) + +merge: refs/remotes/share/main - not something we can merge + +merge: refs/remotes/share/synced/main - not something we can merge +(recording state in git...) +To /home/kolam/git-annex-scenarios/share-between-clients/share + + 28079ec...ca3c481 git-annex -> synced/git-annex (forced update) +Everything up-to-date +To /home/kolam/git-annex-scenarios/share-between-clients/share + + 28079ec...ca3c481 git-annex -> synced/git-annex (forced update) +``` + +However, even though I have that error, `file.txt` now appears in `client2`. +But, the content of `file.txt` is: + +``` +/annex/objects/SHA256E-s14--14b99b7ab1e9777f7e1c2b482fe2cd95653c7cf35f +459ef0b15bd0d75b2245c9.txt +``` + +and that link doesn't exist in my filesystem. +Running `git annex whereis file.txt` in `client2` gives me: + +``` +whereis file.txt (0 copies) failed +whereis: 1 failed +``` + +So my questions are: + +* did I miss something in the steps required to setup the repositories? +* is there some documentation outlining the steps to do so without the webapp? +* how can we enhance the UX for that scenario with better messages? 
From 98a0623ab6a065d0fffa17605b28f78bfe27885e Mon Sep 17 00:00:00 2001 From: "kolam@976e5fa601b60de70b53dad291714218fd749169" Date: Sat, 2 Dec 2023 19:06:00 +0000 Subject: [PATCH 2/8] rename forum/Can__39__t_access_file_from_secondary_client.mdwn to forum/client_repositories_setup_problem.mdwn --- ...condary_client.mdwn => client_repositories_setup_problem.mdwn} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename doc/forum/{Can__39__t_access_file_from_secondary_client.mdwn => client_repositories_setup_problem.mdwn} (100%) diff --git a/doc/forum/Can__39__t_access_file_from_secondary_client.mdwn b/doc/forum/client_repositories_setup_problem.mdwn similarity index 100% rename from doc/forum/Can__39__t_access_file_from_secondary_client.mdwn rename to doc/forum/client_repositories_setup_problem.mdwn From 92f37d0d49b58b689a9e0dc2f63838772c8534fe Mon Sep 17 00:00:00 2001 From: kdm9 Date: Sun, 3 Dec 2023 10:16:43 +0000 Subject: [PATCH 3/8] new pidlock bug --- ...dition_or_double-locking_with_pidlock.mdwn | 57 +++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 doc/bugs/Race_condition_or_double-locking_with_pidlock.mdwn diff --git a/doc/bugs/Race_condition_or_double-locking_with_pidlock.mdwn b/doc/bugs/Race_condition_or_double-locking_with_pidlock.mdwn new file mode 100644 index 0000000000..db93536ffb --- /dev/null +++ b/doc/bugs/Race_condition_or_double-locking_with_pidlock.mdwn @@ -0,0 +1,57 @@ +### Please describe the problem. + +When doing `git annex sync` with new changes from a remote (i.e. synced/main and/or some_remote/main is ahead of our main), git annex seems to try and lock at least two things/times. With pidlock, this of course isn't possible, so somewhere around a `git merge`, we get the following error: + +``` + waiting for pid lock file .git/annex/pidlock which is held by another process (or may be stale) +``` + +When I inspect the content of the pidlock, the `git-annex-sync` process has the lock. 
+ +Manually running `git merge ` and then `git annex sync` doesn't have this issue, so it seems related to merging changes to the main branch (not the git-annex branch). + +### What steps will reproduce the problem? + +I've really struggled to find a minimal reproducer, but I've hit this bug with several large real-world repos (@joeyh, I would be more than happy to give private access to one of these if you think it would be useful for debugging) + +The latest time this happened, this was the full log: + +``` +$ git annex sync +pull origin + +Updating 130dffc63..f8889be0c + waiting for pid lock file .git/annex/pidlock which is held by another process (or may be stale) +#### hangs indefinitely ###### +^C +$ git merge origin/main +Updating 130dffc63..f8889be0c +Updating files: 100% (223/223), done. +Fast-forward + .gitignore + +$ git annex sync +# merges git-annex branch and pushes to all remotes successfully +``` + +Sometimes, but not always, it seems that a git merge updates the files on disk, but not the git index, leading to an inconsistent state where I have the working tree of the latest commit, but git believes I'm still on the older HEAD and shows the diff as unstaged changes. In these cases one must `git reset --hard HEAD && git clean -df` to clear the state back to HEAD, and then git merge manually, and only then will git annex sync behave as expected. + +### What version of git-annex are you using? On what operating system? + +This issue seems to only exist on versions 10.xxxx, and I remember first running into this a bit over a year ago (I first assumed that it was user error, but I've since had it occur quite a few times where it can't be, e.g. freshly logging into a server that was just restarted). 
At least the following versions are affected: + +* git-annex version: 10.20220526-gc6b112108 +* git-annex version: 10.20230803-gb2887edc9 +* git-annex version: 10.20230926-g44a7b4c9734adfda5912dd82c1aa97c615689f57 + +This is on various linuxes, mostly a few years old as these are institutional supercomputing clusters (ubuntu 20.04, debian 10, SLES 15.4). + +### Please provide any additional information below. + +This only affects clones with pidlock enabled (on compute clusters with NFS filesystems), the same repo on a laptop or whatever with a standard local filesystem (e.g. ext4, xfs) works perfectly. + +Could this be caused by e.g. git annex running git merge which runs git annex filterprocess (directly or via git status), and git-annex-filterprocess tries to take the pidlock that git-annex-sync already has? + +### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) + +Lots! This problem popped up during our regular use of git-annex in plant genomic research, where we use git annex to manage and move our analyses between the many clusters we must use for computation. Git annex is indispensable for this use case!! 
From 9eee11d7a855f612f84fc2a6a0543e6da1266f96 Mon Sep 17 00:00:00 2001 From: branch Date: Sun, 3 Dec 2023 11:57:56 +0000 Subject: [PATCH 4/8] Added a comment --- ...2_78b89867bd1af78a91826022651a57ad._comment | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 doc/forum/git-remote-gcrypt_and_rsyncd/comment_2_78b89867bd1af78a91826022651a57ad._comment diff --git a/doc/forum/git-remote-gcrypt_and_rsyncd/comment_2_78b89867bd1af78a91826022651a57ad._comment b/doc/forum/git-remote-gcrypt_and_rsyncd/comment_2_78b89867bd1af78a91826022651a57ad._comment new file mode 100644 index 0000000000..57057f4397 --- /dev/null +++ b/doc/forum/git-remote-gcrypt_and_rsyncd/comment_2_78b89867bd1af78a91826022651a57ad._comment @@ -0,0 +1,18 @@ +[[!comment format=mdwn + username="branch" + subject="comment 2" + date="2023-12-03T11:57:56Z" + content=""" +There is no specific log to highlight when running the command with `--debug`. + +``` +[2023-12-03 12:43:49.274023] (Utility.Process) process [40369] done ExitSuccess + +git-annex: git: createProcess: chdir: invalid argument (Bad file descriptor) +failed +[2023-12-03 12:43:49.276644] (Utility.Process) process [40197] done ExitSuccess +initremote: 1 failed +``` + +I ended up refactoring my systems to allow the use of SSH, which seems to be the supported method, and to avoid any further issues down the line. 
+"""]] From a0540498b45a75255e0f7a5491aa5ef399583afe Mon Sep 17 00:00:00 2001 From: Atemu Date: Sun, 3 Dec 2023 21:11:19 +0000 Subject: [PATCH 5/8] Added a comment --- .../comment_4_c3130a2595fc35525dfdbcc6cec57713._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/bugs/__96__git_annex_info__96___hangs_with_git_special_remote/comment_4_c3130a2595fc35525dfdbcc6cec57713._comment diff --git a/doc/bugs/__96__git_annex_info__96___hangs_with_git_special_remote/comment_4_c3130a2595fc35525dfdbcc6cec57713._comment b/doc/bugs/__96__git_annex_info__96___hangs_with_git_special_remote/comment_4_c3130a2595fc35525dfdbcc6cec57713._comment new file mode 100644 index 0000000000..12a0352361 --- /dev/null +++ b/doc/bugs/__96__git_annex_info__96___hangs_with_git_special_remote/comment_4_c3130a2595fc35525dfdbcc6cec57713._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="Atemu" + avatar="http://cdn.libravatar.org/avatar/86b8c2d893dfdf2146e1bbb8ac4165fb" + subject="comment 4" + date="2023-12-03T21:11:19Z" + content=""" +I'd flip that around; make `--fast` the default and add a `--full` flag to show full info. I rarely need it. +"""]] From 4e7f4441bc9f110e89fe24b089beabb1ddc1938f Mon Sep 17 00:00:00 2001 From: "brendan.ward@a2e11ad27f6b2fa2c556aea6811496e0d95dd0da" Date: Mon, 4 Dec 2023 06:41:28 +0000 Subject: [PATCH 6/8] --- doc/forum/annex.largefile_not_working.mdwn | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) create mode 100644 doc/forum/annex.largefile_not_working.mdwn diff --git a/doc/forum/annex.largefile_not_working.mdwn b/doc/forum/annex.largefile_not_working.mdwn new file mode 100644 index 0000000000..6a03833434 --- /dev/null +++ b/doc/forum/annex.largefile_not_working.mdwn @@ -0,0 +1,20 @@ +I seem to be having issues with annex.largefiles. I initialize git and the annex, then I set largefiles to put everything in the annex, generate a 1Mb file, add it, and commit it. 
The file is copied and renamed to its hash value in .git/annex/objects but the file also remains in the main directory instead of being replaced with a symlink. Here are my steps to create the issue: + + git init + git annex init + git annex config --set annex.largefiles anything + fallocate -l 1M test.bin + git add test.bin + git commit -a -m "Test" + +I've also tried creating a .gitattributes file in the main directory with the following attribute: + + * annex.largefiles=anything + +Still, nothing is symlinked. + +It works just fine when I run `git annex add test.bin`. It puts the file in the annex and creates a symlink to it. + +I've tried this on Fedora 39 with git annex version 10.20230626 and on Ubuntu 22.04.2 LTS with git annex version 8.20210223. These are both fresh machines that have never had git or git-annex run on them before. + +What am I doing wrong here? Should I be filing a bug report? From 49374fd9c630414ad540786ae24cb6c2c908ab47 Mon Sep 17 00:00:00 2001 From: "brendan.ward@a2e11ad27f6b2fa2c556aea6811496e0d95dd0da" Date: Mon, 4 Dec 2023 06:43:03 +0000 Subject: [PATCH 7/8] --- doc/forum/annex.largefile_not_working.mdwn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/forum/annex.largefile_not_working.mdwn b/doc/forum/annex.largefile_not_working.mdwn index 6a03833434..d15de6b022 100644 --- a/doc/forum/annex.largefile_not_working.mdwn +++ b/doc/forum/annex.largefile_not_working.mdwn @@ -1,4 +1,4 @@ -I seem to be having issues with annex.largefiles. I initialize git and the annex, then I set largefiles to put everything in the annex, generate a 1Mb file, add it, and commit it. The file is copied and renamed to its hash value in .git/annex/objects but the file also remains in the main directory instead of being replaced with a symlink. Here are my steps to create the issue: +I seem to be having issues with annex.largefiles. 
I initialize git and the annex, then I set largefiles to put everything in the annex, generate a 1Mb file, `git add` it, and commit it. The file is copied and renamed to its hash value in .git/annex/objects but the file also remains in the main directory instead of being replaced with a symlink. Here are my steps to create the issue: git init git annex init From 39fed072892a7c36a2c179c933fc6c3f993d54ad Mon Sep 17 00:00:00 2001 From: kdm9 Date: Mon, 4 Dec 2023 10:09:16 +0000 Subject: [PATCH 8/8] Added a comment --- ...mment_1_b2ecb8b60603929bae91c3007817585f._comment | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 doc/forum/annex.largefile_not_working/comment_1_b2ecb8b60603929bae91c3007817585f._comment diff --git a/doc/forum/annex.largefile_not_working/comment_1_b2ecb8b60603929bae91c3007817585f._comment b/doc/forum/annex.largefile_not_working/comment_1_b2ecb8b60603929bae91c3007817585f._comment new file mode 100644 index 0000000000..200a8dcdce --- /dev/null +++ b/doc/forum/annex.largefile_not_working/comment_1_b2ecb8b60603929bae91c3007817585f._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="kdm9" + avatar="http://cdn.libravatar.org/avatar/b7b736335a0e9944a8169a582eb4c43d" + subject="comment 1" + date="2023-12-04T10:09:15Z" + content=""" +I think this is intended behavior when adding with `git add`, or at least it's what I've seen for long enough for me to have forgotten if it ever was different. `git annex add` will create symlinks, as will `git add && git annex lock`. + +If this was actually a small file, you wouldn't see it hashed & copied under .git/annex/objects. You should also see in git log that the change is an addition of some git annex key, not a git blob diff as would be the case for a small file. + +NB: I'm just another user, @joey please correct me if this is wrong +"""]]