Merge branch 'master' of ssh://git-annex.branchable.com
commit
458b3d8e52
6 changed files with 271 additions and 0 deletions
57
doc/bugs/Race_condition_or_double-locking_with_pidlock.mdwn
Normal file
@ -0,0 +1,57 @@
### Please describe the problem.

When doing `git annex sync` with new changes from a remote (i.e. synced/main and/or some_remote/main is ahead of our main), git annex seems to try and lock at least two things/times. With pidlock, this of course isn't possible, so somewhere around a `git merge`, we get the following error:

```
waiting for pid lock file .git/annex/pidlock which is held by another process (or may be stale)
```

When I inspect the content of the pidlock, the `git-annex-sync` process has the lock.
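
One way to double-check which process the lock file names, and whether that process is still running, is a small helper like the one below. It is only a sketch: `pidlock_holder` is a hypothetical name, and it assumes the first number in the lock file is the holder's PID, which this report does not confirm. `kill -0` only sees processes on the local host (and fails for processes owned by other users), so on a cluster a lock taken from another node would be reported as stale.

```bash
# Sketch: report which PID a pid lock file names and whether it is alive.
# Assumption (not confirmed here): the first run of digits in the file
# is the PID of the lock holder.
pidlock_holder() {
    local lockfile="$1" pid
    [ -r "$lockfile" ] || { echo "no lock file"; return 2; }
    # Take the first run of digits found in the file as the PID.
    pid="$(grep -oE '[0-9]+' "$lockfile" | head -n 1)"
    [ -n "$pid" ] || { echo "no pid found"; return 2; }
    # kill -0 sends no signal; it only tests whether we could signal the PID.
    if kill -0 "$pid" 2>/dev/null; then
        echo "held by live pid $pid"
    else
        echo "stale: pid $pid is not running"
        return 1
    fi
}
```

It would be called as `pidlock_holder .git/annex/pidlock` from the top of the repository.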

Manually running `git merge <ref that is ahead>` and then `git annex sync` doesn't have this issue, so it seems related to merging changes to the main branch (not the git-annex branch).

### What steps will reproduce the problem?

I've really struggled to find a minimal reproducer, but I've hit this bug with several large real-world repos (@joeyh, I would be more than happy to give private access to one of these if you think it would be useful for debugging).

The latest time this happened, this was the full log:

```
$ git annex sync
pull origin

Updating 130dffc63..f8889be0c
waiting for pid lock file .git/annex/pidlock which is held by another process (or may be stale)
#### hangs indefinitely ######
^C
$ git merge origin/main
Updating 130dffc63..f8889be0c
Updating files: 100% (223/223), done.
Fast-forward
.gitignore
<and then quite a large diff, including many files created/deleted>
$ git annex sync
# merges git-annex branch and pushes to all remotes successfully
```

Sometimes, but not always, it seems that a git merge updates the files on disk, but not the git index, leading to an inconsistent state where I have the working tree of the latest commit, but git believes I'm still on the older HEAD and shows the diff as unstaged changes. In these cases one must `git reset --hard HEAD && git clean -df` to clear the state back to HEAD, and then git merge manually, and only then will git annex sync behave as expected.

### What version of git-annex are you using? On what operating system?

This issue seems to only exist on versions 10.xxxx, and I remember first running into this a bit over a year ago (I first assumed that it was user error, but I've since had it occur quite a few times where it can't be, e.g. freshly logging into a server that was just restarted). At least the following versions are affected:

* git-annex version: 10.20220526-gc6b112108
* git-annex version: 10.20230803-gb2887edc9
* git-annex version: 10.20230926-g44a7b4c9734adfda5912dd82c1aa97c615689f57

This is on various Linux distributions, mostly a few years old as these are institutional supercomputing clusters (Ubuntu 20.04, Debian 10, SLES 15.4).

### Please provide any additional information below.

This only affects clones with pidlock enabled (on compute clusters with NFS filesystems); the same repo on a laptop or whatever with a standard local filesystem (e.g. ext4, xfs) works perfectly.

Could this be caused by e.g. git annex running git merge, which runs git annex filterprocess (directly or via git status), with git-annex-filterprocess then trying to take the pidlock that git-annex-sync already holds?
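
The suspected sequence can be illustrated with a toy lock. This is not git-annex's implementation; `take_lock` and `LOCK_INHERITED` are made-up names used only for illustration. The sketch shows why a lock keyed on a file's existence cannot be taken a second time by a child of the holder, and how a holder could tell its children that the lock is already theirs.

```bash
# Toy model of a non-reentrant pid lock (hypothetical names throughout).
take_lock() {
    local lockfile="$1"
    # If a parent advertises that it already holds this lock, reuse it
    # instead of trying to create the lock file again.
    if [ "${LOCK_INHERITED:-}" = "$lockfile" ]; then
        echo "inherited"
        return 0
    fi
    # With noclobber, the redirection fails if the file already exists,
    # so the lock can only be created once (on a local filesystem).
    if ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null; then
        echo "acquired"
    else
        echo "blocked"
        return 1
    fi
}
```

Calling `take_lock` twice on the same unused path prints `acquired` and then `blocked`; with `LOCK_INHERITED` set to that path, the second call prints `inherited` instead.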

### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

Lots! This problem popped up during our regular use of git-annex in plant genomic research, where we use git annex to manage and move our analyses between the many clusters we must use for computation. Git annex is indispensable for this use case!!
@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/86b8c2d893dfdf2146e1bbb8ac4165fb"
subject="comment 4"
date="2023-12-03T21:11:19Z"
content="""
I'd flip that around; make `--fast` the default and add a `--full` flag to show full info. I rarely need it.
"""]]
20
doc/forum/annex.largefile_not_working.mdwn
Normal file
@ -0,0 +1,20 @@
I seem to be having issues with annex.largefiles. I initialize git and the annex, then I set largefiles to put everything in the annex, generate a 1 MiB file, `git add` it, and commit it. The file is copied and renamed to its hash value in .git/annex/objects but the file also remains in the main directory instead of being replaced with a symlink. Here are my steps to create the issue:

    git init
    git annex init
    git annex config --set annex.largefiles anything
    fallocate -l 1M test.bin
    git add test.bin
    git commit -a -m "Test"

I've also tried creating a .gitattributes file in the main directory with the following attribute:

    * annex.largefiles=anything

Still, nothing is symlinked.

It works just fine when I run `git annex add test.bin`. It puts the file in the annex and creates a symlink to it.

I've tried this on Fedora 39 with git annex version 10.20230626 and on Ubuntu 22.04.2 LTS with git annex version 8.20210223. These are both fresh machines that have never had git or git-annex run on them before.

What am I doing wrong here? Should I be filing a bug report?
@ -0,0 +1,12 @@
[[!comment format=mdwn
username="kdm9"
avatar="http://cdn.libravatar.org/avatar/b7b736335a0e9944a8169a582eb4c43d"
subject="comment 1"
date="2023-12-04T10:09:15Z"
content="""
I think this is intended behavior when adding with `git add`, or at least it's what I've seen for long enough for me to have forgotten if it ever was different. `git annex add` will create symlinks, as will `git add && git annex lock`.

If this was actually a small file, you wouldn't see it hashed & copied under .git/annex/objects. You should also see in git log that the change is an addition of some git annex key, not a git blob diff as would be the case for a small file.

NB: I'm just another user, @joey please correct me if this is wrong
"""]]
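
To check from the command line which case a committed file falls into, the blob stored in git can be inspected. For an unlocked annexed file (the result of `git add` here), the blob is a short pointer beginning with `/annex/objects/`, whereas a small file is stored with its real content. The helper below is a sketch with a hypothetical name, `is_annex_pointer`; it does not cover locked files, which are stored as symlinks.

```bash
# Succeeds if the content on stdin looks like a git-annex pointer,
# i.e. its first line starts with /annex/objects/.
is_annex_pointer() {
    local first=""
    # read fails on empty input or a missing final newline; that is fine,
    # we only care about whatever ended up in $first.
    IFS= read -r first || true
    case "$first" in
        /annex/objects/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For the example above, `git cat-file -p HEAD:test.bin | is_annex_pointer && echo "annexed (unlocked)"` would print the message if the file went to the annex.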
156
doc/forum/client_repositories_setup_problem.mdwn
Normal file
@ -0,0 +1,156 @@
I'm trying to set up git-annex to sync two clients through a transfer repository, all without the webapp UI.

Here's a reproducible scenario as a bash script:

```bash
#!/usr/bin/env bash

# Just a way to access the script's directory
cd "$(dirname "$0")"
DIR="$(pwd)"

# Create the 1st client repository
mkdir $DIR/client1
cd $DIR/client1
git init && git annex init

# Create the 2nd client repository
mkdir $DIR/client2
cd $DIR/client2
git init && git annex init

# Create the transfer repository
mkdir $DIR/share
cd $DIR/share
git init && git annex init

# Set up the remotes and groups for the transfer repository
cd $DIR/share
git remote add client1 $DIR/client1
git remote add client2 $DIR/client1
git annex group . transfer
git annex group client1 client
git annex group client2 client
git checkout -b main

# Set up the remotes and groups for the 1st client repository.
cd $DIR/client1
git remote add share $DIR/share
git annex group . client
git annex group share transfer
git checkout -b main

# Set up the remotes and groups for the 2nd client repository.
cd $DIR/client2
git remote add share $DIR/share
git annex group . client
git annex group share transfer
git checkout -b main

# Run git-annex assistant for each repository
cd $DIR/client1 && git annex assistant
cd $DIR/client2 && git annex assistant
cd $DIR/share && git annex assistant

# Add a single file to the 1st client.
cd $DIR/client1
echo "My first file" >> file.txt
```
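
Note that in `share`, both `client1` and `client2` are added with the path `$DIR/client1`. A quick check for remotes that share a URL can catch this kind of slip. The following is a sketch: `find_duplicate_remotes` is a hypothetical helper that reads `name url` pairs on stdin, and it assumes that paths contain no whitespace.

```bash
# Reads "name url" pairs on stdin, prints every remote whose URL was
# already used by an earlier remote, and exits non-zero if any were found.
find_duplicate_remotes() {
    awk '
        {
            if ($2 in seen) {
                print $1 " duplicates " seen[$2] " (" $2 ")"
                dup = 1
            } else {
                seen[$2] = $1
            }
        }
        END { exit dup + 0 }
    '
}
```

Inside a repository it could be fed from git itself, for example `git remote -v | awk '$3 == "(fetch)" {print $1, $2}' | find_duplicate_remotes`.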

Result:

client1: I see the auto-commit has been added for file.txt

share: I get the following daemon logs:

```
(scanning...) (started...)
From /home/xxx/git-annex-scenarios/share-between-clients/client1
* [new branch] git-annex -> client2/git-annex
(merging client2/git-annex into git-annex...)
From /home/xxx/git-annex-scenarios/share-between-clients/client1
* [new branch] git-annex -> client1/git-annex

merge: refs/remotes/client2/main - not something we can merge

merge: refs/remotes/client2/synced/main - not something we can merge

merge: refs/remotes/client1/main - not something we can merge

merge: refs/remotes/client1/synced/main - not something we can merge
(merging synced/git-annex into git-annex...)
(recording state in git...)

```

client2: I get the following daemon logs:

```
From /home/xxx/git-annex-scenarios/share-between-clients/share
* [new branch] git-annex -> share/git-annex
(merging share/git-annex into git-annex...)
(recording state in git...)

merge: refs/remotes/share/main - not something we can merge

merge: refs/remotes/share/synced/main - not something we can merge

```

Then, I thought that maybe I needed to do an initial `git pull` for each repository. So I tried adding to the bash script the following lines:

```bash
# Need to do this if there are no commits in the 'client2' and 'share' repositories.
# Or else, I'll get the following logs:
#
# merge: refs/remotes/share/main - not something we can merge
# merge: refs/remotes/share/synced/main - not something we can merge
sleep 3;
cd $DIR/share
git pull client1 main
sleep 3;
cd $DIR/client2
git pull share main
```
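
The fixed `sleep 3` assumes the assistant has committed within three seconds, which may not hold on every machine. A polling helper is one alternative. This is a sketch with a hypothetical `wait_for` function, not something git-annex provides; it retries a command once per second until it succeeds or the timeout expires.

```bash
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds.
# Returns 1 if it still fails after TIMEOUT_SECONDS.
wait_for() {
    local timeout="$1" waited=0
    shift
    until "$@"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

In the script above, the first `sleep 3;` could then become something like `wait_for 30 git -C "$DIR/client1" rev-parse --verify --quiet refs/heads/main >/dev/null`, which waits until `client1` actually has a `main` branch to pull.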

But I'm still getting the same error:

```
(scanning...) (started...)
From /home/xxx/git-annex-scenarios/share-between-clients/share
* [new branch] git-annex -> share/git-annex
(merging share/git-annex into git-annex...)
(recording state in git...)

merge: refs/remotes/share/main - not something we can merge

merge: refs/remotes/share/synced/main - not something we can merge
(recording state in git...)
To /home/kolam/git-annex-scenarios/share-between-clients/share
+ 28079ec...ca3c481 git-annex -> synced/git-annex (forced update)
Everything up-to-date
To /home/kolam/git-annex-scenarios/share-between-clients/share
+ 28079ec...ca3c481 git-annex -> synced/git-annex (forced update)
```

However, even though I have that error, `file.txt` now appears in `client2`.
But, the content of `file.txt` is:

```
/annex/objects/SHA256E-s14--14b99b7ab1e9777f7e1c2b482fe2cd95653c7cf35f459ef0b15bd0d75b2245c9.txt
```

and that link doesn't exist in my filesystem.
Running `git annex whereis file.txt` in `client2` gives me:

```
whereis file.txt (0 copies) failed
whereis: 1 failed
```
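
The content shown for `file.txt` is an annex pointer, and the key inside it carries some metadata: `SHA256E` is the backend and `s14` is the size in bytes, which matches the 14 bytes written by `echo "My first file"`. A small parser can pull those fields out. This is a sketch: `parse_annex_key` is a hypothetical helper and it only handles the backend and size fields.

```bash
# parse_annex_key KEY_OR_POINTER
# Prints the backend and size encoded in a git-annex key.
parse_annex_key() {
    local key="$1" backend fields size=""
    key="${key##*/}"          # drop a leading /annex/objects/ or directory part
    backend="${key%%-*}"      # text before the first "-"
    fields="${key#*-}"        # everything after the backend
    fields="${fields%%--*}"   # metadata fields before the "--" separator
    case "$fields" in
        s[0-9]*) size="${fields#s}"; size="${size%%-*}" ;;
    esac
    echo "backend=$backend size=${size:-unknown}"
}
```

Passing the pointer line above to `parse_annex_key` prints `backend=SHA256E size=14`.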

So my questions are:

* did I miss something in the steps required to set up the repositories?
* is there some documentation outlining the steps to do so without the webapp?
* how can we enhance the UX for that scenario with better messages?
@ -0,0 +1,18 @@
[[!comment format=mdwn
username="branch"
subject="comment 2"
date="2023-12-03T11:57:56Z"
content="""
There is no specific log to highlight when running the command with `--debug`.

```
[2023-12-03 12:43:49.274023] (Utility.Process) process [40369] done ExitSuccess

git-annex: git: createProcess: chdir: invalid argument (Bad file descriptor)
failed
[2023-12-03 12:43:49.276644] (Utility.Process) process [40197] done ExitSuccess
initremote: 1 failed
```

I ended up refactoring my systems to allow the use of SSH, which seems to be the supported method, and to avoid any further issues down the line.
"""]]