diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment
new file mode 100644
index 0000000000..aab4ae2c9c
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2024-05-31T14:52:17Z"
+ content="""
+Reproduction recipe works, thanks!
+
+Happens back to 10.20240129 at least, this is not recent breakage.
+
+There are some interesting things in the git-annex history.
+Including some git-annex:export.tree grafting, and also
+a continued transition.
+
+I made a new empty repo, initialized and annexed some files. Running the
+same script but cloning that, this problem does not occur. I also tried
+exporting a tree in that repo, and still the problem doesn't occur. I even
+tried running `git-annex forget` in there and still can't cause the
+problem.
+
+So something about this specific repo's git-annex branch history is
+triggering the problem and I don't know what. I've archived the current
+state of this repo in my big repo as git-annex-test-repos/ds002144.tar.gz
+to make sure I can continue to reproduce this.
+
+The first git-annex branch commit that is missing its tree object
+is a git-annex:export.tree graft commit. That is 3 commits above
+the git-annex branch pulled from github:
+
+Very interesting. Especially since the point of those export.tree graft commits
+is to make sure that the exported tree objects are referenced and so don't get
+gced out from under us.
+"""]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment
new file mode 100644
index 0000000000..00bfab72fd
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment
@@ -0,0 +1,54 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2024-05-31T15:26:26Z"
+ content="""
+Resetting the repo's git-annex branch all the way back to the 1st commit in it
+is sufficient to reproduce this bug.
+
+    joey@darkstar:~/tmp/ds002144#main>git log git-annex
+    commit 2e24112747f3742c5426138def93fd3219574df7 (git-annex)
+    Author: Git Worker
+    Date:   Fri Jan 19 21:04:18 2024 +0000
+
+        new branch for transition ["forget git history"]
+
+Hmm. That ref contains an export.log that references some tree shas.
+
+    1596548650.649001679s 2606f878-85c6-459a-8402-5f4b98720bbd:58a4efbe-8fb4-4cb3-8be3-b982a4673947 b78b723042e6d7a967c806b52258e8554caa1696 ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+    1705698180.15617956s 2606f878-85c6-459a-8402-5f4b98720bbd:8af9d961-216f-47ec-b052-31696fc2f12d ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e 28b655e8207f916122bbcbd22c0369d86bb4ffc1
+
+Those seem familiar:
+
+    missing tree b78b723042e6d7a967c806b52258e8554caa1696
+    missing tree ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+
+So ok.. We have here a transition that forgot git history. But it kept an
+export.log that referenced 2 trees in that now-forgotten git history.
+
+Everything else seems to follow from that. Grafting those trees back into the
+git-annex branch in order to not forget them is a bad move since they're
+already forgotten. So it could just avoid doing that, if the tree object
+is missing, I suppose.
+
+There might be a deeper bug though: If we want to `git-annex export`, in either
+the original repo with forgotten history, or in a clone, it won't be able to
+refer to those tree objects. So it won't know what has been written to the
+special remote. So eg, if we export a tree that deletes a file compared to one
+of these trees, it wouldn't delete the file from the special remote.
+I think this problem might not happen when exporting in the original repo,
+because there the export database also records the same information. More likely
+it will happen in a clone.
+
+So, action items:
+
+* When performing a transition, the trees mentioned in export.log need to be
+  grafted back in, in order not to lose them. I think it already is supposed to
+  do that, but it clearly didn't work in this case. So I need to find a way to
+  reproduce the situation in commit 2e24112747f3742c5426138def93fd3219574df7 in
+  a new repository to find out why that didn't happen. And fix that.
+* When encountering a git-annex branch with this situation in it, avoid
+  grafting missing trees back into the branch. And probably `git-annex export`
+  needs to refuse to touch the affected special remote, or warn the user
+  that it's lost track of what files were sent to the special remote.
+"""]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment
new file mode 100644
index 0000000000..da8af4031c
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2024-05-31T15:42:17Z"
+ content="""
+Occurs to me that one way to get a repository into this situation would be
+to do a `git-annex export`, then `git-annex forget`, and then manually
+reset the git-annex branch to `git-annex^^` (or similarly push
+`git-annex^^` to origin).
+
+There is a commit after the transition commit that re-grafts the exported
+tree back into the git-annex branch, and a manual reset would cause exactly
+this situation.
+
+I doubt OpenNeuro is manually resetting the git-annex branch when creating
+these repos, but stranger things have happened...
+"""]]