some analysis

Joey Hess 2021-06-28 15:00:21 -04:00
parent 26b0895187
commit a492553eca
GPG key ID: DB12DB0FF05F8F38
4 changed files with 62 additions and 0 deletions

@@ -0,0 +1,21 @@
[[!comment format=mdwn
username="joey"
subject="""comment 12"""
date="2021-06-28T18:04:26Z"
content="""
This seems conclusive that the repair is somehow triggering unnecessarily
and also corrupting the repo in this situation.

The comment #3 log shows that the repair is started, and then 1 minute
later a git object is missing.

(It's odd that the log shows a second fsck run after the repair was already
triggered. I do not see a way that this would happen unless fscks are
scheduled very close together.)

The automatic repair is supposed to be a non-destructive repair; the
destructive repair only happens after prompting in the UI.

This also reminds me of a persistent issue with a git-annex repo, using the
assistant, on my sister's laptop corrupting itself.
"""]]

@@ -0,0 +1,20 @@
[[!comment format=mdwn
username="joey"
subject="""comment 13"""
date="2021-06-28T18:13:40Z"
content="""
The repair process moves all pack files to a temp dir and then unpacks the
loose objects from them. So, there is a time window, when the repair is
running, where git objects that were present before will be missing. And if
the assistant stops before that is complete, it would leave it in that
state. Unpacking pack files can take a long time, so this might
be a sufficient explanation.

But then, something must be causing it to incorrectly think it needs
a repair in the first place. Assuming it is incorrect, of course. Either git
fsck is exiting nonzero for some reason, or git-annex thinks it sees
git fsck complain about a missing object that is not really
missing. While there are fsck outputs that it can misinterpret, it
double-checks by trying to cat the object, which should avoid the latter
problem.
"""]]
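
The double-check described in comment 13 can be sketched roughly as follows. This is an illustration only, not git-annex's actual code (git-annex is written in Haskell). The function names are hypothetical, the regular expression assumes 40-digit SHA-1 object names, and `git cat-file -e` is used here as one plausible way to "cat the object".

```python
import re
import subprocess
from typing import Callable, List

# Assumption: SHA-1 repository, so object names are 40 hex digits.
SHA_RE = re.compile(r"\b[0-9a-f]{40}\b")


def candidate_shas(fsck_output: str) -> List[str]:
    # Collect every object name mentioned in the fsck output, in order,
    # without duplicates. Some of these may not be missing at all.
    seen: List[str] = []
    for sha in SHA_RE.findall(fsck_output):
        if sha not in seen:
            seen.append(sha)
    return seen


def object_is_present(repo: str, sha: str) -> bool:
    # `git cat-file -e` exits 0 when the object exists and is valid.
    result = subprocess.run(
        ["git", "-C", repo, "cat-file", "-e", sha],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def confirmed_missing(shas: List[str], is_present: Callable[[str], bool]) -> List[str]:
    # Keep only the candidates that the second check also fails to find.
    return [sha for sha in shas if not is_present(sha)]
```

In use, something like `confirmed_missing(candidate_shas(output), lambda s: object_is_present(repo, s))` would return only the objects that both checks agree are missing, so a misread fsck line alone would not start a repair.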

@@ -0,0 +1,11 @@
[[!comment format=mdwn
username="joey"
subject="""comment 14"""
date="2021-06-28T18:28:06Z"
content="""
To avoid moving the pack files, repair could set `GIT_OBJECT_DIRECTORY`
to a temp directory, and copy each pack file into it in turn, and unpack.
And after each unpack, move the unpacked objects from the temp directory to
the real object directory, followed by deleting the pack file (in case it's
corrupt).
"""]]
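
The approach proposed in comment 14 might look something like the sketch below. It is a minimal illustration under stated assumptions, not the implementation that git-annex adopted. The function names are hypothetical. The pack copy is kept beside the temporary object directory rather than inside its `pack/` subdirectory, so that git does not treat the objects as already present. The `-r` flag of `git unpack-objects` asks it to recover as many objects as it can from a corrupt pack.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def move_loose_objects(src: Path, dst: Path) -> int:
    # Move loose objects (src/ab/cdef...) into dst, skipping any object
    # that dst already has. Returns the number of objects moved.
    moved = 0
    for fanout in sorted(p for p in src.iterdir() if p.is_dir() and len(p.name) == 2):
        (dst / fanout.name).mkdir(parents=True, exist_ok=True)
        for obj in sorted(fanout.iterdir()):
            target = dst / fanout.name / obj.name
            if target.exists():
                obj.unlink()
            else:
                shutil.move(str(obj), str(target))
                moved += 1
    return moved


def unpack_packs_one_at_a_time(git_dir: Path) -> None:
    # Hypothetical driver: the real pack stays in place until its objects
    # have been unpacked elsewhere and moved into the real object directory.
    git_dir = git_dir.resolve()
    objects = git_dir / "objects"
    for pack in sorted((objects / "pack").glob("*.pack")):
        with tempfile.TemporaryDirectory(dir=git_dir) as tmp:
            tmp_objects = Path(tmp) / "objects"
            tmp_objects.mkdir()
            pack_copy = Path(tmp) / pack.name
            shutil.copy2(pack, pack_copy)
            env = dict(os.environ, GIT_OBJECT_DIRECTORY=str(tmp_objects))
            with open(pack_copy, "rb") as stream:
                # A nonzero exit is tolerated: a corrupt pack may still
                # yield some recoverable objects.
                subprocess.run(
                    ["git", f"--git-dir={git_dir}", "unpack-objects", "-r"],
                    stdin=stream,
                    env=env,
                    check=False,
                )
            move_loose_objects(tmp_objects, objects)
        # Only now delete the pack and its companion files (.idx and so on).
        for leftover in pack.parent.glob(pack.stem + ".*"):
            leftover.unlink()
```

With this ordering, an interruption partway through leaves every object reachable either from a pack that has not been processed yet or as a loose object, which is the property the move-everything-first approach lacks.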

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="joey"
subject="""comment 16"""
date="2021-06-28T18:48:32Z"
content="""
Removing repair from the assistant (and git-annex repair) should be on the
table as a solution to this. It's a whole lot of complexity that might fix
a few users' repos sometimes, but is outside of git-annex's scope and is
mostly only used by assistant users.
"""]]