analysis, plan
This commit was supported by the NSF-funded DataLad project.
parent d7f386a81d
commit 8478544b58
1 changed file with 53 additions and 0 deletions
@@ -0,0 +1,53 @@
[[!comment format=mdwn
username="joey"
subject="""comment 7"""
date="2018-08-27T16:27:47Z"
content="""
I was able to reproduce what I described in comment #2 with current
git-annex.

Also, I was able to reproduce the fsck failure, by touching the file
under .git/annex/objects. Even though the file is not writable, touching
it updates its mtime, and so the inode cache is considered stale.
This makes inAnnex think it's not present.

Another way is, when annex.thin is enabled, to touch the unlocked file,
which also touches the .git/annex/objects file it's hard linked to.
git-annex lock then checksums it (because the inode cache is stale),
concludes it's still good, and so proceeds to lock it, but the stale inode
cache again causes inAnnex to think it's not present.
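
The staleness described above can be shown outside git-annex. This is only a rough sketch, assuming an inode cache entry boils down to an (inode, size, mtime) tuple; the `inode_cache` helper and the file names are made up for illustration and are not git-annex code.

```python
import os
import stat
import tempfile

def inode_cache(path):
    # Hypothetical stand-in for an inode cache entry: (inode, size, mtime).
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)

with tempfile.TemporaryDirectory() as d:
    obj = os.path.join(d, "object")      # plays the annex object file
    work = os.path.join(d, "unlocked")   # plays the unlocked work tree file
    with open(obj, "w") as f:
        f.write("content")
    os.chmod(obj, stat.S_IRUSR)          # read-only, like an annex object
    os.link(obj, work)                   # hard link, as annex.thin would make

    cached = inode_cache(obj)

    # "touch" the work tree file by moving its mtime one second forward.
    st = os.stat(work)
    os.utime(work, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

    # Both names share one inode, so the object's mtime changed as well.
    stale = inode_cache(obj) != cached
    with open(obj) as f:
        unchanged = f.read() == "content"
    print("cache stale:", stale, "content unchanged:", unchanged)
```

The content never changes, but the cached tuple no longer matches, which is the situation where a check based only on the inode cache gives the wrong answer.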

I still don't know how to reproduce git-annex get redownloading a file that
git-annex find lists.

> Should inAnnex even be checking the inode cache for locked content? This seems unnecessary, and note that it's done for v4 mode as well as v6.

Unnecessary for v4 certainly, but in v6 with annex.thin,
unlocked content is hard linked and could be modified,
so it does need to check that the inode cache is valid.

Could remove the inode cache when locking a file, as long as there
are no other associated files that are unlocked. That solves the problem
for that case, and makes inAnnex avoid some unnecessary work too.

When the same content has a mix of locked and unlocked associated files,
the inode cache needs to remain populated (to support annex.thin and so
git-annex drop will drop the copies of the content from the locked files).
But then if the inode cache for the locked content becomes stale, and
the unlocked files get modified, inAnnex will again be wrong.

So, it seems that inAnnex also needs to check if the annex object
has no other hard links, and if so, treat it as present even when the
inode cache does not match. That's cheaper than checking the inode cache,
so it ought to be done first.

Conclusion: Only check the inode cache for the annex object
when annex.thin has made a hard link to it. If its link count is 1,
inAnnex knows it's present and locked, so it can assume it's good.
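
The order of checks in that conclusion could look roughly like this. It is a sketch under the same assumption that a cache entry is an (inode, size, mtime) tuple; `in_annex`, `inode_cache` and `touch` are hypothetical helpers, not the real inAnnex.

```python
import os
import tempfile

def inode_cache(path):
    # Hypothetical stand-in for an inode cache entry: (inode, size, mtime).
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)

def in_annex(obj, cached):
    # Hypothetical simplification of the proposed check.
    try:
        st = os.stat(obj)
    except FileNotFoundError:
        return False
    # Link count 1: nothing else can have modified the object, so the
    # inode cache is not consulted at all.
    if st.st_nlink == 1:
        return True
    # Hard linked (as with annex.thin): fall back to the inode cache.
    return (st.st_ino, st.st_size, st.st_mtime_ns) == cached

def touch(path):
    # Move the mtime one second forward without changing the content.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))

with tempfile.TemporaryDirectory() as d:
    obj = os.path.join(d, "object")
    with open(obj, "w") as f:
        f.write("content")
    cached = inode_cache(obj)

    touch(obj)
    locked_ok = in_annex(obj, cached)    # link count 1: stale cache ignored
    os.link(obj, os.path.join(d, "unlocked"))
    thin_ok = in_annex(obj, cached)      # link count 2: stale cache matters
    missing_ok = in_annex(os.path.join(d, "missing"), cached)
    print(locked_ok, thin_ok, missing_ok)
```

The link count comes from the same stat call that would be needed for the inode cache comparison anyway, so checking it first adds no extra cost.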

(Also, it's possible there could be a race when locking a file
where the file gets modified after the inode cache is checked,
so it's treated as unmodified but is really modified, and since it's a
hard link to the object file, the object file has the wrong content.
Need to make sure this race can't occur.)
"""]]