better index file refresh method
Use git update-index --refresh, since it's a little more efficient, and the user can be told to run it if a locked index prevents git-annex from running it.

This also fixes the problem where an annexed file was deleted in the index and a get of another file that uses the same key caused the index update to add back the deleted file. update-index will not add back the deleted file.

Documented in tips/unlocked_files.mdwn the gotcha that the index update may conflict with other operations. I can't see any way to possibly avoid that conflict.

One new todo about a race that causes a modification to be accidentally staged.

Note that the assistant only flushes the git command queue when it commits a modification. I have not tested the assistant with v6 unlocked files, but assume most users of the assistant won't care if the index shows a file as modified for a while.

This commit was supported by the NSF-funded DataLad project.
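The queueing mentioned in the commit message can be pictured with a small, self-contained sketch. This is not git-annex's `Annex.Queue`; the `Queue` type and the `addCommand` and `flushCommands` functions below are hypothetical stand-ins, assuming only that files queued for the same git subcommand and parameters are merged and run as one command when the queue is flushed.

```haskell
-- Hypothetical, simplified model of a git command queue.
-- A queued command is identified by its subcommand and parameters.
type CommandKey = (String, [String])

-- Files accumulated per distinct command, in insertion order.
type Queue = [(CommandKey, [FilePath])]

-- Queue a file for a command; repeated calls for the same command
-- only append the file, so queueing is cheap.
addCommand :: String -> [String] -> FilePath -> Queue -> Queue
addCommand sub params f = go
  where
    key = (sub, params)
    go [] = [(key, [f])]
    go ((k, fs) : rest)
        | k == key = (k, fs ++ [f]) : rest
        | otherwise = (k, fs) : go rest

-- Render the command lines a flush would run: one per distinct command.
flushCommands :: Queue -> [[String]]
flushCommands q = [ "git" : sub : params ++ fs | ((sub, params), fs) <- q ]

main :: IO ()
main = do
    let refresh = addCommand "update-index" ["-q", "--refresh"]
        q = refresh "b.dat" (refresh "a.dat" [])
    -- Prints: git update-index -q --refresh a.dat b.dat
    mapM_ (putStrLn . unwords) (flushCommands q)
```

In this model nothing touches the index until the flush, which is why a file can show as modified for a while if the queue is flushed late.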
parent 5e87389f40
commit 82cfcfc838
4 changed files with 69 additions and 46 deletions
@@ -24,7 +24,6 @@ module Annex.Content (
 	checkDiskSpace,
 	needMoreDiskSpace,
 	moveAnnex,
-	Restage(..),
 	populatePointerFile,
 	linkToAnnex,
 	linkFromAnnex,
@@ -597,26 +596,11 @@ populatePointerFile restage k obj f = go =<< liftIO (isPointerFile f)
 		ifM (linkOrCopy k obj f destmode)
 			( do
 				thawContent f
-				restagePointerFile restage k f destmode
+				restagePointerFile restage f
 			, liftIO $ writePointerFile f k destmode
 			)
 	go _ = return ()
 
-newtype Restage = Restage Bool
-
-{- Re-stages a pointer file. This is used after updating a worktree file
- - when content is added/removed, to prevent git from treating the worktree
- - file as modified.
- -
- - If the index is known to be locked (eg, git add has run git-annex),
- - the staging would fail, and Restage False will prevent it.
- -}
-restagePointerFile :: Restage -> Key -> FilePath -> Maybe FileMode -> Annex ()
-restagePointerFile (Restage False) _ _ _ = return ()
-restagePointerFile (Restage True) k f mode = do
-	pointersha <- hashPointerFile k
-	stagePointerFile f mode pointersha
-
 data LinkAnnexResult = LinkAnnexOk | LinkAnnexFailed | LinkAnnexNoop
 
 {- Populates the annex object file by hard linking or copying a source
@@ -856,7 +840,7 @@ removeAnnex (ContentRemovalLock key) = withObjectLoc key remove removedirect
 			secureErase file
 			liftIO $ nukeFile file
 			liftIO $ writePointerFile file key mode
-			restagePointerFile (Restage True) key file mode
+			restagePointerFile (Restage True) file
 		-- Modified file, so leave it alone.
 		-- If it was a hard link to the annex object,
 		-- that object might have been frozen as part of the

@@ -7,7 +7,7 @@
  -
  - Pointer files are used instead of symlinks for unlocked files.
  -
- - Copyright 2013-2015 Joey Hess <id@joeyh.name>
+ - Copyright 2013-2018 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU GPL version 3 or higher.
  -}
@@ -131,6 +131,41 @@ writePointerFile file k mode = do
 	writeFile file (formatPointer k)
 	maybe noop (setFileMode file) mode
 
+newtype Restage = Restage Bool
+
+{- Restage pointer file. This is used after updating a worktree file
+ - when content is added/removed, to prevent git status from showing
+ - it as modified.
+ -
+ - Asks git to refresh its index information for the file.
+ - That in turn runs the clean filter on the file; when the clean
+ - filter produces the same pointer that was in the index before, git
+ - realizes that the file has not actually been modified.
+ -
+ - Note that, if the pointer file is staged for deletion, or has different
+ - content than the current worktree content staged, this won't change
+ - that. So it's safe to call at any time and any situation.
+ -
+ - If the index is known to be locked (eg, git add has run git-annex),
+ - that would fail. Restage False will prevent the index being updated.
+ - Will display a message to help the user understand why
+ - the file will appear to be modified.
+ -
+ - This uses the git queue, so the update is not performed immediately,
+ - and this can be run multiple times cheaply.
+ -}
+restagePointerFile :: Restage -> FilePath -> Annex ()
+restagePointerFile (Restage False) f = toplevelWarning True $ unwords
+	[ "git status will show " ++ f
+	, "to be modified, since its content availability has changed."
+	, "This is only a cosmetic problem affecting git status; git add,"
+	, "git commit, etc won't be affected."
+	, "To fix the git status display, you can run:"
+	, "git update-index -q --refresh " ++ f
+	]
+restagePointerFile (Restage True) f =
+	Annex.Queue.addCommand "update-index" [Param "-q", Param "--refresh"] [f]
+
 {- Parses a symlink target or a pointer file to a Key.
  - Only looks at the first line, as pointer files can have subsequent
  - lines. -}

@@ -110,6 +110,23 @@ to having them all unlocked, you can do so using `git annex adjust
 useful when using filesystems like FAT, and OS's like Windows that don't
 support symlinks.
 
+## index gotchas
+
+When git-annex gets or drops the content of an unlocked file, it updates
+the file in git's worktree accordingly. Then it needs to update the index
+file to reflect the change. Otherwise, `git status` would show the file
+as modified, even though there are no changes to commit.
+
+This means that when git-annex is running a command that gets or drops the
+content of an unlocked file, the index will sometimes be locked. This might
+prevent you from `git commit` at the same time. Or, if you have a git
+commit in progress, or are running multiple git-annex processes, git-annex
+may complain that the index is locked.
+
+To manually update the index when git-annex was not able to, you can run:
+
+	git update-index -q --refresh $file
+
 ## using less disk space
 
 Unlocked files are handy, but they have one significant disadvantage
@@ -154,7 +171,7 @@ So, using `git checkout` to check out a different branch, or even
 working tree, and using more disk space. A warning will be printed out in
 this situation. You can always run `git annex fix` to re-thin such files.
 
-## tradeoffs
+## annex.thin tradeoffs
 
 [[!template id=note text="""
 When a [[direct_mode]] repository is upgraded, annex.thin is automatically

@@ -12,33 +12,20 @@ git-annex should use smudge/clean filters.
 	    # because it doesn't know it has that name
 	    # git commit clears up this mess
 
-* If an unlocked file's content is not present, and a new file with
-  identical content is added with `git add`, the unlocked file is
-  populated, but git-annex is unable to update the index, so git status
-  will say that it has been modified.
+* If the user is getting a file that was not present, and at the same
+  time overwrites the file with new content, the new content can be staged
+  accidentally when git-annex runs git update-index on the file.
 
-* If an annexed file is deleted in the index, and another annexed file
-  uses the same key, and git annex get/drop is run, the index update
-  that's done to prevent status showing the file as modified adds
-  the deleted file back to the index.
-
-* Also, if the user is getting files, and modifying files at the same
-  time, and they stage their modifications, the modification may get
-  unstaged in a race when a file is got and the updated worktree file
-  staged in the index.
-
-  I don't know if this is worth worrying about,
-  because there's also of course a race where the modification to the
-  worktree file may get reverted when git-annex updates the content. Those
-  races are much smaller, but do exist.
+  This race's window is wide because git-annex will process annex.queuesize
+  files before updating the index. It could be narrowed by running
+  update-index more frequently. Or, could check for modified files before
+  running it and throw those out, which would narrow the window a lot,
+  but not eliminate the race entirely.
 
-* get/drop operations on unlocked files lead to an update of the index.
-  Only one process can update the index at one time, so eg, git annex get
-  at the same time as a git commit may display a ugly warning
-  (or the git commit could fail to start if run at just the right time).
-
-  Two git-annex get processes can also try to update the index at the
-  same time and encounter this problem (git annex get -J is ok).
+  (Of course there's also a race where the modification gets overwritten
+  by git-annex when it updates the worktree. Which is much like the race
+  that git checkout/pull/merge can overwrite a modification, which is small
+  and unlikely but afaik unclosable.)
 
 * Potentially: Use git's new `filter.<driver>.process` interface, which will
   let only 1 git-annex process be started by git when processing
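The todo item's idea of checking for modified files before running update-index and throwing those out might look roughly like the following sketch. Everything here is hypothetical and not part of this commit: the `FileStat` record, the `unchangedFiles` function, and the assumption that a size and mtime snapshot is recorded right after git-annex writes each worktree file.

```haskell
-- Hypothetical snapshot of a worktree file, taken right after git-annex
-- writes it. The fields are assumptions chosen for illustration.
data FileStat = FileStat
    { statSize :: Integer
    , statMTime :: Integer
    } deriving (Eq, Show)

-- Keep only the queued files whose current stat still matches the
-- snapshot. Missing or changed files are left out of the refresh.
unchangedFiles :: [(FilePath, FileStat)] -> (FilePath -> Maybe FileStat) -> [FilePath]
unchangedFiles snapshot current =
    [ f | (f, s) <- snapshot, current f == Just s ]

main :: IO ()
main = do
    let snapshot =
            [ ("a.dat", FileStat 10 100)
            , ("b.dat", FileStat 20 100)
            ]
        -- Pretend b.dat was overwritten by the user after the snapshot.
        current f = lookup f
            [ ("a.dat", FileStat 10 100)
            , ("b.dat", FileStat 25 105)
            ]
    -- Prints: ["a.dat"]
    print (unchangedFiles snapshot current)
```

As the todo notes, a check like this only narrows the window: a file modified after the check but before update-index runs would still be staged.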