From 1113caa53efedbe7ab1d98b74010160f20473e8d Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 8 Oct 2019 14:01:12 -0400 Subject: [PATCH] preserve unlocked file mtime when dropping When dropping an unlocked file, preserve its mtime, which avoids git status unncessarily running the clean filter on the file. If the index file has close to the same mtime as a work tree file, git will not trust the index to be up-to-date, and re-runs the clean filter unncessarily. Preserving the mtime when depopulating a pointer file avoids git status doing a little (or maybe a lot) of unncessary work. There are other places that the mtime could be preserved, including other places where pointer files are written perhaps, but also populatePointerFile. But, I don't know of cases where those lead to git status doing unncessary work, so I just fixed the one I'm aware of for now. --- Annex/Content/PointerFile.hs | 17 ++++++++++-- CHANGELOG | 2 ++ ..._bc76561ae4aaf932217c40109a8d04bd._comment | 13 ++++++++++ ..._a4c9ccb93a809aff61ae208d31ba0363._comment | 13 ++++++++++ ..._919fb06b3fc7fe9b54797a805873e3c8._comment | 26 +++++++++++++++++++ ..._b73f9c5603bc69867e1619d871a8abb7._comment | 19 ++++++++++++++ 6 files changed, 88 insertions(+), 2 deletions(-) create mode 100644 doc/bugs/git_keeps_refreshing_index/comment_1_bc76561ae4aaf932217c40109a8d04bd._comment create mode 100644 doc/bugs/git_keeps_refreshing_index/comment_2_a4c9ccb93a809aff61ae208d31ba0363._comment create mode 100644 doc/bugs/git_keeps_refreshing_index/comment_3_919fb06b3fc7fe9b54797a805873e3c8._comment create mode 100644 doc/bugs/git_keeps_refreshing_index/comment_4_b73f9c5603bc69867e1619d871a8abb7._comment diff --git a/Annex/Content/PointerFile.hs b/Annex/Content/PointerFile.hs index 2949ac96b0..2ed0db5ab9 100644 --- a/Annex/Content/PointerFile.hs +++ b/Annex/Content/PointerFile.hs @@ -9,15 +9,20 @@ module Annex.Content.PointerFile where +#if ! defined(mingw32_HOST_OS) +import System.Posix.Files +#else import System.PosixCompat.Files +#endif import Annex.Common import Annex.Perms import Annex.Link import Annex.ReplaceFile import Annex.InodeSentinal -import Utility.InodeCache import Annex.Content.LowLevel +import Utility.InodeCache +import Utility.Touch {- Populates a pointer file with the content of a key. - @@ -48,10 +53,18 @@ populatePointerFile restage k obj f = go =<< liftIO (isPointerFile f) - Does not check if the pointer file is modified. -} depopulatePointerFile :: Key -> FilePath -> Annex () depopulatePointerFile key file = do - mode <- liftIO $ catchMaybeIO $ fileMode <$> getFileStatus file + st <- liftIO $ catchMaybeIO $ getFileStatus file + let mode = fmap fileMode st secureErase file liftIO $ nukeFile file ic <- replaceFile file $ \tmp -> do liftIO $ writePointerFile tmp key mode +#if ! defined(mingw32_HOST_OS) + -- Don't advance mtime; this avoids unncessary re-smudging + -- by git in some cases. + liftIO $ maybe noop + (\t -> touch tmp t False) + (fmap modificationTimeHiRes st) +#endif withTSDelta (liftIO . genInodeCache tmp) maybe noop (restagePointerFile (Restage True) file) ic diff --git a/CHANGELOG b/CHANGELOG index 8770360bf5..71762eb076 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -19,6 +19,8 @@ git-annex (7.20190913) UNRELEASED; urgency=medium it was only updated after 1000 files were checked, which may be more files that are possible to fsck in a given fsck time window. Thanks to Peter Simons for help with analysis of this bug. + * When dropping an unlocked file, preserve its mtime, which avoids + git status unncessarily running the clean filter on the file. -- Joey Hess Thu, 19 Sep 2019 11:11:19 -0400 diff --git a/doc/bugs/git_keeps_refreshing_index/comment_1_bc76561ae4aaf932217c40109a8d04bd._comment b/doc/bugs/git_keeps_refreshing_index/comment_1_bc76561ae4aaf932217c40109a8d04bd._comment new file mode 100644 index 0000000000..3dbdc4bd41 --- /dev/null +++ b/doc/bugs/git_keeps_refreshing_index/comment_1_bc76561ae4aaf932217c40109a8d04bd._comment @@ -0,0 +1,13 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 1""" + date="2019-10-08T16:22:16Z" + content=""" +When using v7 repositories, git will do this when there are unlocked files +that have had their content changed. + +git only displays that message when it thinks its going to have to +do a lot of work (re-smudging a lot of files), so you'd mostly see this +happen in a larger repository that has had a lot of (unlocked) files change +in some way. +"""]] diff --git a/doc/bugs/git_keeps_refreshing_index/comment_2_a4c9ccb93a809aff61ae208d31ba0363._comment b/doc/bugs/git_keeps_refreshing_index/comment_2_a4c9ccb93a809aff61ae208d31ba0363._comment new file mode 100644 index 0000000000..0770a76e32 --- /dev/null +++ b/doc/bugs/git_keeps_refreshing_index/comment_2_a4c9ccb93a809aff61ae208d31ba0363._comment @@ -0,0 +1,13 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 2""" + date="2019-10-08T16:37:49Z" + content=""" +I noticed that `git annex drop` of an unlocked file causes the next `git +status` to re-smudge the file. That's surprising because git-annex +internally updates the index using git update-index, so git should not see +any need to revisit it. + +So my guess is you were getting or dropping a lot of unlocked files when +you saw that. +"""]] diff --git a/doc/bugs/git_keeps_refreshing_index/comment_3_919fb06b3fc7fe9b54797a805873e3c8._comment b/doc/bugs/git_keeps_refreshing_index/comment_3_919fb06b3fc7fe9b54797a805873e3c8._comment new file mode 100644 index 0000000000..176c89c133 --- /dev/null +++ b/doc/bugs/git_keeps_refreshing_index/comment_3_919fb06b3fc7fe9b54797a805873e3c8._comment @@ -0,0 +1,26 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2019-10-08T16:43:49Z" + content=""" +Dropping a single unlocked file and then running git +status with `GIT_TRACE=1` replicates the additional smudge every time. + +The index before and after git status smudging have identical content, so +git-annex seems to have updated it correctly already. It seems that the +mtime of the index file is causing git to do that one additional smudge. + +Eg, the file git-annex dropped had a mtime of 13:02:43.716964483. The +~/index file that git-annex generated has a mtime of 13:02:43.752964130 +which is newer, but only by a fraction of a second. So, git probably +assumes the mtimes are sufficiently close that it can't trust that the +index file really reflects the current content of the work tree file. And +so it re-smudges the work tree file unncessarily. + +I have not been able to find a number of files to drop that replicates +the bug report. When a lot of files are dropped, it starts taking +sufficiently long to update the index file that it ends up with a newer +timestamp, which avoids the unncessary additional smudging. The worse +case I have found here is dropping 9 files causes all 9 to get re-smudged, +but that's not enough to get git to use the progress display. +"""]] diff --git a/doc/bugs/git_keeps_refreshing_index/comment_4_b73f9c5603bc69867e1619d871a8abb7._comment b/doc/bugs/git_keeps_refreshing_index/comment_4_b73f9c5603bc69867e1619d871a8abb7._comment new file mode 100644 index 0000000000..104e325c3c --- /dev/null +++ b/doc/bugs/git_keeps_refreshing_index/comment_4_b73f9c5603bc69867e1619d871a8abb7._comment @@ -0,0 +1,19 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 4""" + date="2019-10-08T17:32:52Z" + content=""" +Anyway, if git-annex could preserve the mtime of an unlocked file when +writing its pointer file or when populating it with content, that would +avoid the unncessary smudging. (Which seems better than adding a delay when +updating the index file, or setting the index's mtime ahead of the current +time..) + +That's easier done for pointer files than populated files, because +with annex.thin the content is a hard link and it would probably not be +good to change its mtime. + +For now, I didn't do it extensively, but only in depopulatePointerFile. +That was enough to eliminate the unncessary smudging after drop that I was +seeing. +"""]]