omit inode from ContentIdentifier for directory special remote
Directory special remotes with importtree=yes now avoid unnecessary overhead when inodes of files have changed, as happens whenever a FAT filesystem gets remounted.

A few unusual edge cases of modifications won't be detected and imported. I think they're unusual enough not to be a concern. It would be possible to add a config setting that controls whether to compare inodes too, but it does not seem worth bothering the user about currently.

I chose to continue to use the InodeCache serialization, just with the inode zeroed. This way, if I later change my mind or make it configurable, I can parse it back to an InodeCache and operate on it. The overhead of storing a 0 in the content identifier log seems worth it.

There is a one-time cost to this change: all directory special remotes with importtree=yes will re-hash all files once, and will update the content identifier logs with zeroed inodes.

This commit was sponsored by Brett Eisenberg on Patreon.
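The idea of keeping the serialization but zeroing the inode can be sketched as follows. This is only an illustration under stated assumptions: `Cache`, `showCache`, `readCache`, `zeroInode` and `mkIdentifier` are hypothetical stand-ins, and the space-separated format is assumed; none of it is git-annex's actual InodeCache code.

```haskell
import Data.Word (Word64)
import Text.Read (readMaybe)

-- Hypothetical, simplified stand-in for an inode cache:
-- inode number, size in bytes, and mtime in seconds.
data Cache = Cache
    { cacheInode :: Word64
    , cacheSize  :: Integer
    , cacheMTime :: Integer
    } deriving (Eq, Show)

-- Serialize as three space-separated numbers (an assumed format).
showCache :: Cache -> String
showCache (Cache i s m) = unwords [show i, show s, show m]

-- Parse the serialized form back; Nothing on malformed input.
readCache :: String -> Maybe Cache
readCache str = case words str of
    [i, s, m] -> Cache <$> readMaybe i <*> readMaybe s <*> readMaybe m
    _ -> Nothing

-- Keep the same record shape, but store 0 in the inode field.
zeroInode :: Cache -> Cache
zeroInode c = c { cacheInode = 0 }

-- The identifier that would be logged for a file in this sketch.
mkIdentifier :: Cache -> String
mkIdentifier = showCache . zeroInode

main :: IO ()
main = do
    let beforeRemount = Cache 1001 4096 1609459200
        afterRemount  = beforeRemount { cacheInode = 2002 }
    -- Same size and mtime, different inode: the identifiers agree.
    print (mkIdentifier beforeRemount == mkIdentifier afterRemount)
    -- The zeroed form still parses back into a Cache.
    print (readCache (mkIdentifier beforeRemount))
```

Because the zeroed value still round-trips through the parser, a later change could start comparing inodes again without inventing a new log format.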
parent 7ccddd4aea
commit 73df633a62
5 changed files with 34 additions and 11 deletions
@@ -1,6 +1,6 @@
 {- A "remote" that is just a filesystem directory.
  -
- - Copyright 2011-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -354,18 +354,21 @@ listImportableContentsM dir = liftIO $ do
             sz <- getFileSize' f st
             return $ Just (mkImportLocation relf, (cid, sz))
 
--- Make a ContentIdentifier that contains an InodeCache.
---
--- The InodeCache is generated without checking a sentinal file.
--- So in a case when a remount etc causes all the inodes to change,
--- files may appear to be modified when they are not, which will only
--- result in extra work to re-import them.
---
+-- Make a ContentIdentifier that contains the size and mtime of the file.
 -- If the file is not a regular file, this will return Nothing.
+--
+-- The inode is zeroed because often this is used for import from a
+-- FAT filesystem, whose inodes change each time it's mounted, and
+-- including inodes would cause repeated re-hashing of files, and
+-- bloat the git-annex branch with changes to content identifier logs.
+--
+-- This does mean that swaps of two files with the same size and
+-- mtime won't be noticed, nor will modifications to files that
+-- preserve the size and mtime. Both very unlikely so acceptable.
 mkContentIdentifier :: RawFilePath -> FileStatus -> IO (Maybe ContentIdentifier)
 mkContentIdentifier f st =
     fmap (ContentIdentifier . encodeBS . showInodeCache)
-        <$> toInodeCache noTSDelta f st
+        <$> toInodeCache' noTSDelta f st 0
 
 guardSameContentIdentifiers :: a -> ContentIdentifier -> Maybe ContentIdentifier -> a
 guardSameContentIdentifiers cont old new