{- Sqlite database of information about Keys
 -
 - Copyright 2015-2022 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

{-# LANGUAGE CPP #-}
{-# LANGUAGE QuasiQuotes, TypeFamilies, TypeOperators, TemplateHaskell #-}
{-# LANGUAGE OverloadedStrings, GADTs, FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
{-# LANGUAGE DataKinds, FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
#if MIN_VERSION_persistent_template(2,8,0)
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE StandaloneDeriving #-}
#endif

module Database.Keys.SQL where

import Database.Types
import Database.Handle
import Database.Utility
import qualified Database.Queue as H
import Utility.InodeCache
import Git.FilePath

import Database.Persist.Sql hiding (Key)
import Database.Persist.TH
import Control.Monad
import Data.Maybe
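
-- (The Database.Types import is presumably what provides the PersistField
-- instances that let Key, InodeCache, FileSize, EpochTime and SByteString
-- be used as column types in the schema below.)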

-- Note on indexes: KeyFileIndex etc are really uniqueness constraints,
-- which cause sqlite to automatically add indexes. So when adding indexes,
-- have to take care to only add ones that work as uniqueness constraints.
-- (Unfortunately persistent does not support indexes that are not
-- uniqueness constraints; https://github.com/yesodweb/persistent/issues/109)
--
-- To speed up queries for a key, there's KeyFileIndex,
-- which makes there be a covering index for keys.
--
-- FileKeyIndex speeds up queries that include the file, since it makes
-- there be a covering index for files. Note that, despite the name, it is
-- used as a uniqueness constraint now.
share [mkPersist sqlSettings, mkMigrate "migrateKeysDb"] [persistLowerCase|
Associated
  key Key
  file SByteString
  KeyFileIndex key file
  FileKeyIndex file key
Content
  key Key
  inodecache InodeCache
  filesize FileSize
  mtime EpochTime
  KeyInodeCacheIndex key inodecache
  InodeCacheKeyIndex inodecache key
|]
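
-- A sketch of what the schema above generates (assuming persistent's usual
-- template haskell output): record types Associated and Content with field
-- selectors such as associatedKey, associatedFile and contentInodecache,
-- filter constructors such as AssociatedKey and ContentFilesize, and one
-- constructor per uniqueness constraint, eg
--
--   FileKeyIndex :: SByteString -> Key -> Unique Associated
--
-- which is what addAssociatedFile passes to upsertBy below.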

containedTable :: TableName
containedTable = "content"

createTables :: SqlPersistM ()
createTables = void $ runMigrationSilent migrateKeysDb
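
-- (With persistLowerCase, the Content entity is stored in a table named
-- "content"; containedTable exposes that raw table name, presumably for
-- callers that need to refer to the table directly.)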

newtype ReadHandle = ReadHandle H.DbQueue

readDb :: SqlPersistM a -> ReadHandle -> IO a
readDb a (ReadHandle h) = H.queryDbQueue h a

newtype WriteHandle = WriteHandle H.DbQueue

queueDb :: SqlPersistM () -> WriteHandle -> IO ()
queueDb a (WriteHandle h) = H.queueDb h checkcommit a
  where
	-- commit queue after 10000 changes
	checkcommit sz _lastcommittime = pure (sz > 10000)
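
-- So writes are accumulated and committed in batches: queueing eg 100,000
-- changes should result in roughly 10 commits rather than a commit per
-- change.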

-- Insert the associated file.
-- When the file was associated with a different key before,
-- update it to the new key.
addAssociatedFile :: Key -> TopFilePath -> WriteHandle -> IO ()
addAssociatedFile k f = queueDb $
	void $ upsertBy
		(FileKeyIndex af k)
		(Associated k af)
		[AssociatedFile =. af, AssociatedKey =. k]
  where
	af = SByteString (getTopFilePath f)

-- Faster than addAssociatedFile, but only safe to use when the file
-- was not associated with a different key before, as it does not delete
-- any old key.
newAssociatedFile :: Key -> TopFilePath -> WriteHandle -> IO ()
newAssociatedFile k f = queueDb $
	insert_ $ Associated k af
  where
	af = SByteString (getTopFilePath f)

{- Note that the files returned were once associated with the key, but
 - some of them may not be any longer. -}
getAssociatedFiles :: Key -> ReadHandle -> IO [TopFilePath]
getAssociatedFiles k = readDb $ do
	l <- selectList [AssociatedKey ==. k] []
	return $ map (asTopFilePath . (\(SByteString f) -> f) . associatedFile . entityVal) l

{- Gets any keys that are on record as having a particular associated file.
 - (Should be one or none.) -}
getAssociatedKey :: TopFilePath -> ReadHandle -> IO [Key]
getAssociatedKey f = readDb $ do
	l <- selectList [AssociatedFile ==. af] []
	return $ map (associatedKey . entityVal) l
  where
	af = SByteString (getTopFilePath f)

removeAssociatedFile :: Key -> TopFilePath -> WriteHandle -> IO ()
removeAssociatedFile k f = queueDb $
	deleteWhere [AssociatedKey ==. k, AssociatedFile ==. af]
  where
	af = SByteString (getTopFilePath f)
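
-- A hypothetical example, not used by git-annex itself, only sketching how
-- the read and write handles compose: look up everything currently
-- associated with a key, and queue removal of each of those associations.
exampleRemoveAllAssociated :: Key -> ReadHandle -> WriteHandle -> IO ()
exampleRemoveAllAssociated k rh wh = do
	fs <- getAssociatedFiles k rh
	forM_ fs $ \f -> removeAssociatedFile k f wh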

addInodeCaches :: Key -> [InodeCache] -> WriteHandle -> IO ()
addInodeCaches k is = queueDb $
	forM_ is $ \i -> insertUniqueFast $ Content k i
		(inodeCacheToFileSize i)
		(inodeCacheToEpochTime i)
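
-- (insertUniqueFast comes from Database.Utility; judging by the name it is
-- a variant of persistent's insertUnique, so an InodeCache that is already
-- recorded for the key is skipped rather than tripping over the uniqueness
-- constraints above.)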

{- A key may have multiple InodeCaches; one for the annex object, and one
 - for each pointer file that is a copy of it. -}
getInodeCaches :: Key -> ReadHandle -> IO [InodeCache]
getInodeCaches k = readDb $ do
	l <- selectList [ContentKey ==. k] []
	return $ map (contentInodecache . entityVal) l

removeInodeCaches :: Key -> WriteHandle -> IO ()
removeInodeCaches k = queueDb $
	deleteWhere [ContentKey ==. k]

removeInodeCache :: InodeCache -> WriteHandle -> IO ()
removeInodeCache i = queueDb $ deleteWhere
	[ ContentInodecache ==. i
	]

{- Check if the inode is known to be used for an annexed file. -}
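-- When the sentinal file indicates that inode numbers have changed (eg the
-- repository was copied to new storage), an exact InodeCache match cannot
-- be expected, so the fallback query below matches on file size and an
-- mtime range instead.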
isInodeKnown :: InodeCache -> SentinalStatus -> ReadHandle -> IO Bool
isInodeKnown i s = readDb (isJust <$> selectFirst q [])
  where
	q
		| sentinalInodesChanged s =
			-- Note that this select is intentionally not
			-- indexed. Normally, the inodes have not changed,
			-- and it would be unnecessary work to maintain
			-- indexes for the unusual case.
			[ ContentFilesize ==. inodeCacheToFileSize i
			, ContentMtime >=. tmin
			, ContentMtime <=. tmax
			]
		| otherwise = [ContentInodecache ==. i]
	(tmin, tmax) = inodeCacheEpochTimeRange i