make sure all sqlite selects have indexes

Bear in mind that these indexes are really uniqueness constraints,
which just happen to make sqlite generate indexes as well.
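
The point about constraints doubling as indexes can be checked directly
against sqlite. The following is only a sketch, using Python's sqlite3
module rather than persistent; the table is a hypothetical stand-in
loosely modelled on Fscked, and the SQL that persistent actually
generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table loosely modelled on Fscked; names and column types
# are assumptions for illustration, not the schema persistent generates.
conn.execute(
    'CREATE TABLE fscked ('
    ' id INTEGER PRIMARY KEY,'
    ' "key" TEXT NOT NULL,'
    ' CONSTRAINT fscked_key_index UNIQUE ("key"))'
)

# Only a uniqueness constraint was declared, yet sqlite lists an index.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('fscked')")]

# The detail column of the query plan names the index used for the lookup.
plan = " ".join(
    row[3]
    for row in conn.execute(
        'EXPLAIN QUERY PLAN SELECT id FROM fscked WHERE "key" = ?', ("k",)
    )
)

# The constraint still behaves as a constraint: duplicates are rejected.
conn.execute("INSERT INTO fscked (\"key\") VALUES ('k')")
try:
    conn.execute("INSERT INTO fscked (\"key\") VALUES ('k')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(index_names)
print(plan)
print(duplicate_rejected)
```

The same duplicate rejection is why each such index has to be checked
for whether uniqueness really holds for the columns it names.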

In Database.ContentIdentifier, the ContentIndentifiersKeyRemoteCidIndex
is fine as a uniqueness constraint because it covers all columns of the
table. The ContentIndentifiersCidRemoteIndex is also ok because there
can only be one key for a given (cid, uuid) combination.

In Database.Export, the new ExportTreeFileKeyIndex covers the same pair
of columns as the old ExportTreeKeyFileIndex (previously
ExportTreeIndex), in the opposite order. And in Database.Keys.SQL, the
new InodeCacheKeyIndex likewise covers the same pair as the old
KeyInodeCacheIndex.
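
Why a second constraint over the same pair of columns helps can be
sketched as follows. This again uses Python's sqlite3 module and a
hypothetical export_tree table (the names and types are assumptions),
and compares the query plan for a lookup by file with and without the
(file, key) constraint.

```python
import sqlite3


def plan_for_file_lookup(constraints):
    """Return sqlite's query plan for selecting keys by file name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for ExportTree; not the schema persistent generates.
    conn.execute(
        'CREATE TABLE export_tree ('
        ' id INTEGER PRIMARY KEY,'
        ' "key" TEXT NOT NULL,'
        ' file TEXT NOT NULL,'
        + constraints + ')'
    )
    rows = conn.execute(
        'EXPLAIN QUERY PLAN SELECT "key" FROM export_tree WHERE file = ?',
        ("f",),
    )
    # The fourth column of each plan row holds the human-readable detail.
    return " ".join(row[3] for row in rows)


# Only the (key, file) constraint: file is not the leading column.
old_plan = plan_for_file_lookup(' UNIQUE ("key", file)')

# Adding (file, key) gives sqlite an index that leads with file.
new_plan = plan_for_file_lookup(' UNIQUE ("key", file), UNIQUE (file, "key")')

print(old_plan)
print(new_plan)
```

With only the first constraint the plan should be a scan, since file is
not the leading column of the generated index; with both, sqlite can
search on file. The exact plan text varies between sqlite versions.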
Joey Hess 2019-10-30 13:40:29 -04:00
parent 3732f27722
commit 9085a2cfec
5 changed files with 10 additions and 19 deletions

@@ -53,6 +53,8 @@ ContentIdentifiers
 remote UUID
 cid ContentIdentifier
 key Key
+ContentIndentifiersKeyRemoteCidIndex key remote cid
+ContentIndentifiersCidRemoteIndex cid remote
 -- The last git-annex branch tree sha that was used to update
 -- ContentIdentifiers
 AnnexBranch

@@ -76,7 +76,8 @@ ExportedDirectory
 ExportTree
 key Key
 file SFilePath
-ExportTreeIndex key file
+ExportTreeKeyFileIndex key file
+ExportTreeFileKeyIndex file key
 -- The tree stored in ExportTree
 ExportTreeCurrent
 tree SSha
@@ -165,9 +166,6 @@ getExportTree (ExportHandle h _) k = H.queryDbQueue h $ do
 return $ map (mkExportLocation . fromSFilePath . exportTreeFile . entityVal) l

 {- Get keys that might be currently exported to a location.
- -
- - Note that the database does not currently have an index to make this
- - fast.
  -
  - Note that this does not see recently queued changes.
  -}

@@ -40,7 +40,7 @@ data FsckHandle = FsckHandle H.DbQueue UUID
 share [mkPersist sqlSettings, mkMigrate "migrateFsck"] [persistLowerCase|
 Fscked
 key Key
-UniqueKey key
+FsckedKeyIndex key
 |]

 {- The database is removed when starting a new incremental fsck pass.

@@ -52,6 +52,7 @@ Content
 key Key
 inodecache InodeCache
 KeyInodeCacheIndex key inodecache
+InodeCacheKeyIndex inodecache key
 |]

 containedTable :: TableName

@@ -2,24 +2,14 @@ Collection of non-ideal things about git-annex's use of sqlite databases.
 Would be good to improve these sometime, but it would need a migration
 process.

-* Database.Keys.SQL.isInodeKnown seems likely to get very slow
-when there are a lot of unlocked annexed files. It needs
-an index in the database, eg "InodeIndex cache"
-
-It also has to do some really ugly SQL LIKE queries. Probably an index
-would not speed them up. They're only needed when git-annex detects
-inodes are not stable, eg on fat or probably windows. A better database
+* Database.Keys.SQL.isInodeKnown has some really ugly SQL LIKE queries.
+Probably an index would not speed them up. They're only needed when
+git-annex detects inodes are not stable, eg on fat or probably windows.
+A better database
 schema should be able to eliminate the need for those LIKE queries.
 Eg, store the size and allowable mtimes in a separate table that is
 queried when necessary.

-* Database.Export.getExportedKey would be faster if there was an index
-in the database, eg "ExportedIndex file key". This only affects
-the speed of `git annex export`, which is probably swamped by the actual
-upload of the data to the remote.
-
-* There may be other selects elsewhere that are not indexed.
-
 * Database.Types has some suboptimal encodings for Key and InodeCache.
 They are both slow due to being implemented using String
 (which may be fixable w/o changing the DB schema),