make sure all sqlite selects have indexes

Bear in mind that these indexes are really uniqueness constraints
that just happen to also make sqlite generate indexes.

In Database.ContentIdentifier, the ContentIndentifiersKeyRemoteCidIndex
is fine as a uniqueness constraint because it contains all columns of the
table. The ContentIndentifiersCidRemoteIndex is also ok because there
can only be one key for a given (cid, uuid) combination.
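
A minimal sketch of the point above, using Python's sqlite3 module rather
than persistent. The table and column names are an assumed approximation
of the SQL that persistent generates for ContentIdentifiers, not the real
DDL. It shows that each UNIQUE constraint gets an automatic index, and
that the (cid, remote) constraint rejects a second key for the same pair:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical DDL approximating the ContentIdentifiers table; the schema
# that persistent really generates may differ in naming and quoting.
conn.execute("""
    CREATE TABLE content_identifiers (
        id INTEGER PRIMARY KEY,
        remote TEXT NOT NULL,
        cid TEXT NOT NULL,
        key TEXT NOT NULL,
        UNIQUE (key, remote, cid),
        UNIQUE (cid, remote)
    )
""")

# Each UNIQUE constraint is backed by an index that sqlite creates itself.
auto_indexes = sorted(
    row[1] for row in conn.execute("PRAGMA index_list(content_identifiers)")
)
print(auto_indexes)

conn.execute(
    "INSERT INTO content_identifiers (remote, cid, key) VALUES ('u1', 'c1', 'k1')"
)
try:
    # Same (cid, remote) pair with a different key: violates UNIQUE (cid, remote).
    conn.execute(
        "INSERT INTO content_identifiers (remote, cid, key) VALUES ('u1', 'c1', 'k2')"
    )
    second_key_rejected = False
except sqlite3.IntegrityError:
    second_key_rejected = True
print(second_key_rejected)
```

The constraint over all three columns can only reject an exact duplicate
row, which is why it is harmless as a uniqueness constraint.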

In Database.Export, the new ExportTreeFileKeyIndex covers the same pair
of columns as ExportTreeKeyFileIndex (previously named ExportTreeIndex),
in the opposite order, so the uniqueness constraint is unchanged.
Likewise, in Database.Keys.SQL, the new InodeCacheKeyIndex covers the
same pair as the old KeyInodeCacheIndex, in the opposite order.
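
Why the reversed pair helps can be sketched with EXPLAIN QUERY PLAN. This
is an illustration under assumptions: the table below is a hand-written
stand-in for ExportTree, the helper name first_plan_step is made up, and
the exact plan text varies between sqlite versions. With only
UNIQUE (key, file), a lookup by file has to scan; once UNIQUE (file, key)
exists, sqlite can search its index:

```python
import sqlite3


def first_plan_step(unique_constraints, select_sql):
    """Return the detail text of the first EXPLAIN QUERY PLAN row."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the ExportTree table.
    conn.execute(
        "CREATE TABLE export_tree ("
        "id INTEGER PRIMARY KEY, key TEXT NOT NULL, file TEXT NOT NULL, "
        + unique_constraints + ")"
    )
    rows = conn.execute("EXPLAIN QUERY PLAN " + select_sql).fetchall()
    conn.close()
    # The fourth column holds the human-readable description of the step.
    return rows[0][3]


lookup_by_file = "SELECT key FROM export_tree WHERE file = 'foo'"

# Schema before this commit: only the (key, file) constraint.
old_plan = first_plan_step("UNIQUE (key, file)", lookup_by_file)
# Schema after this commit: the reversed pair is added.
new_plan = first_plan_step(
    "UNIQUE (key, file), UNIQUE (file, key)", lookup_by_file
)

print(old_plan)  # a SCAN step
print(new_plan)  # a SEARCH step constrained on file
```

A lookup by key is already served by the (key, file) index, since key is
its leading column.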
Joey Hess 2019-10-30 13:40:29 -04:00
parent 3732f27722
commit 9085a2cfec
GPG key ID: DB12DB0FF05F8F38
5 changed files with 10 additions and 19 deletions

@@ -53,6 +53,8 @@ ContentIdentifiers
   remote UUID
   cid ContentIdentifier
   key Key
+  ContentIndentifiersKeyRemoteCidIndex key remote cid
+  ContentIndentifiersCidRemoteIndex cid remote
 -- The last git-annex branch tree sha that was used to update
 -- ContentIdentifiers
 AnnexBranch

@@ -76,7 +76,8 @@ ExportedDirectory
 ExportTree
   key Key
   file SFilePath
-  ExportTreeIndex key file
+  ExportTreeKeyFileIndex key file
+  ExportTreeFileKeyIndex file key
 -- The tree stored in ExportTree
 ExportTreeCurrent
   tree SSha
@@ -165,9 +166,6 @@ getExportTree (ExportHandle h _) k = H.queryDbQueue h $ do
 	return $ map (mkExportLocation . fromSFilePath . exportTreeFile . entityVal) l
 
 {- Get keys that might be currently exported to a location.
  -
- - Note that the database does not currently have an index to make this
- - fast.
- -
  - Note that this does not see recently queued changes.
  -}

@@ -40,7 +40,7 @@ data FsckHandle = FsckHandle H.DbQueue UUID
 share [mkPersist sqlSettings, mkMigrate "migrateFsck"] [persistLowerCase|
 Fscked
   key Key
-  UniqueKey key
+  FsckedKeyIndex key
 |]
 
 {- The database is removed when starting a new incremental fsck pass.

@@ -52,6 +52,7 @@ Content
   key Key
   inodecache InodeCache
   KeyInodeCacheIndex key inodecache
+  InodeCacheKeyIndex inodecache key
 |]
 
 containedTable :: TableName

@@ -2,24 +2,14 @@ Collection of non-ideal things about git-annex's use of sqlite databases.
Would be good to improve these sometime, but it would need a migration
process.
-* Database.Keys.SQL.isInodeKnown seems likely to get very slow
-  when there are a lot of unlocked annexed files. It needs
-  an index in the database, eg "InodeIndex cache"
-  It also has to do some really ugly SQL LIKE queries. Probably an index
-  would not speed them up. They're only needed when git-annex detects
-  inodes are not stable, eg on fat or probably windows. A better database
+* Database.Keys.SQL.isInodeKnown has some really ugly SQL LIKE queries.
+  Probably an index would not speed them up. They're only needed when
+  git-annex detects inodes are not stable, eg on fat or probably windows.
+  A better database
schema should be able to eliminate the need for those LIKE queries.
Eg, store the size and allowable mtimes in a separate table that is
queried when necessary.
-* Database.Export.getExportedKey would be faster if there was an index
-  in the database, eg "ExportedIndex file key". This only affects
-  the speed of `git annex export`, which is probably swamped by the actual
-  upload of the data to the remote.
* There may be other selects elsewhere that are not indexed.
* Database.Types has some suboptimal encodings for Key and InodeCache.
They are both slow due to being implemented using String
(which may be fixable w/o changing the DB schema),