remove wrong uniqueness constraint from ContentIdentifier db

Fix bug that caused importing from a special remote to repeatedly download
unchanged files when multiple files in the remote have the same content.
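
The effect of the constraint can be modelled in a few lines. This is a sketch only: `Remote`, `Cid`, `Key` and every function below are hypothetical stand-ins rather than git-annex's real types, and it assumes the constraint behaved like uniqueness on (remote, key), with a conflicting insert being rejected.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical stand-ins for illustration; not git-annex's real types.
type Remote = String
type Cid = String
type Key = String

-- Model of the old table: (remote, key) is unique, so a second cid for
-- the same key is rejected and the first one is kept.
type UniqueStore = M.Map (Remote, Key) Cid

recordUnique :: Remote -> Cid -> Key -> UniqueStore -> UniqueStore
recordUnique r c k = M.insertWith (\_new old -> old) (r, k) c

-- Model of the new table: any number of (remote, cid, key) rows.
type MultiStore = S.Set (Remote, Cid, Key)

recordMulti :: Remote -> Cid -> Key -> MultiStore -> MultiStore
recordMulti r c k = S.insert (r, c, k)

-- A file on the remote has to be downloaded when its cid is unknown.
needsImportUnique :: Remote -> Cid -> UniqueStore -> Bool
needsImportUnique r c s =
    c `notElem` [ c' | ((r', _), c') <- M.toList s, r' == r ]

needsImportMulti :: Remote -> Cid -> MultiStore -> Bool
needsImportMulti r c s =
    c `notElem` [ c' | (r', c', _) <- S.toList s, r' == r ]
```

In this model, recording two files with different cids but the same key leaves the second cid unknown in the `UniqueStore`, so that file looks new on every import. The `MultiStore` remembers both.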

Unfortunately, there's really no good way to remove a uniqueness constraint
from a sqlite database. The best that can be done is to make a new table
and copy the data over. But that would require using persistent's
migrations or raw sql, and I don't want to do either.

Instead, a sledgehammer approach: Renamed .git/annex/cid to
.git/annex/cids. When the new database doesn't exist, it will be populated
from the git-annex branch.
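
A rough sketch of that fallback, assuming a caller-supplied `populate` action that stands in for rebuilding the database from the git-annex branch. `openOrPopulate`, `newDbDir` and `demo` are hypothetical names, not functions from git-annex.

```haskell
import System.Directory
    ( createDirectoryIfMissing, doesDirectoryExist
    , getTemporaryDirectory, removePathForcibly )
import System.FilePath ((</>))

-- New database location; the old "cid" directory is never touched.
newDbDir :: FilePath -> FilePath
newDbDir annexDir = annexDir </> "cids"

-- Create and populate the new database only when it does not exist yet.
-- Returns True when it was freshly populated.
openOrPopulate :: FilePath -> (FilePath -> IO ()) -> IO Bool
openOrPopulate annexDir populate = do
    let db = newDbDir annexDir
    exists <- doesDirectoryExist db
    if exists
        then return False
        else do
            createDirectoryIfMissing True db
            populate db  -- stand-in for reading the git-annex branch
            return True

-- Exercise the sketch in a scratch directory that already holds an old db.
-- Returns (populated on first open, populated on second open, old db kept).
demo :: IO (Bool, Bool, Bool)
demo = do
    tmp <- getTemporaryDirectory
    let annexDir = tmp </> "cids-sketch-demo"
    removePathForcibly annexDir
    createDirectoryIfMissing True (annexDir </> "cid")
    first <- openOrPopulate annexDir (\_ -> return ())
    second <- openOrPopulate annexDir (\_ -> return ())
    oldKept <- doesDirectoryExist (annexDir </> "cid")
    removePathForcibly annexDir
    return (first, second, oldKept)
```

Only the first open populates the new directory, and the old `cid` directory is left in place throughout.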

Nothing deletes the old database. Don't want to delete it out from under
some long-running git-annex process that might be using it. It could
eventually be deleted. But this is such a new feature, probably few repos
have the database in any case.
Joey Hess 2019-04-09 19:58:24 -04:00
parent 5ece1408ae
commit 6babb2c73f
4 changed files with 13 additions and 7 deletions

@@ -355,9 +355,13 @@ gitAnnexExportLock u r = gitAnnexExportDbDir u r ++ ".lck"
 gitAnnexExportUpdateLock :: UUID -> Git.Repo -> FilePath
 gitAnnexExportUpdateLock u r = gitAnnexExportDbDir u r ++ ".upl"
-{- Directory containing database used to record remote content ids. -}
+{- Directory containing database used to record remote content ids.
+ -
+ - (This used to be "cid", but a problem with the database caused it to
+ - need to be rebuilt with a new name.)
+ -}
 gitAnnexContentIdentifierDbDir :: Git.Repo -> FilePath
-gitAnnexContentIdentifierDbDir r = gitAnnexDir r </> "cid"
+gitAnnexContentIdentifierDbDir r = gitAnnexDir r </> "cids"
 {- Lock file for writing to the content id database. -}
 gitAnnexContentIdentifierLock :: Git.Repo -> FilePath

@@ -4,6 +4,9 @@ git-annex (7.20190323) UNRELEASED; urgency=medium
     to allow git-annex import of files from an Android device. This can be
     combined with exporttree=yes and git-annex export used to send changes
     back to the Android device.
+  * Fix bug that caused importing from a special remote to repeatedly
+    download unchanged files when multiple files in the remote have the same
+    content.
  -- Joey Hess <id@joeyh.name> Tue, 09 Apr 2019 14:07:53 -0400

@@ -6,7 +6,7 @@
  -}
 {-# LANGUAGE QuasiQuotes, TypeFamilies, TemplateHaskell #-}
-{-# LANGUAGE OverloadedStrings, GADTs, FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings, GADTs, FlexibleContexts, EmptyDataDecls #-}
 {-# LANGUAGE MultiParamTypeClasses, GeneralizedNewtypeDeriving #-}
 {-# LANGUAGE RankNTypes #-}
@@ -51,9 +51,6 @@ ContentIdentifiers
   remote UUID
   cid ContentIdentifier
   key IKey
-  ContentIdentifiersIndexRemoteKey remote key
-  ContentIdentifiersIndexRemoteCID remote cid
-  UniqueRemoteCidKey remote cid key
 -- The last git-annex branch tree sha that was used to update
 -- ContentIdentifiers
 AnnexBranch
@@ -93,7 +90,7 @@ flushDbQueue (ContentIdentifierHandle h) = H.flushDbQueue h
 -- Be sure to also update the git-annex branch when using this.
 recordContentIdentifier :: ContentIdentifierHandle -> UUID -> ContentIdentifier -> Key -> IO ()
 recordContentIdentifier h u cid k = queueDb h $ do
-	void $ insertUnique $ ContentIdentifiers u cid (toIKey k)
+	void $ insert_ $ ContentIdentifiers u cid (toIKey k)
 getContentIdentifiers :: ContentIdentifierHandle -> UUID -> Key -> IO [ContentIdentifier]
 getContentIdentifiers (ContentIdentifierHandle h) u k = H.queryDbQueue h $ do

@@ -7,3 +7,5 @@ unncessarly importing in that case. --[[Joey]]
 Seems that the ContentIdentifier database can actually only store one cid
 for a given key at a time, not multiples needed by this. This needs a
 change to the db schema to fix, unfortunately.
+> [[done]] --[[Joey]]