{- Remote content identifier logs.
 -
 - Copyright 2019 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Logs.ContentIdentifier (
    module X,
    recordContentIdentifier,
    getContentIdentifiers,
) where

import Annex.Common
import Logs
import Logs.MapLog
import Types.Import
{- add RemoteStateHandle
 -
 - This solves the problem of sameas remotes trampling over per-remote
 - state. Used for:
 -
 - * per-remote state, of course
 - * per-remote metadata, also of course
 - * per-remote content identifiers, because two remote implementations
 -   could in theory generate the same content identifier for two different
 -   pieces of content
 -
 - While chunk logs are per-remote data, they don't use this, because the
 - number and size of chunks stored is a common property across sameas
 - remotes.
 -
 - External special remote had a complication, where it was theoretically
 - possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or
 - EXPORTSUPPORTED. Since the uuid of the remote is typically generated in
 - Remote.setup, it would only be possible to pass a Maybe
 - RemoteStateHandle into it, and it would otherwise have to construct its
 - own. Rather than go that route, I decided to send an ERROR in this case.
 - It seems unlikely that any existing external special remote will be
 - affected. They would have to make up a git-annex key, and set state for
 - some reason during INITREMOTE. I can imagine such a hack, but it doesn't
 - seem worth complicating the code in such an ugly way to support it.
 -
 - Unfortunately, both TestRemote and Annex.Import needed the Remote
 - to have a new field added that holds its RemoteStateHandle.
 -
 - 2019-10-14 16:33:27 +00:00
 -}
import Types.RemoteState
import qualified Annex.Branch
import Logs.ContentIdentifier.Pure as X
import qualified Annex

import qualified Data.Map as M
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NonEmpty

-- | Records a remote's content identifier and the key that it corresponds to.
--
-- A remote may use multiple content identifiers for the same key over time,
-- so ones that were recorded before are preserved.
recordContentIdentifier :: RemoteStateHandle -> ContentIdentifier -> Key -> Annex ()
recordContentIdentifier (RemoteStateHandle u) cid k = do
    c <- currentVectorClock
    config <- Annex.getGitConfig
    Annex.Branch.maybeChange (remoteContentIdentifierLogFile config k) $
        addcid c . parseLog
  where
    addcid c v
        | cid `elem` l = Nothing -- no change needed
        | otherwise = Just $ buildLog $
            changeMapLog c u (cid :| l) v
      where
        m = simpleMap v
        l = contentIdentifierList (M.lookup u m)

-- | Get all known content identifiers for a key.
getContentIdentifiers :: Key -> Annex [(RemoteStateHandle, [ContentIdentifier])]
getContentIdentifiers k = do
    config <- Annex.getGitConfig
    map (\(u, l) -> (RemoteStateHandle u, NonEmpty.toList l) )
        . M.toList . simpleMap . parseLog
        <$> Annex.Branch.get (remoteContentIdentifierLogFile config k)
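
{- Usage sketch (illustrative only, not part of the module's API).
 -
 - This shows how the two exported functions might be combined to record
 - a content identifier and then read back the ones known for one remote.
 - The name `exampleRoundTrip` and its arguments are hypothetical; the
 - caller is assumed to already have a RemoteStateHandle, a
 - ContentIdentifier, and a Key in hand, and UUID is assumed to support
 - equality comparison.
 -
 -   exampleRoundTrip :: RemoteStateHandle -> ContentIdentifier -> Key -> Annex [ContentIdentifier]
 -   exampleRoundTrip rsh@(RemoteStateHandle u) cid k = do
 -       -- No change is made to the log if cid was already recorded.
 -       recordContentIdentifier rsh cid k
 -       known <- getContentIdentifiers k
 -       -- Keep only the identifiers recorded for this remote's state handle.
 -       return $ concat [ cids | (RemoteStateHandle u', cids) <- known, u' == u ]
 -}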