add equivalent key log for VURL keys

When downloading a VURL from the web, make sure that the equivalent key
log is populated.

Unfortunately, this does not hash the content while it's being
downloaded from the web. There is not an interface in Backend currently
for incremental hash generation, only for incremental verification of an
existing hash. So this might add a noticeable delay, and it has to show
a "(checksum...)" message. This could stand to be improved.

But, that separate hashing step only has to happen on the first download
of new content from the web. Once the hash is known, the VURL key can have
its hash verified incrementally while downloading except when the
content in the web has changed. (Doesn't happen yet because
verifyKeyContentIncrementally is not implemented yet for VURL keys.)

Note that the equivalent key log file is formatted as a presence log.
This adds a tiny bit of overhead (eg "1 ") per line over just listing the
urls. The reason I chose to use that format is it seems possible that
there will need to be a way to remove an equivalent key at some point in
the future. I don't know why that would be necessary, but it seemed wise
to allow for the possibility.

Downloads of VURL keys from other special remotes that claim urls,
like bittorrent for example, do not populate the equivalent key log.
So for now, no checksum verification will be done for those.

Sponsored-by: Nicholas Golder-Manning on Patreon
2024-02-29 19:41:57 +00:00

{- git-annex log file names
 -
 - Copyright 2013-2024 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

{-# LANGUAGE OverloadedStrings #-}

module Logs where

import Annex.Common
import Annex.DirHashes

import qualified Data.ByteString as S
import qualified System.FilePath.ByteString as P

add remote state logs

This allows a remote to store a piece of arbitrary state associated with a
key. This is needed to support Tahoe, where the file-cap is calculated from
the data stored in it, and used to retrieve a key later. Glacier also would
be much improved by using this.

GETSTATE and SETSTATE are added to the external special remote protocol.

Note that the state is left as-is even when a key is removed from a remote.
It's up to the remote to decide when it wants to clear the state.

The remote state log, $KEY.log.rmt, is a UUID-based log. However,
rather than using the old UUID-based log format, I created a new variant
of that format. The new variant is more space efficient (since it lacks the
"timestamp=" hack), easier to parse (and the parser doesn't mess with
whitespace in the value), and avoids compatibility cruft in the old one.
This seemed worth cleaning up for these new files, since there could be a
lot of them, while before UUID-based logs were only used for a few log
files at the top of the git-annex branch. The transition code has also
been updated to handle these new UUID-based logs.

This commit was sponsored by Daniel Hofer.
2014-01-03 20:35:57 +00:00

{- There are several varieties of log file formats. -}
data LogVariety
    = OldUUIDBasedLog
    | NewUUIDBasedLog
    | ChunkLog Key
    | LocationLog Key
    | UrlLog Key
    | RemoteMetaDataLog
    | OtherLog
    deriving (Show)

{- Converts a path from the git-annex branch into one of the varieties
 - of logs used by git-annex, if it's a known path. -}
getLogVariety :: GitConfig -> RawFilePath -> Maybe LogVariety
getLogVariety config f
    | f `elem` topLevelOldUUIDBasedLogs = Just OldUUIDBasedLog
    | f `elem` topLevelNewUUIDBasedLogs = Just NewUUIDBasedLog
    | isRemoteStateLog f = Just NewUUIDBasedLog
    | isRemoteContentIdentifierLog f = Just NewUUIDBasedLog
    | isRemoteMetaDataLog f = Just RemoteMetaDataLog
    | isMetaDataLog f
        || f `elem` otherTopLevelLogs
        || isEquivilantKeyLog f = Just OtherLog
    | otherwise = (LocationLog <$> locationLogFileKey config f)
        <|> (ChunkLog <$> extLogFileKey chunkLogExt f)
        <|> (UrlLog <$> urlLogFileKey f)

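The suffix-based part of this classification can be tried in isolation. The following is a minimal sketch, not the module's own code: `Variety`, `suffixTable` and `classify` are hypothetical names, the key payloads are dropped, and the example paths are made up. Only the extensions are taken from this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B
import Data.Maybe (listToMaybe)

-- Hypothetical, simplified stand-in for LogVariety (no Key payloads).
data Variety = NewUUIDBased | RemoteMetaData | Other | Chunk | Url
    deriving (Show, Eq)

-- Extension table; the extensions are the ones defined in this module.
suffixTable :: [(B.ByteString, Variety)]
suffixTable =
    [ (".log.rmt", NewUUIDBased)
    , (".log.cid", NewUUIDBased)
    , (".log.rmet", RemoteMetaData)
    , (".log.met", Other)
    , (".log.ek", Other)
    , (".log.cnk", Chunk)
    , (".log.web", Url)
    ]

-- The first matching suffix wins, like guards tried from top to bottom.
classify :: B.ByteString -> Maybe Variety
classify f = listToMaybe [ v | (ext, v) <- suffixTable, ext `B.isSuffixOf` f ]

main :: IO ()
main = mapM_ (print . classify)
    [ "abc/def/KEY.log.rmt"  -- prints: Just NewUUIDBased
    , "abc/def/KEY.log.web"  -- prints: Just Url
    , "README"               -- prints: Nothing
    ]
```

Top-level logs such as `uuid.log` and the depth-checked location logs are deliberately left out of this sketch, since they are matched by exact name or by path depth rather than by a distinctive suffix.
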
cache one more log file for metadata

My worry was that a preferred content expression that matches on metadata
would have removed the location log from cache, causing an expensive
re-read when a Seek action later checked the location log.
Especially when the --all optimisation in the previous commit
pre-cached the location log.

This also means that the --all optimisation could cache the metadata log
too, if it wanted to, but that is not currently done.

The cache is a list, with the most recently accessed file first. That
optimises it for the common case of reading the same file twice, eg a
get, examine, followed by set reads it twice. And sync --content reads the
location log 3 times in a row commonly.

But, as a list, it should not be made to be too long. I thought about
expanding it to 5 items, but that seemed unlikely to be a win commonly
enough to outweigh the extra time spent checking the cache.

Clearly there could be some further benchmarking and tuning here.
2020-07-07 18:18:55 +00:00

{- Typical number of log files that may be read while processing a single
 - key. This is used to size a cache.
 -
 - The location log is generally read, and the metadata log is read when
 - matching a preferred content expression that matches on metadata,
 - or when using metadata options.
 -
 - When using a remote, the url log, chunk log, remote state log, remote
 - metadata log, and remote content identifier log might each be used,
 - but probably at most 3 out of the 5. However, caching too much slows
 - down all operations because the cache is a linear list, so the cache
 - is not currently sized to include these.
 -
 - The result is that when seeking for files to operate on,
 - the location log will stay in the cache if the metadata log is also
 - read.
 -}
logFilesToCache :: Int
logFilesToCache = 2

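To illustrate why a short, most-recently-used list is enough here, the following is a minimal sketch of such a cache. It is not git-annex's actual cache code: `Cache`, `lookupCache`, `insertCache` and `cacheSize` are hypothetical names, and the file names in `main` are placeholders.

```haskell
module Main where

-- Maximum number of entries kept; mirrors logFilesToCache = 2.
cacheSize :: Int
cacheSize = 2

-- A cache is an association list, most recently used entry first.
type Cache k v = [(k, v)]

-- Look up a key; on a hit, return the value and move the entry to the front.
lookupCache :: Eq k => k -> Cache k v -> Maybe (v, Cache k v)
lookupCache k c = case lookup k c of
    Nothing -> Nothing
    Just v -> Just (v, (k, v) : filter ((/= k) . fst) c)

-- Insert at the front and drop anything beyond cacheSize.
insertCache :: Eq k => k -> v -> Cache k v -> Cache k v
insertCache k v c = take cacheSize ((k, v) : filter ((/= k) . fst) c)

main :: IO ()
main = do
    let c1 = insertCache "location.log" "loc" []
        c2 = insertCache "metadata.log" "meta" c1
        c3 = insertCache "url.log" "url" c2
    print (map fst c2)  -- both the location and metadata entries fit
    print (map fst c3)  -- a third file evicts the oldest entry
    print (fmap fst (lookupCache "location.log" c3))
```

With a size of 2, reading the metadata log after the location log leaves both cached, which is the situation the comment above describes; a third file pushes the oldest entry out. Every lookup walks the list, so a larger size makes each miss more expensive.
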
{- All the log files that might contain information about a key. -}
keyLogFiles :: GitConfig -> Key -> [RawFilePath]
keyLogFiles config k =
    [ locationLogFile config k
    , urlLogFile config k
    , remoteStateLogFile config k
    , metaDataLogFile config k
    , remoteMetaDataLogFile config k
    , remoteContentIdentifierLogFile config k
    , chunkLogFile config k
    , equivilantKeysLogFile config k
    ] ++ oldurlLogs config k

{- All uuid-based logs stored in the top of the git-annex branch. -}
topLevelUUIDBasedLogs :: [RawFilePath]
topLevelUUIDBasedLogs = topLevelNewUUIDBasedLogs ++ topLevelOldUUIDBasedLogs

{- All the old-format uuid-based logs stored in the top of the git-annex branch. -}
topLevelOldUUIDBasedLogs :: [RawFilePath]
topLevelOldUUIDBasedLogs =
    [ uuidLog
    , remoteLog
    , trustLog
    , groupLog
    , preferredContentLog
    , requiredContentLog
    , scheduleLog
    , activityLog
    , differenceLog
    , multicastLog
    ]

{- All the new-format uuid-based logs stored in the top of the git-annex branch. -}
topLevelNewUUIDBasedLogs :: [RawFilePath]
topLevelNewUUIDBasedLogs =
    [ exportLog
    ]

{- Other top-level logs. -}
otherTopLevelLogs :: [RawFilePath]
otherTopLevelLogs =
    [ numcopiesLog
    , mincopiesLog
    , configLog
    , groupPreferredContentLog
    ]

uuidLog :: RawFilePath
uuidLog = "uuid.log"

numcopiesLog :: RawFilePath
numcopiesLog = "numcopies.log"

mincopiesLog :: RawFilePath
mincopiesLog = "mincopies.log"

configLog :: RawFilePath
configLog = "config.log"

remoteLog :: RawFilePath
remoteLog = "remote.log"

trustLog :: RawFilePath
trustLog = "trust.log"

groupLog :: RawFilePath
groupLog = "group.log"

preferredContentLog :: RawFilePath
preferredContentLog = "preferred-content.log"

requiredContentLog :: RawFilePath
requiredContentLog = "required-content.log"

groupPreferredContentLog :: RawFilePath
groupPreferredContentLog = "group-preferred-content.log"

scheduleLog :: RawFilePath
scheduleLog = "schedule.log"

activityLog :: RawFilePath
activityLog = "activity.log"

differenceLog :: RawFilePath
differenceLog = "difference.log"

multicastLog :: RawFilePath
multicastLog = "multicast.log"

exportLog :: RawFilePath
exportLog = "export.log"

{- This is not a log file, it's where exported treeishes get grafted into
 - the git-annex branch. -}
exportTreeGraftPoint :: RawFilePath
exportTreeGraftPoint = "export.tree"

{- This is not a log file, it's where migration treeishes get grafted into
 - the git-annex branch. -}
migrationTreeGraftPoint :: RawFilePath
migrationTreeGraftPoint = "migrate.tree"

{- The pathname of the location log file for a given key. -}
locationLogFile :: GitConfig -> Key -> RawFilePath
locationLogFile config key =
    branchHashDir config key P.</> keyFile key <> ".log"

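The unparenthesized form above relies on operator fixity: `(<>)` is `infixr 6` and `(</>)` is `infixr 5`, so the extension is appended to the file name before the directory is joined on. A minimal sketch using `String` paths from the `filepath` package instead of `RawFilePath`; `logPath`, the hash directory and the key name are hypothetical placeholders, not real git-annex values.

```haskell
module Main where

import System.FilePath.Posix ((</>))

-- Hypothetical helper: join a hash directory, a serialized key name and
-- a log extension. Parses as hashDir </> (keyName <> ext).
logPath :: FilePath -> String -> String -> FilePath
logPath hashDir keyName ext = hashDir </> keyName <> ext

main :: IO ()
main = do
    -- Placeholder values, assumed for illustration only.
    putStrLn (logPath "abc/def" "KEY" ".log")
    -- Explicit parentheses, as used by other functions in this module,
    -- give the same result.
    putStrLn (("abc/def" </> "KEY") <> ".log.rmt")
```

Both spellings produce the same path, so the parenthesized variants elsewhere in the file are a readability choice rather than a behavioural difference.
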
{- The filename of the url log for a given key. -}
urlLogFile :: GitConfig -> Key -> RawFilePath
urlLogFile config key =
    branchHashDir config key P.</> keyFile key <> urlLogExt

{- Old versions stored the urls elsewhere. -}
oldurlLogs :: GitConfig -> Key -> [RawFilePath]
oldurlLogs config key =
    [ "remote/web" P.</> hdir P.</> serializeKey' key <> ".log"
    , "remote/web" P.</> hdir P.</> keyFile key <> ".log"
    ]
  where
    hdir = branchHashDir config key

urlLogExt :: S.ByteString
urlLogExt = ".log.web"

{- Does not work on oldurlLogs. -}
isUrlLog :: RawFilePath -> Bool
isUrlLog file = urlLogExt `S.isSuffixOf` file

{- The filename of the remote state log for a given key. -}
remoteStateLogFile :: GitConfig -> Key -> RawFilePath
remoteStateLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> remoteStateLogExt

remoteStateLogExt :: S.ByteString
remoteStateLogExt = ".log.rmt"

isRemoteStateLog :: RawFilePath -> Bool
isRemoteStateLog path = remoteStateLogExt `S.isSuffixOf` path

{- The filename of the chunk log for a given key. -}
chunkLogFile :: GitConfig -> Key -> RawFilePath
chunkLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> chunkLogExt

chunkLogExt :: S.ByteString
chunkLogExt = ".log.cnk"

{- The filename of the equivalent keys log for a given key. -}
equivilantKeysLogFile :: GitConfig -> Key -> RawFilePath
equivilantKeysLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> equivilantKeyLogExt

equivilantKeyLogExt :: S.ByteString
equivilantKeyLogExt = ".log.ek"

isEquivilantKeyLog :: RawFilePath -> Bool
isEquivilantKeyLog path = equivilantKeyLogExt `S.isSuffixOf` path

{- The filename of the metadata log for a given key. -}
metaDataLogFile :: GitConfig -> Key -> RawFilePath
metaDataLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> metaDataLogExt

metaDataLogExt :: S.ByteString
metaDataLogExt = ".log.met"

isMetaDataLog :: RawFilePath -> Bool
isMetaDataLog path = metaDataLogExt `S.isSuffixOf` path

{- The filename of the remote metadata log for a given key. -}
remoteMetaDataLogFile :: GitConfig -> Key -> RawFilePath
remoteMetaDataLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> remoteMetaDataLogExt

remoteMetaDataLogExt :: S.ByteString
remoteMetaDataLogExt = ".log.rmet"

isRemoteMetaDataLog :: RawFilePath -> Bool
isRemoteMetaDataLog path = remoteMetaDataLogExt `S.isSuffixOf` path

{- The filename of the remote content identifier log for a given key. -}
remoteContentIdentifierLogFile :: GitConfig -> Key -> RawFilePath
remoteContentIdentifierLogFile config key =
    (branchHashDir config key P.</> keyFile key)
        <> remoteContentIdentifierExt

remoteContentIdentifierExt :: S.ByteString
remoteContentIdentifierExt = ".log.cid"

isRemoteContentIdentifierLog :: RawFilePath -> Bool
isRemoteContentIdentifierLog path = remoteContentIdentifierExt `S.isSuffixOf` path

{- From an extension and a log filename, get the key that it's a log for. -}
extLogFileKey :: S.ByteString -> RawFilePath -> Maybe Key
extLogFileKey expectedext path
    | ext == expectedext = fileKey base
    | otherwise = Nothing
  where
    file = P.takeFileName path
    (base, ext) = S.splitAt (S.length file - extlen) file
    extlen = S.length expectedext

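The splitting step in `extLogFileKey` can be exercised on its own. The sketch below is a simplified stand-in under stated assumptions: `stripLogExt` is a hypothetical name, it returns the raw base name instead of decoding it into a `Key` with `fileKey`, and it finds the file name with `breakEnd` rather than `P.takeFileName`.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B

-- Strip an expected extension from the file name part of a path.
-- Returns the base name only when the file name ends with that extension.
stripLogExt :: B.ByteString -> B.ByteString -> Maybe B.ByteString
stripLogExt expectedext path
    | ext == expectedext = Just base
    | otherwise = Nothing
  where
    -- Everything after the last '/', or the whole path if there is none.
    file = snd (B.breakEnd (== '/') path)
    -- Split so that ext has the same length as the expected extension.
    (base, ext) = B.splitAt (B.length file - B.length expectedext) file

main :: IO ()
main = do
    print (stripLogExt ".log.cnk" "abc/def/KEY.log.cnk")  -- Just "KEY"
    print (stripLogExt ".log.cnk" "abc/def/KEY.log.web")  -- Nothing
    print (stripLogExt ".log" "uuid.log")                 -- Just "uuid"
```

The last call shows that a bare `.log` match also accepts top-level files such as `uuid.log`, which is why the suffix test alone is not enough to identify a location log.
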
{- Converts a url log file into a key.
 - (Does not work on oldurlLogs.) -}
urlLogFileKey :: RawFilePath -> Maybe Key
urlLogFileKey = extLogFileKey urlLogExt

{- Converts a pathname into a key if it's a location log. -}
locationLogFileKey :: GitConfig -> RawFilePath -> Maybe Key
locationLogFileKey config path
    | length (splitDirectories (fromRawFilePath path)) /= locationLogFileDepth config = Nothing
    | otherwise = extLogFileKey ".log" path

{- Depth of location log files within the git-annex branch.
 -
 - Normally they are xx/yy/key.log so depth 3.
 - The same extension is also used for other logs that
 - are not location logs. -}
locationLogFileDepth :: GitConfig -> Int
locationLogFileDepth config = hashlevels + 1
  where
    HashLevels hashlevels = branchHashLevels config

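The depth test can be sketched without a `GitConfig`. This is an illustration only: `hashLevels` is fixed at 2 here as an assumption (the real value comes from `branchHashLevels config`), `hasLocationLogDepth` is a hypothetical name, and the paths are placeholders.

```haskell
module Main where

import System.FilePath.Posix (splitDirectories)

-- Assumed for illustration: two levels of hash directories (xx/yy/).
hashLevels :: Int
hashLevels = 2

-- A location log sits exactly hashLevels + 1 path components deep.
hasLocationLogDepth :: FilePath -> Bool
hasLocationLogDepth p = length (splitDirectories p) == hashLevels + 1

main :: IO ()
main = do
    print (hasLocationLogDepth "abc/def/KEY.log")             -- True
    print (hasLocationLogDepth "uuid.log")                    -- False
    print (hasLocationLogDepth "remote/web/abc/def/KEY.log")  -- False
```

Checking the depth first filters out top-level logs and the old `remote/web` url logs, which also end in `.log` but are not location logs.
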