Future proof activity log parsing
When the log has an activity that is not known, eg added by a future
version of git-annex, it used to be treated as no activity at all, which
would make git-annex expire think it should expire the repository,
despite it having some kind of recent activity.

Hopefully there will be no reason to add a new activity until enough
time has passed that this commit is in use everywhere.

Sponsored-by: Jake Vosloo on Patreon
parent 372ace599a
commit 78da00c7a6
6 changed files with 89 additions and 12 deletions
@@ -29,6 +29,7 @@ git-annex (8.20210429) UNRELEASED; urgency=medium
     that creates the git-annex branch.
   * Added annex.adviceNoSshCaching config.
   * Added --size-limit option.
+  * Future proof activity log parsing.
 
  -- Joey Hess <id@joeyh.name> Mon, 03 May 2021 10:33:10 -0400
 
@@ -111,6 +111,6 @@ parseExpire ps = do
 parseActivity :: MonadFail m => String -> m Activity
 parseActivity s = case readish s of
 	Nothing -> Fail.fail $ "Unknown activity. Choose from: " ++
-		unwords (map show [minBound..maxBound :: Activity])
+		unwords (map show allActivities)
 	Just v -> return v
 
@@ -1,6 +1,6 @@
 {- git-annex activity log
  -
- - Copyright 2015-2019 Joey Hess <id@joeyh.name>
+ - Copyright 2015-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -8,6 +8,7 @@
 module Logs.Activity (
 	Log,
 	Activity(..),
+	allActivities,
 	recordActivity,
 	lastActivities,
 ) where
@@ -23,30 +24,38 @@ import Data.ByteString.Builder
 
 data Activity
 	= Fsck
-	deriving (Eq, Read, Show, Enum, Bounded)
+	-- Allow for unknown activities to be added later.
+	| UnknownActivity S.ByteString
+	deriving (Eq, Read, Show)
 
+allActivities :: [Activity]
+allActivities = [Fsck]
+
+-- Record an activity. This takes the place of previously recorded activity
+-- for the UUID.
 recordActivity :: Activity -> UUID -> Annex ()
 recordActivity act uuid = do
 	c <- currentVectorClock
 	Annex.Branch.change (Annex.Branch.RegardingUUID [uuid]) activityLog $
 		buildLogOld buildActivity
-			. changeLog c uuid (Right act)
+			. changeLog c uuid act
 			. parseLogOld parseActivity
 
+-- Most recent activity for each UUID.
 lastActivities :: Maybe Activity -> Annex (Log Activity)
 lastActivities wantact = parseLogOld (onlywanted =<< parseActivity)
 	<$> Annex.Branch.get activityLog
   where
-	onlywanted (Right a) | wanted a = pure a
-	onlywanted _ = fail "unwanted activity"
+	onlywanted a
+		| wanted a = pure a
+		| otherwise = fail "unwanted activity"
 	wanted a = maybe True (a ==) wantact
 
-buildActivity :: Either S.ByteString Activity -> Builder
-buildActivity (Right a) = byteString $ encodeBS $ show a
-buildActivity (Left b) = byteString b
+buildActivity :: Activity -> Builder
+buildActivity (UnknownActivity b) = byteString b
+buildActivity a = byteString $ encodeBS $ show a
 
--- Allow for unknown activities to be added later by preserving them.
-parseActivity :: A.Parser (Either S.ByteString Activity)
+parseActivity :: A.Parser Activity
 parseActivity = go <$> A.takeByteString
   where
-	go b = maybe (Left b) Right $ readish $ decodeBS b
+	go b = fromMaybe (UnknownActivity b) (readish $ decodeBS b)
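The effect of the Logs.Activity change can be illustrated outside of git-annex. The following is only a minimal sketch, not the project's code: it works on plain ByteStrings instead of the attoparsec parser and Builder used in the diff, and it assumes `readMaybe` from `Text.Read` as a stand-in for git-annex's `readish` helper.

```haskell
import qualified Data.ByteString.Char8 as S8
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Mirrors the shape of the Activity type after this commit.
data Activity
    = Fsck
    | UnknownActivity S8.ByteString
    deriving (Eq, Read, Show)

-- Simplified stand-in for the log parser: a name that reads as a known
-- constructor is used as such, anything else is kept verbatim.
-- (readMaybe is an assumption here; the real code uses readish.)
parseActivity :: S8.ByteString -> Activity
parseActivity b = fromMaybe (UnknownActivity b) (readMaybe (S8.unpack b))

-- Simplified stand-in for the log serializer: unknown activities are
-- written back byte-for-byte as they were read.
buildActivity :: Activity -> S8.ByteString
buildActivity (UnknownActivity b) = b
buildActivity a = S8.pack (show a)
```

With these definitions, `buildActivity (parseActivity (S8.pack "FsckAll"))` gives back `"FsckAll"`, so an activity name written by a newer version survives a parse and rewrite by a version that does not know it.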
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-06-14T17:14:44Z"
+ content="""
+You can query for repositories that have not been fscked
+for some amount of time:
+
+	git annex expire 10d --no-act --activity=Fsck
+
+From there, it's a simple script to set the unfscked ones to untrusted, or
+whatever.
+
+	| grep '^expire' | awk '{print $2}' | xargs git-annex untrust
+
+I suppose `git-annex expire` could have an option added, like `--untrust`
+to specify *how* to expire, rather than the default of marking the repo
+dead.
+
+I suppose you'd want a way to also go the other way, to stop untrusting a
+repo once it's been fscked.. There is not currently a way to do that.
+
+Note that a fsck that is interrupted does not count as a fsck activity,
+and it's not keeping track of what files were fscked. That would bloat the
+git-annex branch. On the other hand, if you `git annex fsck onefile`
+that counts as a fsck activity, even though other files in the repo didn't get
+fscked. So you would have to limit the ways you use fsck to ones that
+generate the activity you want, perhaps to `git annex fsck --all`.
+
+Perhaps fsck should also have a way to control whether it records an
+activity or not..
+"""]]
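The text-processing half of the pipeline in comment 1 (`grep '^expire' | awk '{print $2}'`) could also be written in Haskell. This is a rough sketch that only mirrors the grep and awk steps; the function name `expiredRepos` is hypothetical, and the output format of `git annex expire --no-act` is not shown in this commit, so nothing here should be read as a description of it.

```haskell
import Data.List (isPrefixOf)
import Data.Maybe (mapMaybe)

-- Keep lines that start with "expire" (like grep '^expire') and return
-- their second whitespace-separated field (like awk '{print $2}').
-- Unlike awk, lines without a second field are dropped rather than
-- yielding an empty string.
expiredRepos :: String -> [String]
expiredRepos = mapMaybe secondField . filter ("expire" `isPrefixOf`) . lines
  where
    secondField l = case words l of
        (_ : repo : _) -> Just repo
        _ -> Nothing
```

The resulting list is what the comment's pipeline would hand to `xargs git-annex untrust`.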
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-06-14T17:29:29Z"
+ content="""
+What if `git annex fsck --all` recorded an additional activity, eg FsckAll.
+Then there could be a command, or a config that untrusts repos that do not
+have a FsckAll activity that happened recently enough.
+
+A git config would be simplest, eg:
+
+	git config annex.untrustLastFscked 10d
+"""]]
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-06-14T17:56:23Z"
+ content="""
+Tried to implement this, but ran into a problem adding FsckAll:
+If it only logs FsckAll and not also Fsck, then old git-annex expire
+will see the FsckAll and not understand it, and treats it as no activity,
+so expires. (I did fix git-annex now so an unknown activity is not treated
+as no activity.)
+
+And, the way recordActivity is implemented, it
+removes previous activities, and adds the current activity. So a FsckAll
+followed by a Fsck would remove the FsckAll activity.
+
+That could be fixed, and both be logged, but old git-annex would probably
+not be able to parse the result. And if old git-annex is then used to do a
+fsck, it would log Fsck and remove the previously added FsckAll.
+
+So, it seems this will need to use some log other than activity.log
+to keep track of fsck --all.
+"""]]
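The replacement behaviour described in comment 3 can be modelled with a map that holds one entry per repository. This is a deliberately simplified, hypothetical model: the `UUID` and `ActivityLog` types below are plain stand-ins, activity names are strings, and the vector clock that the real log stores is left out.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-ins for the real types.
type UUID = String
type ActivityLog = M.Map UUID String

-- Recording an activity replaces whatever was stored for that UUID,
-- which is the behaviour comment 3 attributes to recordActivity.
recordActivity :: String -> UUID -> ActivityLog -> ActivityLog
recordActivity act uuid = M.insert uuid act
```

In this model, `recordActivity "Fsck" "u1" (recordActivity "FsckAll" "u1" M.empty)` leaves only `"Fsck"` for `"u1"`, which is the loss of FsckAll that the comment gives as a reason to track `fsck --all` in a separate log.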