Future proof activity log parsing

When the log has an activity that is not known, eg added by a future version of git-annex, it used to be treated as no activity at all, which would make git-annex expire think it should expire the repository, despite it having some kind of recent activity.

Hopefully there will be no reason to add a new activity until enough time has passed that this commit is in use everywhere.

Sponsored-by: Jake Vosloo on Patreon
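
The behaviour described in the commit message can be sketched in a small standalone form. This is only an illustration under simplifying assumptions, not the code from the commit: it uses `String` where the real type stores a `ByteString`, and `knownActivity` and the two-argument `wanted` are hypothetical helpers introduced here.

```haskell
import Data.Maybe (fromMaybe)

-- Simplified stand-in for the Activity type. The real type stores the
-- unknown name as a ByteString; String is an assumption made for brevity.
data Activity
    = Fsck
    | UnknownActivity String
    deriving (Eq, Show)

-- Hypothetical helper: recognize only the activities this version knows.
knownActivity :: String -> Maybe Activity
knownActivity "Fsck" = Just Fsck
knownActivity _ = Nothing

-- Total parser: an unrecognized name is preserved instead of rejected.
parseActivity :: String -> Activity
parseActivity s = fromMaybe (UnknownActivity s) (knownActivity s)

-- Serialize back to the log, writing unknown names out unchanged.
buildActivity :: Activity -> String
buildActivity (UnknownActivity s) = s
buildActivity a = show a

-- Hypothetical filter: with no requested activity, anything counts.
wanted :: Maybe Activity -> Activity -> Bool
wanted wantact a = maybe True (a ==) wantact
```

With this shape, `wanted Nothing (parseActivity "FsckAll")` is `True`, so an activity written by a newer version still counts as activity, while a query for a specific activity such as `wanted (Just Fsck)` does not match it.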
parent 372ace599a
commit 78da00c7a6
6 changed files with 89 additions and 12 deletions
@@ -29,6 +29,7 @@ git-annex (8.20210429) UNRELEASED; urgency=medium
     that creates the git-annex branch.
   * Added annex.adviceNoSshCaching config.
   * Added --size-limit option.
+  * Future proof activity log parsing.
 
  -- Joey Hess <id@joeyh.name> Mon, 03 May 2021 10:33:10 -0400
 
@@ -111,6 +111,6 @@ parseExpire ps = do
 parseActivity :: MonadFail m => String -> m Activity
 parseActivity s = case readish s of
 	Nothing -> Fail.fail $ "Unknown activity. Choose from: " ++
-		unwords (map show [minBound..maxBound :: Activity])
+		unwords (map show allActivities)
 	Just v -> return v
 
@@ -1,6 +1,6 @@
 {- git-annex activity log
  -
- - Copyright 2015-2019 Joey Hess <id@joeyh.name>
+ - Copyright 2015-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -8,6 +8,7 @@
 module Logs.Activity (
 	Log,
 	Activity(..),
+	allActivities,
 	recordActivity,
 	lastActivities,
 ) where
@@ -23,30 +24,38 @@ import Data.ByteString.Builder
 
 data Activity
 	= Fsck
-	deriving (Eq, Read, Show, Enum, Bounded)
+	-- Allow for unknown activities to be added later.
+	| UnknownActivity S.ByteString
+	deriving (Eq, Read, Show)
+
+allActivities :: [Activity]
+allActivities = [Fsck]
 
+-- Record an activity. This takes the place of previously recorded activity
+-- for the UUID.
 recordActivity :: Activity -> UUID -> Annex ()
 recordActivity act uuid = do
 	c <- currentVectorClock
 	Annex.Branch.change (Annex.Branch.RegardingUUID [uuid]) activityLog $
 		buildLogOld buildActivity
-			. changeLog c uuid (Right act)
+			. changeLog c uuid act
 			. parseLogOld parseActivity
 
+-- Most recent activity for each UUID.
 lastActivities :: Maybe Activity -> Annex (Log Activity)
 lastActivities wantact = parseLogOld (onlywanted =<< parseActivity)
 	<$> Annex.Branch.get activityLog
   where
-	onlywanted (Right a) | wanted a = pure a
-	onlywanted _ = fail "unwanted activity"
+	onlywanted a
+		| wanted a = pure a
+		| otherwise = fail "unwanted activity"
 	wanted a = maybe True (a ==) wantact
 
-buildActivity :: Either S.ByteString Activity -> Builder
-buildActivity (Right a) = byteString $ encodeBS $ show a
-buildActivity (Left b) = byteString b
+buildActivity :: Activity -> Builder
+buildActivity (UnknownActivity b) = byteString b
+buildActivity a = byteString $ encodeBS $ show a
 
--- Allow for unknown activities to be added later by preserving them.
-parseActivity :: A.Parser (Either S.ByteString Activity)
+parseActivity :: A.Parser Activity
 parseActivity = go <$> A.takeByteString
   where
-	go b = maybe (Left b) Right $ readish $ decodeBS b
+	go b = fromMaybe (UnknownActivity b) (readish $ decodeBS b)
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-06-14T17:14:44Z"
+ content="""
+You can query for repositories that have not been fscked
+for some amount of time:
+
+	git annex expire 10d --no-act --activity=Fsck
+
+From there, it's a simple script to set the unfscked ones to untrusted, or
+whatever.
+
+	| grep '^expire' | awk '{print $2}' | xargs git-annex untrust
+
+I suppose `git-annex expire` could have an option added, like `--untrust`
+to specify *how* to expire, rather than the default of marking the repo
+dead.
+
+I suppose you'd want a way to also go the other way, to stop untrusting a
+repo once it's been fscked.. There is not currently a way to do that.
+
+Note that a fsck that is interrupted does not count as a fsck activity,
+and it's not keeping track of what files were fscked. That would bloat the
+git-annex branch. On the other hand, if you `git annex fsck onefile`
+that counts as a fsck activity, even though other files in the repo didn't get
+fscked. So you would have to limit the ways you use fsck to ones that
+generate the activity you want, perhaps to `git annex fsck --all`.
+
+Perhaps fsck should also have a way to control whether it records an
+activity or not..
+"""]]
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-06-14T17:29:29Z"
+ content="""
+What if `git annex fsck --all` recorded an additional activity, eg FsckAll.
+Then there could be a command, or a config that untrusts repos that do not
+have a FsckAll activity that happened recently enough.
+
+A git config would be simplest, eg:
+
+	git config annex.untrustLastFscked 10d
+"""]]
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-06-14T17:56:23Z"
+ content="""
+Tried to implement this, but ran into a problem adding FsckAll:
+If it only logs FsckAll and not also Fsck, then old git-annex expire
+will see the FsckAll and not understand it, and treats it as no activity,
+so expires. (I did fix git-annex now so an unknown activity is not treated
+as no activity.)
+
+And, the way recordActivity is implemented, it
+removes previous activities, and adds the current activity. So a FsckAll
+followed by a Fsck would remove the FsckAll activity.
+
+That could be fixed, and both be logged, but old git-annex would probably
+not be able to parse the result. And if old git-annex is then used to do a
+fsck, it would log Fsck and remove the previously added FsckAll.
+
+So, it seems this will need to use some log other than activity.log
+to keep track of fsck --all.
+"""]]
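
The replace-on-record problem raised in the third comment can be illustrated with a minimal standalone model. This is a sketch under stated assumptions rather than git-annex code: the log is modelled as a plain `Data.Map` with no vector clocks, and `FsckAll` is the hypothetical activity floated in the comments, which the commit does not add.

```haskell
import qualified Data.Map.Strict as M

-- FsckAll is the hypothetical activity discussed in the comments; it is
-- not part of git-annex.
data Activity = Fsck | FsckAll
    deriving (Eq, Show)

-- Assumption: a UUID is modelled as a plain String here.
type UUID = String

-- One activity per UUID, loosely modelling activity.log without timestamps.
type Log = M.Map UUID Activity

-- Recording replaces whatever was previously stored for the UUID.
recordActivity :: Activity -> UUID -> Log -> Log
recordActivity act uuid = M.insert uuid act
```

In this model, `recordActivity Fsck "repo1" (recordActivity FsckAll "repo1" M.empty)` leaves only `Fsck` stored for `"repo1"`, which is why a later plain fsck would erase the record of an earlier `fsck --all`.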