info: Allow using matching options in more situations
File matching options like --include will be rejected in situations where there is no filename to match against. (Or where there is a filename but it's not relative to the cwd, or otherwise seemed too bothersome to match against.)

The addition of listKeys' was necessary to avoid using more memory in the common case of "git-annex info". Adding a filterM would have caused the list to buffer in memory and not stream. This is an ugly hack, but listKeys had previously run Annex operations inside unsafeInterleaveIO (for direct mode). And matching against a matcher should hopefully not change any Annex state.

This does allow for e.g. `git-annex info somefile --include=*.ext`, although why someone would want to do that I don't really know. But it seems to make sense to allow it.

But, consider: `git-annex info ./somefile --include=somefile`
This does not match, so will not display info about somefile. If the user really wants to, they can `--include=./somefile`.

Using matching options like --copies or --in=remote seems likely to be slower than git-annex find with those options, because unlike such commands, info does not have optimised streaming through the matcher.

Note that `git-annex info remote` is not the same as `git-annex info --in remote`. The former shows info about all files in the remote. The latter shows local keys that are also in that remote. The output should make that clear, but this still seems like a point where users could get confused.

Sponsored-by: Jochen Bartl on Patreon
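The streaming concern in the message above can be illustrated with a small standalone sketch. It is not git-annex code: `streamFiltered` and `chunk` are hypothetical names, and plain `IO` stands in for the Annex monad. Running `filterM` over a complete list in `IO` has to visit every element before returning anything, whereas filtering each chunk as it is produced, with the remainder deferred through `unsafeInterleaveIO`, keeps the result lazy.

```haskell
import Control.Monad (filterM)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- | Hypothetical helper: lazily concatenate chunks, filtering each chunk
-- as soon as it is read. Later chunks are only read (and filtered) when
-- that part of the resulting list is demanded.
streamFiltered :: (a -> IO Bool) -> [IO [a]] -> IO [a]
streamFiltered _ [] = return []
streamFiltered want (c:cs) = do
    kept <- filterM want =<< c
    rest <- unsafeInterleaveIO (streamFiltered want cs)
    return (kept ++ rest)

main :: IO ()
main = do
    visited <- newIORef (0 :: Int)
    -- Each chunk stands in for one directory's worth of keys; the counter
    -- records how many chunks were actually read.
    let chunk :: Int -> IO [Int]
        chunk n = do
            modifyIORef' visited (+ 1)
            return [n * 10 .. n * 10 + 9]
    xs <- streamFiltered (return . even) (map chunk [0 ..])
    print (take 3 xs)
    -- Number of chunks read so far, despite the unbounded input.
    readIORef visited >>= print
```

As with any use of `unsafeInterleaveIO`, the deferred actions run at an unpredictable time, which is why the filter should not rely on changing shared state.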
parent d36de3edf9
commit ce1b3a9699
7 changed files with 97 additions and 58 deletions
|
@@ -43,6 +43,7 @@ module Annex.Content (
|
||||||
moveBad,
|
moveBad,
|
||||||
KeyLocation(..),
|
KeyLocation(..),
|
||||||
listKeys,
|
listKeys,
|
||||||
|
listKeys',
|
||||||
saveState,
|
saveState,
|
||||||
downloadUrl,
|
downloadUrl,
|
||||||
preseedTmp,
|
preseedTmp,
|
||||||
|
@@ -653,22 +654,26 @@ data KeyLocation = InAnnex | InAnywhere
|
||||||
- .git/annex/objects, whether or not the content is present.
|
- .git/annex/objects, whether or not the content is present.
|
||||||
-}
|
-}
|
||||||
listKeys :: KeyLocation -> Annex [Key]
|
listKeys :: KeyLocation -> Annex [Key]
|
||||||
listKeys keyloc = do
|
listKeys keyloc = listKeys' keyloc (const (pure True))
|
||||||
|
|
||||||
|
{- Due to use of unsafeInterleaveIO, the passed filter action
|
||||||
|
- will be run in a copy of the Annex state, so any changes it
|
||||||
|
- makes to the state will not be preserved. -}
|
||||||
|
listKeys' :: KeyLocation -> (Key -> Annex Bool) -> Annex [Key]
|
||||||
|
listKeys' keyloc want = do
|
||||||
dir <- fromRepo gitAnnexObjectDir
|
dir <- fromRepo gitAnnexObjectDir
|
||||||
{- In order to run Annex monad actions within unsafeInterleaveIO,
|
|
||||||
- the current state is taken and reused. No changes made to this
|
|
||||||
- state will be preserved.
|
|
||||||
-}
|
|
||||||
s <- Annex.getState id
|
s <- Annex.getState id
|
||||||
|
r <- Annex.getRead id
|
||||||
depth <- gitAnnexLocationDepth <$> Annex.getGitConfig
|
depth <- gitAnnexLocationDepth <$> Annex.getGitConfig
|
||||||
liftIO $ walk s depth (fromRawFilePath dir)
|
liftIO $ walk (s, r) depth (fromRawFilePath dir)
|
||||||
where
|
where
|
||||||
walk s depth dir = do
|
walk s depth dir = do
|
||||||
contents <- catchDefaultIO [] (dirContents dir)
|
contents <- catchDefaultIO [] (dirContents dir)
|
||||||
if depth < 2
|
if depth < 2
|
||||||
then do
|
then do
|
||||||
contents' <- filterM (present s) contents
|
contents' <- filterM present contents
|
||||||
let keys = mapMaybe (fileKey . P.takeFileName . toRawFilePath) contents'
|
keys <- filterM (Annex.eval s . want) $
|
||||||
|
mapMaybe (fileKey . P.takeFileName . toRawFilePath) contents'
|
||||||
continue keys []
|
continue keys []
|
||||||
else do
|
else do
|
||||||
let deeper = walk s (depth - 1)
|
let deeper = walk s (depth - 1)
|
||||||
|
@@ -683,8 +688,8 @@ listKeys keyloc = do
|
||||||
InAnywhere -> True
|
InAnywhere -> True
|
||||||
_ -> False
|
_ -> False
|
||||||
|
|
||||||
present _ _ | inanywhere = pure True
|
present _ | inanywhere = pure True
|
||||||
present _ d = presentInAnnex d
|
present d = presentInAnnex d
|
||||||
|
|
||||||
presentInAnnex = doesFileExist . contentfile
|
presentInAnnex = doesFileExist . contentfile
|
||||||
contentfile d = d </> takeFileName d
|
contentfile d = d </> takeFileName d
|
||||||
|
|
|
@@ -11,6 +11,9 @@ git-annex (10.20220128) UNRELEASED; urgency=medium
|
||||||
to be more like other batch commands.
|
to be more like other batch commands.
|
||||||
* registerurl, unregisterurl: Added --json and --json-error-messages options.
|
* registerurl, unregisterurl: Added --json and --json-error-messages options.
|
||||||
* Avoid git status taking a long time after git-annex unlock of many files.
|
* Avoid git status taking a long time after git-annex unlock of many files.
|
||||||
|
* info: Allow using matching options in more situations. File matching
|
||||||
|
options like --include will be rejected in situations where there is
|
||||||
|
no filename to match against.
|
||||||
|
|
||||||
-- Joey Hess <id@joeyh.name> Mon, 31 Jan 2022 13:14:42 -0400
|
-- Joey Hess <id@joeyh.name> Mon, 31 Jan 2022 13:14:42 -0400
|
||||||
|
|
||||||
|
|
100
Command/Info.hs
|
@@ -1,6 +1,6 @@
|
||||||
{- git-annex command
|
{- git-annex command
|
||||||
-
|
-
|
||||||
- Copyright 2011-2021 Joey Hess <id@joeyh.name>
|
- Copyright 2011-2022 Joey Hess <id@joeyh.name>
|
||||||
-
|
-
|
||||||
- Licensed under the GNU AGPL version 3 or higher.
|
- Licensed under the GNU AGPL version 3 or higher.
|
||||||
-}
|
-}
|
||||||
|
@@ -132,7 +132,6 @@ start o ps = do
|
||||||
|
|
||||||
globalInfo :: InfoOptions -> Annex ()
|
globalInfo :: InfoOptions -> Annex ()
|
||||||
globalInfo o = do
|
globalInfo o = do
|
||||||
disallowMatchingOptions
|
|
||||||
u <- getUUID
|
u <- getUUID
|
||||||
whenM ((==) DeadTrusted <$> lookupTrust u) $
|
whenM ((==) DeadTrusted <$> lookupTrust u) $
|
||||||
earlyWarning "Warning: This repository is currently marked as dead."
|
earlyWarning "Warning: This repository is currently marked as dead."
|
||||||
|
@@ -145,7 +144,6 @@ itemInfo :: InfoOptions -> (SeekInput, String) -> Annex ()
|
||||||
itemInfo o (si, p) = ifM (isdir p)
|
itemInfo o (si, p) = ifM (isdir p)
|
||||||
( dirInfo o p si
|
( dirInfo o p si
|
||||||
, do
|
, do
|
||||||
disallowMatchingOptions
|
|
||||||
v <- Remote.byName' p
|
v <- Remote.byName' p
|
||||||
case v of
|
case v of
|
||||||
Right r -> remoteInfo o r si
|
Right r -> remoteInfo o r si
|
||||||
|
@@ -168,10 +166,6 @@ noInfo s si = do
|
||||||
showNote $ "not a directory or an annexed file or a treeish or a remote or a uuid"
|
showNote $ "not a directory or an annexed file or a treeish or a remote or a uuid"
|
||||||
showEndFail
|
showEndFail
|
||||||
|
|
||||||
disallowMatchingOptions :: Annex ()
|
|
||||||
disallowMatchingOptions = whenM Limit.limited $
|
|
||||||
giveup "File matching options can only be used when getting info on a directory."
|
|
||||||
|
|
||||||
dirInfo :: InfoOptions -> FilePath -> SeekInput -> Annex ()
|
dirInfo :: InfoOptions -> FilePath -> SeekInput -> Annex ()
|
||||||
dirInfo o dir si = showCustom (unwords ["info", dir]) si $ do
|
dirInfo o dir si = showCustom (unwords ["info", dir]) si $ do
|
||||||
stats <- selStats
|
stats <- selStats
|
||||||
|
@@ -197,9 +191,13 @@ treeishInfo o t si = do
|
||||||
tostats = map (\s -> s t)
|
tostats = map (\s -> s t)
|
||||||
|
|
||||||
fileInfo :: InfoOptions -> FilePath -> SeekInput -> Key -> Annex ()
|
fileInfo :: InfoOptions -> FilePath -> SeekInput -> Key -> Annex ()
|
||||||
fileInfo o file si k = showCustom (unwords ["info", file]) si $ do
|
fileInfo o file si k = do
|
||||||
evalStateT (mapM_ showStat (file_stats file k)) (emptyStatInfo o)
|
matcher <- Limit.getMatcher
|
||||||
return True
|
let file' = toRawFilePath file
|
||||||
|
whenM (matcher $ MatchingFile $ FileInfo file' file' (Just k)) $
|
||||||
|
showCustom (unwords ["info", file]) si $ do
|
||||||
|
evalStateT (mapM_ showStat (file_stats file k)) (emptyStatInfo o)
|
||||||
|
return True
|
||||||
|
|
||||||
remoteInfo :: InfoOptions -> Remote -> SeekInput -> Annex ()
|
remoteInfo :: InfoOptions -> Remote -> SeekInput -> Annex ()
|
||||||
remoteInfo o r si = showCustom (unwords ["info", Remote.name r]) si $ do
|
remoteInfo o r si = showCustom (unwords ["info", Remote.name r]) si $ do
|
||||||
|
@@ -404,7 +402,7 @@ bad_data_size :: Stat
|
||||||
bad_data_size = staleSize "bad keys size" gitAnnexBadDir
|
bad_data_size = staleSize "bad keys size" gitAnnexBadDir
|
||||||
|
|
||||||
key_size :: Key -> Stat
|
key_size :: Key -> Stat
|
||||||
key_size k = simpleStat "size" $ showSizeKeys $ foldKeys [k]
|
key_size k = simpleStat "size" $ showSizeKeys $ addKey k emptyKeyInfo
|
||||||
|
|
||||||
key_name :: Key -> Stat
|
key_name :: Key -> Stat
|
||||||
key_name k = simpleStat "key" $ pure $ serializeKey k
|
key_name k = simpleStat "key" $ pure $ serializeKey k
|
||||||
|
@@ -525,7 +523,9 @@ cachedPresentData = do
|
||||||
case presentData s of
|
case presentData s of
|
||||||
Just v -> return v
|
Just v -> return v
|
||||||
Nothing -> do
|
Nothing -> do
|
||||||
v <- foldKeys <$> lift (listKeys InAnnex)
|
matcher <- lift getKeyOnlyMatcher
|
||||||
|
v <- foldl' (flip addKey) emptyKeyInfo
|
||||||
|
<$> lift (listKeys' InAnnex (matchOnKey matcher))
|
||||||
put s { presentData = Just v }
|
put s { presentData = Just v }
|
||||||
return v
|
return v
|
||||||
|
|
||||||
|
@@ -535,9 +535,13 @@ cachedRemoteData u = do
|
||||||
case M.lookup u (repoData s) of
|
case M.lookup u (repoData s) of
|
||||||
Just v -> return (Right v)
|
Just v -> return (Right v)
|
||||||
Nothing -> do
|
Nothing -> do
|
||||||
|
matcher <- lift getKeyOnlyMatcher
|
||||||
let combinedata d uk = finishCheck uk >>= \case
|
let combinedata d uk = finishCheck uk >>= \case
|
||||||
Nothing -> return d
|
Nothing -> return d
|
||||||
Just k -> return $ addKey k d
|
Just k -> ifM (matchOnKey matcher k)
|
||||||
|
( return (addKey k d)
|
||||||
|
, return d
|
||||||
|
)
|
||||||
lift (loggedKeysFor' u) >>= \case
|
lift (loggedKeysFor' u) >>= \case
|
||||||
Just (ks, cleanup) -> do
|
Just (ks, cleanup) -> do
|
||||||
v <- lift $ foldM combinedata emptyKeyInfo ks
|
v <- lift $ foldM combinedata emptyKeyInfo ks
|
||||||
|
@@ -552,8 +556,13 @@ cachedReferencedData = do
|
||||||
case referencedData s of
|
case referencedData s of
|
||||||
Just v -> return v
|
Just v -> return v
|
||||||
Nothing -> do
|
Nothing -> do
|
||||||
|
matcher <- lift getKeyOnlyMatcher
|
||||||
|
let combinedata k _f d = ifM (matchOnKey matcher k)
|
||||||
|
( return (addKey k d)
|
||||||
|
, return d
|
||||||
|
)
|
||||||
!v <- lift $ Command.Unused.withKeysReferenced
|
!v <- lift $ Command.Unused.withKeysReferenced
|
||||||
emptyKeyInfo addKey
|
emptyKeyInfo combinedata
|
||||||
put s { referencedData = Just v }
|
put s { referencedData = Just v }
|
||||||
return v
|
return v
|
||||||
|
|
||||||
|
@@ -596,11 +605,16 @@ getDirStatInfo o dir = do
|
||||||
getTreeStatInfo :: InfoOptions -> Git.Ref -> Annex (Maybe StatInfo)
|
getTreeStatInfo :: InfoOptions -> Git.Ref -> Annex (Maybe StatInfo)
|
||||||
getTreeStatInfo o r = do
|
getTreeStatInfo o r = do
|
||||||
fast <- Annex.getState Annex.fast
|
fast <- Annex.getState Annex.fast
|
||||||
|
-- git lstree filenames start with a leading "./" that prevents
|
||||||
|
-- matching, and also things like --include are supposed to
|
||||||
|
-- match relative to the current directory, which does not make
|
||||||
|
-- sense when matching against files in some arbitrary tree.
|
||||||
|
matcher <- getKeyOnlyMatcher
|
||||||
(ls, cleanup) <- inRepo $ LsTree.lsTree
|
(ls, cleanup) <- inRepo $ LsTree.lsTree
|
||||||
LsTree.LsTreeRecursive
|
LsTree.LsTreeRecursive
|
||||||
(LsTree.LsTreeLong False)
|
(LsTree.LsTreeLong False)
|
||||||
r
|
r
|
||||||
(presentdata, referenceddata, repodata) <- go fast ls initial
|
(presentdata, referenceddata, repodata) <- go fast matcher ls initial
|
||||||
ifM (liftIO cleanup)
|
ifM (liftIO cleanup)
|
||||||
( return $ Just $
|
( return $ Just $
|
||||||
StatInfo (Just presentdata) (Just referenceddata) repodata Nothing o
|
StatInfo (Just presentdata) (Just referenceddata) repodata Nothing o
|
||||||
|
@@ -608,23 +622,25 @@ getTreeStatInfo o r = do
|
||||||
)
|
)
|
||||||
where
|
where
|
||||||
initial = (emptyKeyInfo, emptyKeyInfo, M.empty)
|
initial = (emptyKeyInfo, emptyKeyInfo, M.empty)
|
||||||
go _ [] vs = return vs
|
go _ _ [] vs = return vs
|
||||||
go fast (l:ls) vs@(presentdata, referenceddata, repodata) = do
|
go fast matcher (l:ls) vs@(presentdata, referenceddata, repodata) =
|
||||||
mk <- catKey (LsTree.sha l)
|
catKey (LsTree.sha l) >>= \case
|
||||||
case mk of
|
Nothing -> go fast matcher ls vs
|
||||||
Nothing -> go fast ls vs
|
Just key -> ifM (matchOnKey matcher key)
|
||||||
Just key -> do
|
( do
|
||||||
!presentdata' <- ifM (inAnnex key)
|
!presentdata' <- ifM (inAnnex key)
|
||||||
( return $ addKey key presentdata
|
( return $ addKey key presentdata
|
||||||
, return presentdata
|
, return presentdata
|
||||||
)
|
)
|
||||||
let !referenceddata' = addKey key referenceddata
|
let !referenceddata' = addKey key referenceddata
|
||||||
!repodata' <- if fast
|
!repodata' <- if fast
|
||||||
then return repodata
|
then return repodata
|
||||||
else do
|
else do
|
||||||
locs <- Remote.keyLocations key
|
locs <- Remote.keyLocations key
|
||||||
return (updateRepoData key locs repodata)
|
return (updateRepoData key locs repodata)
|
||||||
go fast ls $! (presentdata', referenceddata', repodata')
|
go fast matcher ls $! (presentdata', referenceddata', repodata')
|
||||||
|
, go fast matcher ls vs
|
||||||
|
)
|
||||||
|
|
||||||
emptyKeyInfo :: KeyInfo
|
emptyKeyInfo :: KeyInfo
|
||||||
emptyKeyInfo = KeyInfo 0 0 0 M.empty
|
emptyKeyInfo = KeyInfo 0 0 0 M.empty
|
||||||
|
@@ -632,9 +648,6 @@ emptyKeyInfo = KeyInfo 0 0 0 M.empty
|
||||||
emptyNumCopiesStats :: NumCopiesStats
|
emptyNumCopiesStats :: NumCopiesStats
|
||||||
emptyNumCopiesStats = NumCopiesStats M.empty
|
emptyNumCopiesStats = NumCopiesStats M.empty
|
||||||
|
|
||||||
foldKeys :: [Key] -> KeyInfo
|
|
||||||
foldKeys = foldl' (flip addKey) emptyKeyInfo
|
|
||||||
|
|
||||||
addKey :: Key -> KeyInfo -> KeyInfo
|
addKey :: Key -> KeyInfo -> KeyInfo
|
||||||
addKey key (KeyInfo count size unknownsize backends) =
|
addKey key (KeyInfo count size unknownsize backends) =
|
||||||
KeyInfo count' size' unknownsize' backends'
|
KeyInfo count' size' unknownsize' backends'
|
||||||
|
@@ -700,3 +713,20 @@ mkSizer = ifM (bytesOption . infoOptions <$> get)
|
||||||
( return (const $ const show)
|
( return (const $ const show)
|
||||||
, return roughSize
|
, return roughSize
|
||||||
)
|
)
|
||||||
|
|
||||||
|
getKeyOnlyMatcher :: Annex (MatchInfo -> Annex Bool)
|
||||||
|
getKeyOnlyMatcher = do
|
||||||
|
whenM (Limit.introspect matchNeedsFileName) $ do
|
||||||
|
warning "File matching options cannot be applied when getting this info."
|
||||||
|
giveup "Unable to continue."
|
||||||
|
Limit.getMatcher
|
||||||
|
|
||||||
|
matchOnKey :: (MatchInfo -> Annex Bool) -> Key -> Annex Bool
|
||||||
|
matchOnKey matcher k = matcher $ MatchingInfo $ ProvidedInfo
|
||||||
|
{ providedFilePath = Nothing
|
||||||
|
, providedKey = Just k
|
||||||
|
, providedFileSize = Nothing
|
||||||
|
, providedMimeType = Nothing
|
||||||
|
, providedMimeEncoding = Nothing
|
||||||
|
, providedLinkType = Nothing
|
||||||
|
}
|
||||||
|
|
|
@@ -183,12 +183,10 @@ excludeReferenced refspec ks = runbloomfilter withKeysReferencedM ks
|
||||||
runfilter a l = a l
|
runfilter a l = a l
|
||||||
runbloomfilter a = runfilter $ \l -> bloomFilter l <$> genBloomFilter a
|
runbloomfilter a = runfilter $ \l -> bloomFilter l <$> genBloomFilter a
|
||||||
|
|
||||||
{- Given an initial value, folds it with each key referenced by
|
{- Given an initial value, accumulates the value over each key
|
||||||
- files in the working tree. -}
|
- referenced by files in the working tree. -}
|
||||||
withKeysReferenced :: v -> (Key -> v -> v) -> Annex v
|
withKeysReferenced :: v -> (Key -> RawFilePath -> v -> Annex v) -> Annex v
|
||||||
withKeysReferenced initial a = withKeysReferenced' Nothing initial folda
|
withKeysReferenced initial = withKeysReferenced' Nothing initial
|
||||||
where
|
|
||||||
folda k _ v = return $ a k v
|
|
||||||
|
|
||||||
{- Runs an action on each referenced key in the working tree. -}
|
{- Runs an action on each referenced key in the working tree. -}
|
||||||
withKeysReferencedM :: (Key -> Annex ()) -> Annex ()
|
withKeysReferencedM :: (Key -> Annex ()) -> Annex ()
|
||||||
|
|
|
@@ -11,9 +11,9 @@ git annex info `[directory|file|treeish|remote|description|uuid ...]`
|
||||||
Displays statistics and other information for the specified item,
|
Displays statistics and other information for the specified item,
|
||||||
which can be a directory, or a file, or a treeish, or a remote,
|
which can be a directory, or a file, or a treeish, or a remote,
|
||||||
or the description or uuid of a repository.
|
or the description or uuid of a repository.
|
||||||
|
|
||||||
When no item is specified, displays statistics and information
|
When no item is specified, displays statistics and information
|
||||||
for the local repository and all known annexed files.
|
for the local repository and all annexed content.
|
||||||
|
|
||||||
# OPTIONS
|
# OPTIONS
|
||||||
|
|
||||||
|
@@ -45,11 +45,10 @@ for the local repository and all known annexed files.
|
||||||
Makes the `--batch` input be delimited by nulls instead of the usual
|
Makes the `--batch` input be delimited by nulls instead of the usual
|
||||||
newlines.
|
newlines.
|
||||||
|
|
||||||
* file matching options
|
* matching options
|
||||||
|
|
||||||
When a directory is specified, the [[git-annex-matching-options]](1)
|
The [[git-annex-matching-options]](1) can be used to select what
|
||||||
can be used to select the files in the directory that are included
|
to include in the statistics.
|
||||||
in the statistics.
|
|
||||||
|
|
||||||
* Also the [[git-annex-common-options]](1) can be used.
|
* Also the [[git-annex-common-options]](1) can be used.
|
||||||
|
|
||||||
|
|
|
@@ -9,3 +9,5 @@ git-annex: File matching options can only be used when getting info on a directo
|
||||||
There should be a way to use `info` to query aggregate information properties of all keys instead of directories.
|
There should be a way to use `info` to query aggregate information properties of all keys instead of directories.
|
||||||
|
|
||||||
I have used `git annex info .` in the repos I used up until now because every key was in the tree. Though I also have a feeling that operating on all keys could be significantly faster than filtering them to match some directory.
|
I have used `git annex info .` in the repos I used up until now because every key was in the tree. Though I also have a feeling that operating on all keys could be significantly faster than filtering them to match some directory.
|
||||||
|
|
||||||
|
> [[done]] --[[Joey]]
|
||||||
|
|
|
@@ -12,6 +12,8 @@ where used, but now `matchNeedsFileName` is available and it could only
|
||||||
reject those.
|
reject those.
|
||||||
|
|
||||||
So this can be implemented by making cachedPresentData
|
So this can be implemented by making cachedPresentData
|
||||||
and cachedRemoteData get the matcher, check if it's
|
and cachedRemoteData (etc) get the matcher, check if it's
|
||||||
the right kind and apply it to the keys.
|
the right kind and apply it to the keys.
|
||||||
|
|
||||||
|
done
|
||||||
"""]]
|
"""]]
|
||||||
|
|