better key matching with a regexp

Handles keys that are substrings of other keys, as well as pointer files
that contain a newline after the key.
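
The substring problem described above can be sketched in a few lines. This is only an illustration: the key names and the helpers `containsKey` and `lineEndsWithKey` are hypothetical and do not appear in git-annex, and the second helper only approximates what an end-anchored regexp does.

```haskell
import Data.List (isInfixOf, isSuffixOf)

-- Hypothetical key names, for illustration only.
shortKey, longKey :: String
shortKey = "KEY-abc"
longKey = "KEY-abc.txt"

-- Plain substring search: also true when the content holds a longer
-- key that merely starts with the key being looked for.
containsKey :: String -> String -> Bool
containsKey key content = key `isInfixOf` content

-- Stricter check: some line of the content must end with the key.
-- Splitting on newlines also covers pointer files, where the key is
-- followed by a newline and perhaps more text.
lineEndsWithKey :: String -> String -> Bool
lineEndsWithKey key content = any (key `isSuffixOf`) (lines content)
```

With these definitions, `containsKey shortKey ("objects/" ++ longKey)` is true even though only the longer key is present, while `lineEndsWithKey shortKey ("objects/" ++ longKey)` is false and `lineEndsWithKey shortKey ("objects/" ++ shortKey ++ "\n")` is true.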

Note that -S does not treat its argument as a regexp, while -G does by
default. The docs are not clear on this; it was determined experimentally.
The only other difference in changing to -G is that if a file used to
contain the key and changed in some way, while still containing the key,
-G will match and -S would not. So, for example, annex links that
git annex fix rewrites will match, and files that change lock status
will match. That is an improvement anyway.

Sponsored-by: Jochen Bartl on Patreon
Joey Hess 2021-07-14 16:28:07 -04:00
parent 7a46bb1b28
commit 274d2380c7
2 changed files with 17 additions and 2 deletions

@@ -140,11 +140,24 @@ searchLog key ps a = Annex.inRepo $ \repo -> do
 		, Param "--no-abbrev"
 		-- Only find the most recent commit, for speed.
 		, Param "-n1"
-		-- Find commits that contain the key.
-		, Param ("-S" ++ fromRawFilePath (keyFile key))
+		-- Be sure to treat -G as a regexp.
+		, Param "--basic-regexp"
+		-- Find commits that contain the key. The object has to
+		-- end with the key to avoid confusion with longer keys,
+		-- so a regexp is used. Since annex pointer files
+		-- may contain a newline followed by perhaps something
+		-- else, that is also matched.
+		, Param ("-G" ++ escapeRegexp (fromRawFilePath (keyFile key)) ++ "($|\n)")
 		-- Skip commits where the file was deleted,
 		-- only find those where it was added or modified.
 		, Param "--diff-filter=ACMRTUX"
 		-- Output the raw diff.
 		, Param "--raw"
 		] ++ ps
+
+escapeRegexp :: String -> String
+escapeRegexp = concatMap esc
+  where
+	esc c
+		| isAscii c && isAlphaNum c = [c]
+		| otherwise = ['[', c, ']']
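
As a standalone sketch of the escaping used in this change: `escapeRegexp` is copied from the diff with the `Data.Char` import it needs, while `keyPattern` is a hypothetical helper, not part of git-annex, that only mirrors how the `-G` argument is assembled from an already-converted key filename.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- Copied from the change: ASCII letters and digits pass through,
-- every other character is wrapped in a one-character bracket
-- expression so that the regexp engine reads it literally.
escapeRegexp :: String -> String
escapeRegexp = concatMap esc
  where
    esc c
      | isAscii c && isAlphaNum c = [c]
      | otherwise = ['[', c, ']']

-- Hypothetical helper (an assumption for illustration): builds the
-- pattern passed after -G, requiring the key to be followed by the
-- end of the line or by a newline.
keyPattern :: String -> String
keyPattern keyname = escapeRegexp keyname ++ "($|\n)"
```

For example, `escapeRegexp "a.b"` evaluates to `"a[.]b"`, so the dot can no longer match an arbitrary character. Wrapping characters in brackets instead of prefixing a backslash avoids depending on which characters a given regexp dialect treats as special after a backslash.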

@@ -5,3 +5,5 @@ One way this is likely to happen is SHA256 keys with and without extension,
 for the same content. Or WORM keys with similar filenames.
 --[[Joey]]
+> [[fixed|done]] --[[Joey]]