avoid git check-ignore overhead on importing known files
isKnownImportLocation does a database lookup and there's an index to make
that lookup fast, so it's probably faster than talking to git check-ignore.
Checking the matcher is faster still.

Before the gitignore check was added, it did not need to always check
isknown. Now it does, because it's that or the more expensive notignored.
But at least we can skip notignored when a file is known, which will often
be the common case: when importing from a remote that's been exported to,
and/or imported from before, only new files will not be known, so only
those will need to check notignored.

At first, I had this:

    (matches <&&> (isknown <||> notignored)) <||> isknown

Notice that this checks isknown every time, whether it matches or not.
So, it's no slower to instead do this:

    isknown <||> (matches <&&> notignored)

That has the benefit that, when it's known, it doesn't need to run matches,
which, while faster than isknown, is still going to use some CPU.

And it perhaps more clearly expresses the condition: any known file is
wanted, otherwise it's down to what matches and is not ignored.

This commit was sponsored by Jack Hill on Patreon.
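The claim that the two forms cost the same or less can be checked with a small stand-alone sketch. It is not git-annex code: the short-circuiting operators are redefined locally, the three checks are replaced by stubs with fixed answers, and the names Form, Counts and evalForm are hypothetical helpers introduced only for this illustration.

```haskell
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Short-circuiting monadic boolean operators, defined locally so the
-- sketch is self-contained. The right-hand action only runs when the
-- left-hand result does not already decide the answer.
infixr 3 <&&>
infixr 2 <||>

(<&&>) :: Monad m => m Bool -> m Bool -> m Bool
ma <&&> mb = do
    a <- ma
    if a then mb else pure False

(<||>) :: Monad m => m Bool -> m Bool -> m Bool
ma <||> mb = do
    a <- ma
    if a then pure True else mb

-- How many times each check ran: (matches, isknown, notignored).
type Counts = (Int, Int, Int)

-- The two formulations of the condition from the commit message.
data Form = Old | New

-- Evaluate one formulation with fixed answers for the three checks,
-- returning the result and how often each stubbed check was run.
evalForm :: Form -> Bool -> Bool -> Bool -> IO (Bool, Counts)
evalForm form m k n = do
    ref <- newIORef (0, 0, 0)
    let matches = modifyIORef' ref (\(a, b, c) -> (a + 1, b, c)) >> pure m
        isknown = modifyIORef' ref (\(a, b, c) -> (a, b + 1, c)) >> pure k
        notignored = modifyIORef' ref (\(a, b, c) -> (a, b, c + 1)) >> pure n
    result <- case form of
        Old -> (matches <&&> (isknown <||> notignored)) <||> isknown
        New -> isknown <||> (matches <&&> notignored)
    counts <- readIORef ref
    pure (result, counts)
```

In GHCi, `evalForm Old True True False` and `evalForm New True True False` both yield True for a known file, but the second form never runs the matches stub. For a file that is not known, matches, and is ignored, the first form runs the isknown stub twice and the second only once.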
parent c56efbbdb6
commit 41271e4eb4
1 changed files with 15 additions and 14 deletions
@@ -643,25 +643,22 @@ getImportableContents r importtreeconfig ci matcher =
 		<*> mapM (filterunwanted dbhandle) (importableHistory ic)
 
 	wanted dbhandle (loc, (_cid, sz))
-		| ".git" `elem` Posix.splitDirectories (fromImportLocation loc) =
-			pure False
-		| otherwise = wantImport importtreeconfig ci matcher loc sz
-			<||> isKnownImportLocation dbhandle loc
+		| ingitdir = pure False
+		| otherwise =
+			isknown <||> (matches <&&> notignored)
+	  where
+		-- Checks, from least to most expensive.
+		ingitdir = ".git" `elem` Posix.splitDirectories (fromImportLocation loc)
+		matches = matchesImportLocation matcher loc sz
+		isknown = isKnownImportLocation dbhandle loc
+		notignored = notIgnoredImportLocation importtreeconfig ci loc
 
 isKnownImportLocation :: Export.ExportHandle -> ImportLocation -> Annex Bool
 isKnownImportLocation dbhandle loc = liftIO $
 	not . null <$> Export.getExportTreeKey dbhandle loc
 
-{- The matcher is matched relative to the top of the tree of files on the
- - remote, even when importing into a subdirectory.
- -
- - However, when checking gitignores, the subdirectory is included
- - so it will look at the gitignore file in it.
- -}
-wantImport :: ImportTreeConfig -> CheckGitIgnore -> FileMatcher Annex -> ImportLocation -> ByteSize -> Annex Bool
-wantImport importtreeconfig ci matcher loc sz =
-	checkMatcher' matcher mi mempty
-		<&&> (not <$> checkIgnored ci f)
+matchesImportLocation :: FileMatcher Annex -> ImportLocation -> Integer -> Annex Bool
+matchesImportLocation matcher loc sz = checkMatcher' matcher mi mempty
   where
 	mi = MatchingInfo $ ProvidedInfo
 		{ providedFilePath = fromImportLocation loc
@@ -670,6 +667,10 @@ wantImport importtreeconfig ci matcher loc sz =
 		, providedMimeType = Nothing
 		, providedMimeEncoding = Nothing
 		}
+
+notIgnoredImportLocation :: ImportTreeConfig -> CheckGitIgnore -> ImportLocation -> Annex Bool
+notIgnoredImportLocation importtreeconfig ci loc = not <$> checkIgnored ci f
+  where
 	f = fromRawFilePath $ case importtreeconfig of
 		ImportSubTree dir _ ->
 			getTopFilePath dir P.</> fromImportLocation loc