fix memory leak

filterM is not a good idea when streaming in a large list of files.

Fixing this memory leak that I introduced earlier today was a PITA, because
avoiding the filterM means doing the filtering only after
building up the data structures like BackendFile, and that means each
separate data structure needs its own function to apply the filter,
at least in this naive implementation.
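The shape of the leak can be illustrated with a standalone sketch (this is not git-annex's actual code; `Check`, `processAll`, and `processEach` are made-up names, with `IO` standing in for the Annex monad):

```haskell
import Control.Monad (filterM, forM_, when)

-- A monadic per-file check, standing in for the matcher in the real code.
type Check = FilePath -> IO Bool

-- Leaky shape: filterM must run the check over the entire input and
-- build the complete filtered list before anything downstream consumes
-- it, so a large streamed-in file list is held in memory all at once.
processAll :: Check -> (FilePath -> IO ()) -> [FilePath] -> IO ()
processAll check act fs = do
    wanted <- filterM check fs
    forM_ wanted act

-- Streaming shape: check each file just before acting on it, so the
-- input list can be consumed incrementally and garbage collected.
processEach :: Check -> (FilePath -> IO ()) -> [FilePath] -> IO ()
processEach check act fs = forM_ fs $ \f -> do
    ok <- check f
    when ok $ act f
```

The cost of the streaming shape is what the message above describes: the check has to be pushed into each consumer separately instead of being applied once, up front, to the list.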

There is also a minor performance regression: when using copy/drop/get/fsck
with a filter, git is now asked to look up attributes for all files,
since that lookup now happens before the filter is applied. This is only a
minor thing, since getting the attributes is very fast and --exclude was
probably not typically used to speed it up.
Joey Hess 2011-09-18 22:40:31 -04:00
parent 8d1e8c0760
commit 4f1fea1a85
2 changed files with 40 additions and 28 deletions


@@ -22,20 +22,19 @@ import Utility

 type Limit = Utility.Matcher.Token (FilePath -> Annex Bool)

 {- Filter out files not matching user-specified limits. -}
 filterFiles :: [FilePath] -> Annex [FilePath]
 filterFiles l = do
 	matcher <- getMatcher
 	filterM (Utility.Matcher.matchM matcher) l

 {- Checks if there are user-specified limits. -}
 limited :: Annex Bool
-limited = (not . Utility.Matcher.matchesAny) <$> getMatcher
+limited = (not . Utility.Matcher.matchesAny) <$> getMatcher'

 {- Gets a matcher for the user-specified limits. The matcher is cached for
  - speed; once it's obtained the user-specified limits can't change. -}
-getMatcher :: Annex (Utility.Matcher.Matcher (FilePath -> Annex Bool))
+getMatcher :: Annex (FilePath -> Annex Bool)
 getMatcher = do
+	m <- getMatcher'
+	return $ Utility.Matcher.matchM m
+
+getMatcher' :: Annex (Utility.Matcher.Matcher (FilePath -> Annex Bool))
+getMatcher' = do
 	m <- Annex.getState Annex.limit
 	case m of
 		Right r -> return r