Revert "Significantly sped up processing of large numbers of directories passed to a single git-annex command."
This reverts commit 705112903e.
Whoops, git ls-files does not always output in the input ordering.
That's why all this work is needed. Urk.
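
A minimal sketch of the ordering problem, for illustration only. The two function names below are hypothetical labels for the `break`-based and `partition`-based bodies of segmentPaths shown in the diff, `dirContains` is a simplified stand-in for the real helper, and the sample paths are invented:

```haskell
import Data.List (isPrefixOf, partition)

-- Simplified stand-in for git-annex's dirContains (an assumption made for
-- this sketch): a path is contained in a directory if it is that directory
-- or lies beneath it.
dirContains :: FilePath -> FilePath -> Bool
dirContains dir path = dir == path || (dir ++ "/") `isPrefixOf` path

-- Body from the reverted commit: takes only the leading run of matches,
-- so it is only correct when the expanded list follows the original order.
segmentWithBreak :: [FilePath] -> [FilePath] -> [[FilePath]]
segmentWithBreak [] new = [new]
segmentWithBreak [_] new = [new]
segmentWithBreak (l:ls) new = found : segmentWithBreak ls rest
  where
    (found, rest) = break (\p -> not (l `dirContains` p)) new

-- Body restored by this revert: scans the whole remaining list for each
-- original path, so the ordering of the expanded list does not matter.
segmentWithPartition :: [FilePath] -> [FilePath] -> [[FilePath]]
segmentWithPartition [] new = [new]
segmentWithPartition [_] new = [new]
segmentWithPartition (l:ls) new = found : segmentWithPartition ls rest
  where
    (found, rest) = partition (l `dirContains`) new

main :: IO ()
main = do
  let original = ["b", "a"]
      -- Hypothetical expanded list that is not in the input ordering.
      expanded = ["a/1", "b/2"]
  print (segmentWithBreak original expanded)     -- [[],["a/1","b/2"]]
  print (segmentWithPartition original expanded) -- [["b/2"],["a/1"]]
```

With the out-of-order input, the `break` variant stops at the first non-matching item and leaves the segment for "b" empty, while the `partition` variant still assigns each file to its directory, at the cost of rescanning the list for every original path.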
parent 8aa6b5f2a6
commit f79502d377
3 changed files with 3 additions and 11 deletions
@@ -170,19 +170,17 @@ prop_relPathDirToFile_regressionTest = same_dir_shortcurcuits_at_difference
     == joinPath ["..", "..", "..", "..", ".git", "annex", "objects", "18", "gk", "SHA256-foo", "SHA256-foo"]
 
 {- Given an original list of paths, and an expanded list derived from it,
- - partitions the expanded list, so that sublist corresponds to one of the
+ - generates a list of lists, where each sublist corresponds to one of the
  - original paths. When the original path is a directory, any items
  - in the expanded list that are contained in that directory will appear in
  - its segment.
- -
- - The expanded list must have the same ordering as the original list.
  -}
 segmentPaths :: [FilePath] -> [FilePath] -> [[FilePath]]
 segmentPaths [] new = [new]
 segmentPaths [_] new = [new] -- optimisation
 segmentPaths (l:ls) new = found : segmentPaths ls rest
   where
-    (found, rest) = break (\p -> not (l `dirContains` p)) new
+    (found, rest)=partition (l `dirContains`) new
 
 {- This assumes that it's cheaper to call segmentPaths on the result,
  - than it would be to run the action separately with each path. In
debian/changelog
@@ -22,8 +22,6 @@ git-annex (5.20150328) UNRELEASED; urgency=medium
     corresponding to duplicated files they process.
   * fsck: Added --distributed and --expire options,
     for distributed fsck.
-  * Significantly sped up processing of large numbers of directories
-    passed to a single git-annex command.
   * Fix truncation of parameters that could occur when using xargs git-annex.
 
  -- Joey Hess <id@joeyh.name> Fri, 27 Mar 2015 16:04:43 -0400
@@ -5,13 +5,9 @@ Feeding git-annex a long list off directories, eg with xargs can have
 ls-files command is longer than the git-annex command often, so it gets
 truncated and some files are not processed.
 
-> [[fixed|done]] --[[Joey]]
+> fixed --[[Joey]]
 
 * It can take a really long time for git-annex to chew through the
 git-ls-files results. There is probably an exponential blowup in the time
 relative to the number of parameters. Some of the stuff being done to
 preserve original ordering etc is likely at fault.
-
-> I think I've managed to speed this up something like
-> 1000x or some such. segmentPaths on an utterly insane list of 6 million
-> files now runs in about 10 seconds. --[[Joey]]