move, copy: Sped up seeking for annexed files to operate on by a factor of nearly 2x.

Joey Hess 2020-07-24 12:56:02 -04:00
parent 00865cdae8
commit d732ef1a89
GPG key ID: DB12DB0FF05F8F38
5 changed files with 37 additions and 30 deletions

@@ -0,0 +1,9 @@
As part of the work in [[precache_logs_for_speed_with_cat-file_--buffer]],
key lookups are now done twice as fast as before.
However, limits that look up keys still do their own key lookup before the
key is looked up efficiently. Avoiding that extra lookup would speed up `--in`
etc, probably by another 1.5x-2x when such limits are used. What that optimisation
needs is a way to tell if the current limit needs the key or not. If it
does, then match on it after getting the key (and precaching the location
log for limits that need that), otherwise before getting the key.
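
A rough sketch of that ordering, written in Python purely for illustration.
git-annex itself is Haskell, and every name here (`Limit`, `needs_key`,
`seek`, `lookup_key`) is hypothetical rather than taken from its code. It
assumes each limit can declare up front whether it needs the key; limits that
do not are matched before the lookup, and the rest reuse the single lookup
result.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass
class Limit:
    """Hypothetical limit: a predicate plus a flag saying whether it needs the key."""
    needs_key: bool
    matches: Callable[[str, Optional[str]], bool]  # (file, key or None) -> bool


def seek(
    files: Iterable[str],
    limits: Iterable[Limit],
    lookup_key: Callable[[str], Optional[str]],
) -> Iterator[Tuple[str, str]]:
    """Yield (file, key) pairs passing all limits, with at most one lookup per file."""
    limits = list(limits)
    cheap = [l for l in limits if not l.needs_key]
    keyed = [l for l in limits if l.needs_key]
    for f in files:
        # Limits that do not need the key run first, so files they reject
        # never cost a key lookup at all.
        if not all(l.matches(f, None) for l in cheap):
            continue
        key = lookup_key(f)  # the single lookup for this file
        if key is None:
            continue  # no key found; treated here as "not an annexed file"
        # Limits that need the key reuse the value that was just looked up.
        if all(l.matches(f, key) for l in keyed):
            yield f, key
```

With a limit on file name (no key needed) and a limit on the key itself,
only files that pass the name check reach `lookup_key`, and each of those is
looked up once.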

@@ -33,10 +33,13 @@ and precache them.
> > > * `sync --content` 2x speedup!
> > > * `fsck --fast` 1.5x speedup
> > > * `whereis` 1.5x speedup
> > > * `copy --to --fast` 25% or so speedup
> > > * `copy --to` 2x speedup
> > > * `copy --from` 2x speedup
> > >
> > > Still todo:
> > >
> > > * move, copy, drop, and mirror were left not using the location log caching yet
> > > For copy benchmarks, note that both repos had all files.
> > >
> > > [[done]]
Another thing that the same cat-file --buffer approach could be used with
is to cat the annex links. Git.LsFiles.inRepoDetails provides the Sha
@@ -52,10 +55,4 @@ Some calls to lookupKey remain, and the above could
be used to remove them and make it faster. The ones in Annex.View and
Command.Unused seem most likely to be able to be converted.
Also, limits that look up keys still do a key lookup, before the key is
looked up efficiently. (Before these changes, the same key lookup was done
2x too..) Avoiding that would speed up --in etc, probably another 1.5x-2x
speedup when such limits are used. What that optimisation needs is a way to
tell if the current limit needs the key or not. If it does, then match on
it after getting the key (and precaching the location log for limits that
need that), otherwise before getting the key.
See also [[faster_key_lookup_for_limits]]