Provide a less expensive version of git annex copy --to, enabled via --fast. This assumes that location tracking information is correct, rather than contacting the remote for every file.

Joey Hess 2011-03-27 18:34:30 -04:00
parent 9a4127f0fe
commit 4868b64868
4 changed files with 42 additions and 9 deletions

@@ -84,8 +84,14 @@ toStart dest move file = isAnnexed file $ \(key, _) -> do
             return $ Just $ toPerform dest move key
 toPerform :: Remote.Remote Annex -> Bool -> Key -> CommandPerform
 toPerform dest move key = do
-    -- checking the remote is expensive, so not done in the start step
-    isthere <- Remote.hasKey dest key
+    -- Checking the remote is expensive, so not done in the start step.
+    -- In fast mode, location tracking is assumed to be correct,
+    -- and an explicit check is not done, when copying. When moving,
+    -- it has to be done, to avoid inadvertent data loss.
+    fast <- Annex.getState Annex.fast
+    isthere <- if fast && not move
+        then return $ Right True
+        else Remote.hasKey dest key
     case isthere of
         Left err -> do
             showNote $ show err
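
The hunk above gates the expensive remote check on two flags. The following is a minimal, self-contained sketch of that gating, not git-annex's actual code: `Presence`, `skipsRemoteCheck`, `checkPresence`, and `countingCheck` are hypothetical names introduced here for illustration, and the fake check only counts how often the "remote" would be contacted.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical stand-in for the result of the remote check in the diff.
type Presence = Either String Bool

-- Mirrors the condition in the diff: the explicit check is skipped only
-- when fast mode is on and the operation is a copy, not a move.
skipsRemoteCheck :: Bool -> Bool -> Bool
skipsRemoteCheck fast move = fast && not move

-- Runs the expensive check unless it may be skipped. When skipped, this
-- returns the same constant the diff uses (Right True).
checkPresence :: Bool -> Bool -> IO Presence -> IO Presence
checkPresence fast move expensiveCheck
    | skipsRemoteCheck fast move = return (Right True)
    | otherwise = expensiveCheck

-- A fake remote check (assumption) that counts each contact.
countingCheck :: IORef Int -> IO Presence
countingCheck counter = do
    modifyIORef' counter (+ 1)
    return (Right False)

main :: IO ()
main = do
    counter <- newIORef 0
    -- Try every (fast, move) combination once.
    mapM_ (\(f, m) -> checkPresence f m (countingCheck counter))
        [(True, False), (True, True), (False, False), (False, True)]
    contacts <- readIORef counter
    putStrLn ("remote contacted " ++ show contacts ++ " times")
    -- prints "remote contacted 3 times"
```

Only the fast copy case avoids the contact. A fast move still performs the check, matching the data-loss concern stated in the comment.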

debian/changelog

@@ -3,6 +3,9 @@ git-annex (0.20110326) UNRELEASED; urgency=low
   * annex.diskreserve can be given in arbitrary units (ie "0.5 gigabytes")
   * Generalized remotes handling, laying groundwork for remotes that are
     not regular git remotes.
+  * Provide a less expensive version of `git annex copy --to`, enabled
+    via --fast. This assumes that location tracking information is correct,
+    rather than contacting the remote for every file.
  -- Joey Hess <joeyh@debian.org> Sat, 26 Mar 2011 14:36:16 -0400

@@ -6,3 +6,22 @@ Once all checks are done, one single transfer session should be started. Creatin
 -- RichiH
+> (Use of SHA is irrelevant here, copy does not checksum anything.)
+>
+> I think what you're seeing is
+> that `git annex copy --to remote` is slow, going to the remote repository
+> every time to see if it has the file, while `git annex copy --from remote`
+> is fast, since it looks at what files are locally present.
+>
+> That is something I mean to improve. At least `git annex copy --fast --to remote`
+> could easily do a fast copy of all files that are known to be missing from
+> the remote repository. When local and remote git repos are not 100% in sync,
+> relying on that data could miss some files that the remote doesn't have anymore,
+> but local doesn't know it dropped. That's why it's a candidate for `--fast`.
+>
+> I've just implemented that.
+>
+> While I do hope to improve ssh usage so that it sshs once, and feeds
+> `git-annex-shell` a series of commands to run, that is a much longer-term
+> thing. --[[Joey]]
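
The trade-off described in the reply above can be illustrated with a small model. This is a sketch under stated assumptions, not how git-annex stores or queries location tracking: keys are plain strings, the location log is a plain list, and `fastCandidates` and `missedByFast` are hypothetical helpers.

```haskell
import Data.List ((\\))

type Key = String

-- Keys a fast copy would send: those present locally that the local
-- location log does not record as being on the remote.
fastCandidates :: [Key] -> [Key] -> [Key]
fastCandidates localKeys loggedOnRemote = localKeys \\ loggedOnRemote

-- Keys a fast copy would wrongly skip: logged as on the remote, but the
-- remote has since dropped them without the local log learning of it.
missedByFast :: [Key] -> [Key] -> [Key] -> [Key]
missedByFast localKeys loggedOnRemote actuallyOnRemote =
    [ k | k <- localKeys, k `elem` loggedOnRemote, k `notElem` actuallyOnRemote ]

main :: IO ()
main = do
    let local  = ["a", "b", "c"]
        logged = ["a", "b"]  -- stale: the remote dropped "b"
        actual = ["a"]
    print (fastCandidates local logged)       -- prints ["c"]
    print (missedByFast local logged actual)  -- prints ["b"]
```

In this model, "b" is the kind of file the reply warns about: the log is out of sync, so the fast path never offers it to the remote.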

@@ -84,20 +84,22 @@ Many git-annex commands will stage changes for later `git commit` by you.
   it is safe to do so, typically because of the setting of annex.numcopies.
 * move [path ...]
-  When used with the --from option, moves the content of annexed files
-  from the specified repository to the current one.
   When used with the --to option, moves the content of annexed files from
   the current repository to the specified one.
+  When used with the --from option, moves the content of annexed files
+  from the specified repository to the current one.
 * copy [path ...]
-  When used with the --from option, copies the content of annexed files
-  from the specified repository to the current one.
   When used with the --to option, copies the content of annexed files from
   the current repository to the specified one.
+  When used with the --from option, copies the content of annexed files
+  from the specified repository to the current one.
+  To avoid contacting the remote to check if it has every file, specify --fast
 * unlock [path ...]
@@ -137,11 +139,15 @@ Many git-annex commands will stage changes for later `git commit` by you.
   With parameters, only the specified files are checked.
+  To avoid expensive checksum calculations, specify --fast
 * unused
   Checks the annex for data that is not used by any files currently
   in the annex, and prints a numbered list of the data.
+  To only show unused temp files, specify --fast
 * dropunused [number ...]
   Drops the data corresponding to the numbers, as listed by the last
@@ -286,8 +292,7 @@ Many git-annex commands will stage changes for later `git commit` by you.
 * --fast
   Enables less expensive, but also less thorough versions of some commands.
-  What is avoided depends on the command. A fast fsck avoids calculating
-  checksums; a fast unused only shows temp files and not other unused files.
+  What is avoided depends on the command.
 * --quiet