Provide a less expensive version of git annex copy --to, enabled via --fast. This assumes that location tracking information is correct, rather than contacting the remote for every file.

Joey Hess 2011-03-27 18:34:30 -04:00
parent 9a4127f0fe
commit 4868b64868
4 changed files with 42 additions and 9 deletions


@@ -84,8 +84,14 @@ toStart dest move file = isAnnexed file $ \(key, _) -> do
             return $ Just $ toPerform dest move key
 toPerform :: Remote.Remote Annex -> Bool -> Key -> CommandPerform
 toPerform dest move key = do
-    -- checking the remote is expensive, so not done in the start step
-    isthere <- Remote.hasKey dest key
+    -- Checking the remote is expensive, so not done in the start step.
+    -- In fast mode, location tracking is assumed to be correct,
+    -- and an explicit check is not done, when copying. When moving,
+    -- it has to be done, to avoid inadvertent data loss.
+    fast <- Annex.getState Annex.fast
+    isthere <- if fast && not move
+        then return $ Right True
+        else Remote.hasKey dest key
     case isthere of
         Left err -> do
             showNote $ show err
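
The new branch in toPerform can be modelled outside git-annex. What follows is a minimal sketch, not the commit's code: `Presence` is a hypothetical stand-in for the result of `Remote.hasKey`, and `countingQuery` is a hypothetical stub that replaces the real remote query so that calls can be counted. It only illustrates which combinations of the fast and move flags skip the remote; it says nothing about what git-annex does with the result afterwards.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the result of Remote.hasKey: an error
-- message, or whether the key is present on the remote.
type Presence = Either String Bool

-- True when the remote has to be contacted. Only a fast *copy* skips
-- the check; a move always checks, as the commit's comment requires.
needsRemoteCheck :: Bool -> Bool -> Bool
needsRemoteCheck fast move = not (fast && not move)

-- Same branch shape as the new toPerform code, with the expensive
-- query passed in as an action so that it can be counted.
presence :: Bool -> Bool -> IO Presence -> IO Presence
presence fast move queryRemote
    | needsRemoteCheck fast move = queryRemote
    | otherwise = return (Right True)

-- Hypothetical remote query that only records that it was called.
countingQuery :: IORef Int -> IO Presence
countingQuery counter = do
    modifyIORef' counter (+ 1)
    return (Right False)

-- Try all four flag combinations and count how many contact the remote.
remoteContacts :: IO Int
remoteContacts = do
    counter <- newIORef 0
    mapM_ (\(fast, move) -> presence fast move (countingQuery counter))
          [(False, False), (False, True), (True, False), (True, True)]
    readIORef counter
```

Of the four combinations, only a fast copy (fast set, move unset) avoids the query, so `remoteContacts` yields 3 in this model.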

debian/changelog

@@ -3,6 +3,9 @@ git-annex (0.20110326) UNRELEASED; urgency=low
   * annex.diskreserve can be given in arbitrary units (ie "0.5 gigabytes")
   * Generalized remotes handling, laying groundwork for remotes that are
     not regular git remotes.
+  * Provide a less expensive version of `git annex copy --to`, enabled
+    via --fast. This assumes that location tracking information is correct,
+    rather than contacting the remote for every file.
  -- Joey Hess <joeyh@debian.org>  Sat, 26 Mar 2011 14:36:16 -0400


@@ -6,3 +6,22 @@ Once all checks are done, one single transfer session should be started. Creatin
 -- RichiH
+> (Use of SHA is irrelevant here, copy does not checksum anything.)
+>
+> I think what you're seeing is
+> that `git annex copy --to remote` is slow, going to the remote repository
+> every time to see if it has the file, while `git annex copy --from remote`
+> is fast, since it looks at what files are locally present.
+>
+> That is something I mean to improve. At least `git annex copy --fast --to remote`
+> could easily do a fast copy of all files that are known to be missing from
+> the remote repository. When local and remote git repos are not 100% in sync,
+> relying on that data could miss some files that the remote doesn't have anymore,
+> but local doesn't know it dropped. That's why it's a candidate for `--fast`.
+>
+> I've just implemented that.
+>
+> While I do hope to improve ssh usage so that it sshs once, and feeds
+> `git-annex-shell` a series of commands to run, that is a much longer-term
+> thing. --[[Joey]]
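
The trade-off described in the reply above can be illustrated with a small model. This is a sketch under stated assumptions, not git-annex code: `Key` is a hypothetical string key, and the three lists (local keys, keys that location tracking records on the remote, keys actually on the remote) are hypothetical inputs chosen for the example.

```haskell
import Data.List ((\\))

-- Hypothetical key type; real git-annex keys are richer than a String.
type Key = String

-- Keys a fast copy would send: local keys that location tracking
-- does not record as present on the remote.
fastCopyCandidates :: [Key] -> [Key] -> [Key]
fastCopyCandidates localKeys recordedOnRemote = localKeys \\ recordedOnRemote

-- Keys a fast copy would skip although the remote lacks them:
-- recorded as present, but in fact no longer there (stale tracking).
missedByFastCopy :: [Key] -> [Key] -> [Key] -> [Key]
missedByFastCopy localKeys recordedOnRemote actuallyOnRemote =
    [ k | k <- localKeys
        , k `elem` recordedOnRemote
        , k `notElem` actuallyOnRemote ]
```

For example, with local keys `["a", "b", "c"]`, tracking that records `["b", "c"]` on the remote, and a remote that really holds only `["b"]`, the candidates are `["a"]` and the missed keys are `["c"]`. A check against the remote for every file would catch `"c"`, at the cost of one query per file.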


@@ -84,20 +84,22 @@ Many git-annex commands will stage changes for later `git commit` by you.
     it is safe to do so, typically because of the setting of annex.numcopies.
 * move [path ...]
+    When used with the --from option, moves the content of annexed files
+    from the specified repository to the current one.
     When used with the --to option, moves the content of annexed files from
     the current repository to the specified one.
-    When used with the --from option, moves the content of annexed files
-    from the specified repository to the current one.
 * copy [path ...]
+    When used with the --from option, copies the content of annexed files
+    from the specified repository to the current one.
     When used with the --to option, copies the content of annexed files from
     the current repository to the specified one.
-    When used with the --from option, copies the content of annexed files
-    from the specified repository to the current one.
+    To avoid contacting the remote to check if it has every file, specify --fast
 * unlock [path ...]
@@ -137,11 +139,15 @@ Many git-annex commands will stage changes for later `git commit` by you.
     With parameters, only the specified files are checked.
+    To avoid expensive checksum calculations, specify --fast
 * unused
     Checks the annex for data that is not used by any files currently
     in the annex, and prints a numbered list of the data.
+    To only show unused temp files, specify --fast
 * dropunused [number ...]
     Drops the data corresponding to the numbers, as listed by the last
@@ -286,8 +292,7 @@ Many git-annex commands will stage changes for later `git commit` by you.
 * --fast
     Enables less expensive, but also less thorough versions of some commands.
-    What is avoided depends on the command. A fast fsck avoids calculating
-    checksums; a fast unused only shows temp files and not other unused files.
+    What is avoided depends on the command.
 * --quiet