safe recv-key in direct mode

Checks the key's size and checksum. This is sorta expensive, but it avoids
needing to add another round-trip to the protocol.
Joey Hess 2013-01-11 15:43:09 -04:00
parent 043c9562f3
commit 18a6935e42
7 changed files with 71 additions and 34 deletions

@@ -14,6 +14,10 @@ import Annex.Content
 import Utility.Rsync
 import Logs.Transfer
 import Command.SendKey (fieldTransfer)
+import qualified Fields
+import qualified Types.Key
+import qualified Types.Backend
+import qualified Backend
 
 def :: [Command]
 def = [noCommit $ command "recvkey" paramKey seek
@@ -26,7 +30,7 @@ start :: Key -> CommandStart
 start key = ifM (inAnnex key)
     ( error "key is already present in annex"
     , fieldTransfer Download key $ \_p -> do
-        ifM (getViaTmp key $ liftIO . rsyncServerReceive)
+        ifM (getViaTmp key go)
             ( do
                 -- forcibly quit after receiving one key,
                 -- and shutdown cleanly
@@ -35,3 +39,28 @@ start key = ifM (inAnnex key)
             , return False
             )
     )
+  where
+    go tmp = ifM (liftIO $ rsyncServerReceive tmp)
+        ( ifM (isJust <$> Fields.getField Fields.direct)
+            ( directcheck tmp
+            , return True
+            )
+        , return False
+        )
+    {- If the sending repository uses direct mode, the file
+     - it sends could be modified as it's sending it. So check
+     - that the right size file was received, and that the key/value
+     - Backend is happy with it. -}
+    directcheck tmp = do
+        oksize <- case Types.Key.keySize key of
+            Nothing -> return True
+            Just size -> do
+                size' <- fromIntegral . fileSize
+                    <$> liftIO (getFileStatus tmp)
+                return $ size == size'
+        if oksize
+            then case Backend.maybeLookupBackendName (Types.Key.keyBackendName key) of
+                Nothing -> return False
+                Just backend -> maybe (return True) (\a -> a key tmp)
+                    (Types.Backend.fsckKey backend)
+            else return False
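
The order of checks in `directcheck` (cheap size comparison first, then the backend's optional verification) can be tried outside git-annex with a small standalone sketch. This is only an illustration under stated assumptions, not the commit's code: `SimpleKey`, `SimpleBackend`, `Verifier` and `checkReceived` are hypothetical stand-ins for the git-annex types, and the file size is read with `hFileSize` from `System.IO` rather than `getFileStatus`.

```haskell
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))
import System.IO (IOMode (ReadMode), hFileSize, withFile)

-- Hypothetical stand-in for a key: only the parts the check needs.
data SimpleKey = SimpleKey
    { expectedSize :: Maybe Integer -- Nothing when the key records no size
    , backendName :: String
    }

-- Hypothetical stand-in for a backend's content check (checksum etc).
type Verifier = SimpleKey -> FilePath -> IO Bool

-- Hypothetical stand-in for a backend; not every backend can verify content.
newtype SimpleBackend = SimpleBackend { verifier :: Maybe Verifier }

-- Size check first, then the backend's verifier if it has one.
-- An unknown backend is treated as a failure, as in the hunk above.
checkReceived :: (String -> Maybe SimpleBackend) -> SimpleKey -> FilePath -> IO Bool
checkReceived lookupBackend key tmp = do
    oksize <- case expectedSize key of
        Nothing -> return True
        Just size -> do
            actual <- withFile tmp ReadMode hFileSize
            return (size == actual)
    if oksize
        then case lookupBackend (backendName key) of
            Nothing -> return False
            Just backend -> maybe (return True) (\v -> v key tmp) (verifier backend)
        else return False

main :: IO ()
main = do
    dir <- getTemporaryDirectory
    let tmp = dir </> "recvkey-sketch.tmp"
    writeFile tmp "hello" -- 5 bytes
    let lookupBackend name
            | name == "DEMO" = Just (SimpleBackend Nothing)
            | otherwise = Nothing
    good <- checkReceived lookupBackend (SimpleKey (Just 5) "DEMO") tmp
    short <- checkReceived lookupBackend (SimpleKey (Just 6) "DEMO") tmp
    unknown <- checkReceived lookupBackend (SimpleKey Nothing "OTHER") tmp
    removeFile tmp
    print (good, short, unknown) -- prints (True,False,False)
```

Passing the verifier in as a `Maybe` keeps the sketch free of hashing libraries; a real backend would supply a checksum comparison there.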

@@ -30,3 +30,6 @@ associatedFile :: Field
 associatedFile = Field "associatedfile" $ \f ->
     -- is the file a safe relative filename?
     not (isAbsolute f) && not ("../" `isPrefixOf` f)
+
+direct :: Field
+direct = Field "direct" $ \f -> f == "1"

@@ -122,6 +122,7 @@ checkField :: (String, String) -> Bool
 checkField (field, value)
     | field == fieldName remoteUUID = fieldCheck remoteUUID value
     | field == fieldName associatedFile = fieldCheck associatedFile value
+    | field == fieldName direct = fieldCheck direct value
     | otherwise = False
 
 failure :: IO ()
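
To see how a per-field check like this behaves on `name=value` parameters, here is a rough standalone sketch. The `Field` record, `knownFields` and `parseField` are assumptions made for the example (the real definitions are not shown in this diff), and the `remoteuuid` field is left out because its check is not shown either. It also assumes that fields failing their check are simply dropped, so an empty `direct=` would be treated as absent.

```haskell
import Data.List (isPrefixOf)
import System.FilePath (isAbsolute)

-- Assumed shape of a field: a name plus a validity check on its value.
data Field = Field
    { fieldName :: String
    , fieldCheck :: String -> Bool
    }

-- Same checks as in the hunks above.
associatedFile :: Field
associatedFile = Field "associatedfile" $ \f ->
    not (isAbsolute f) && not ("../" `isPrefixOf` f)

direct :: Field
direct = Field "direct" (== "1")

-- Hypothetical list of fields this sketch knows about.
knownFields :: [Field]
knownFields = [associatedFile, direct]

-- Hypothetical helper: split "name=value" at the first '='.
parseField :: String -> (String, String)
parseField s = case break (== '=') s of
    (name, '=' : value) -> (name, value)
    (name, _) -> (name, "")

-- A pair is accepted only if a known field has that name and accepts the value.
checkField :: (String, String) -> Bool
checkField (name, value) = any ok knownFields
  where
    ok f = fieldName f == name && fieldCheck f value

main :: IO ()
main = print $ filter checkField $ map parseField
    ["direct=1", "direct=", "associatedfile=../etc/passwd", "bogus=x"]
-- prints [("direct","1")]
```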

@@ -398,7 +398,9 @@ rsyncOrCopyFile rsyncparams src dest p =
 rsyncParamsRemote :: Remote -> Direction -> Key -> FilePath -> AssociatedFile -> Annex [CommandParam]
 rsyncParamsRemote r direction key file afile = do
     u <- getUUID
+    direct <- isDirect
     let fields = (Fields.remoteUUID, fromUUID u)
+            : (Fields.direct, if direct then "1" else "")
             : maybe [] (\f -> [(Fields.associatedFile, f)]) afile
     Just (shellcmd, shellparams) <- git_annex_shell (repo r)
         (if direction == Download then "sendkey" else "recvkey")

debian/changelog

@@ -1,8 +1,10 @@
 git-annex (3.20130108) UNRELEASED; urgency=low
 
+  * Now handles the case where a file that's being transferred to a remote
+    is modified in place, which direct mode allows to happen. When this
+    happens, the transfer now fails, rather than allow possibly corrupt
+    data into the remote.
   * fsck: Better checking of file content in direct mode.
-  * Special remotes now all rollback storage of keys that get modified
-    during the transfer, which can happen in direct mode.
   * drop: Suggest using git annex move when numcopies prevents dropping a file.
   * webapp: Repo switcher filters out repos that do not exist any more
     (or are on a drive that's not mounted).

@@ -84,6 +84,32 @@ is converted to a real file when it becomes present.
 
 ## TODO
 
+* kqueue does not deliver an event when an existing file is modified.
+  This doesn't affect OSX, which uses FSEvents now, but it makes direct
+  mode assistant not 100% on other BSD's.
+
+## done
+
+* `git annex sync` updates the key to files mappings for files changed,
+  but needs much other work to handle direct mode:
+  * Generate git commit, without running `git commit`, because it will
+    want to stage the full files. **done**
+  * Update location logs for any files deleted by a commit. **done**
+  * Generate a git merge, without running `git merge` (or possibly running
+    it in a scratch repo?), because it will stumble over the direct files.
+    **done**
+  * Drop contents of files deleted by a merge (including updating the
+    location log), or if we cannot drop,
+    move their contents to `.git/annex/objects/`. **no** .. instead,
+    avoid ever losing file contents in a direct mode merge. If the file is
+    deleted, its content is moved back to .git/annex/objects, if necessary.
+  * When a merge adds a symlink pointing at a key that is present in the
+    repo, replace the symlink with the direct file (either moving out
+    of `.git/annex/objects/` or hard-linking if the same key is present
+    elsewhere in the tree. **done**
+* handle merge conflicts on direct mode files **done**
+* support direct mode in the assistant (many little fixes)
+
 * Deal with files changing as they're being transferred from a direct mode
   repository to another git repository. The remote repo currently will
   accept the bad data and update the location log to say it has the key.
@@ -113,34 +139,7 @@ is converted to a real file when it becomes present.
   the temp file, which is probably corrupt. (Could in future use it as a
   basis for transferring the new key..) **done**
 
-  For git remotes, add a flag to `git-annex-shell recvkey` (using a field
-  after the "--" to remain back-compat). With this flag, after receiving
-  the data, the remote should wait for a signal that the data is good
-  before it updates the location log. The signal could just be a "1"
-  sent over the ssh channel. Or another `git-annex-shell` command. **TODO**
+  For git remotes, added a flag to `git-annex-shell recvkey` (using a field
+  after the "--" to remain back-compat). With this flag, after receiving
+  the data, the remote fscks the data. This is not optimal, but avoids
+  needing another round-trip, or a protocol change.
-
-* kqueue does not deliver an event when an existing file is modified.
-  This doesn't affect OSX, which uses FSEvents now, but it makes direct
-  mode assistant not 100% on other BSD's.
-
-## done
-
-* `git annex sync` updates the key to files mappings for files changed,
-  but needs much other work to handle direct mode:
-  * Generate git commit, without running `git commit`, because it will
-    want to stage the full files. **done**
-  * Update location logs for any files deleted by a commit. **done**
-  * Generate a git merge, without running `git merge` (or possibly running
-    it in a scratch repo?), because it will stumble over the direct files.
-    **done**
-  * Drop contents of files deleted by a merge (including updating the
-    location log), or if we cannot drop,
-    move their contents to `.git/annex/objects/`. **no** .. instead,
-    avoid ever losing file contents in a direct mode merge. If the file is
-    deleted, its content is moved back to .git/annex/objects, if necessary.
-  * When a merge adds a symlink pointing at a key that is present in the
-    repo, replace the symlink with the direct file (either moving out
-    of `.git/annex/objects/` or hard-linking if the same key is present
-    elsewhere in the tree. **done**
-* handle merge conflicts on direct mode files **done**
-* support direct mode in the assistant (many little fixes)

@@ -76,7 +76,8 @@ to git-annex-shell are:
   past versions of git-annex-shell (that ignore these, but would choke
   on new dashed options).
 
-  Currently used fields include remoteuuid= and associatedfile=
+  Currently used fields include remoteuuid=, associatedfile=,
+  and direct=
 
 # HOOK