{- git-annex command
 -
 - Copyright 2011-2016 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Command.Reinject where

import Command
import Logs.Location
import Annex.Content
import Backend
import Types.KeySource
import Utility.Metered
import qualified Git

cmd :: Command
cmd = command "reinject" SectionUtility
    "inject content of file back into annex"
    (paramRepeating (paramPair "SRC" "DEST"))
    (seek <$$> optParser)

data ReinjectOptions = ReinjectOptions
    { params :: CmdParams
    , knownOpt :: Bool
    }

optParser :: CmdParamsDesc -> Parser ReinjectOptions
optParser desc = ReinjectOptions
    <$> cmdParams desc
    <*> switch
        ( long "known"
        <> help "inject all known files"
        <> hidden
        )

{- With --known, each parameter is handled as a source file on its own;
 - otherwise the parameters are passed together as a SRC DEST pair. -}
seek :: ReinjectOptions -> CommandSeek
seek os
    | knownOpt os = withStrings (commandAction . startKnown) (params os)
    | otherwise = withWords (commandAction . startSrcDest) (params os)

{- Commit note (2019-06-06 19:42:30 +00:00):
 - make CommandStart return a StartMessage
 -
 - The goal is to be able to run CommandStart in the main thread when -J is
 - used, rather than unnecessarily passing it off to a worker thread, which
 - incurs overhead that is significant when the CommandStart is going to
 - quickly decide to stop.
 -
 - To do that, the message it displays needs to be displayed in the worker
 - thread, after the CommandStart has run.
 -
 - Also, the change will mean that CommandStart will no longer necessarily
 - run with the same Annex state as CommandPerform. While its docs already
 - said it should avoid modifying Annex state, I audited all the
 - CommandStart code as part of the conversion. (Note that CommandSeek
 - already sometimes runs with a different Annex state, and that has not been
 - a source of any problems, so I am not too worried that this change will
 - lead to breakage going forward.)
 -
 - The only modification of Annex state I found was it calling
 - allowMessages in some Commands that default to noMessages. Dealt with
 - that by adding a startCustomOutput and a startingUsualMessages.
 - This lets a command start with noMessages and then select the output it
 - wants for each CommandStart.
 -
 - One bit of breakage: onlyActionOn has been removed from commands that
 - used it. The plan is that, since a StartMessage contains an ActionItem,
 - when a Key can be extracted from that, the parallel job runner can
 - run onlyActionOn' automatically. Then commands won't need to worry about
 - this detail. Future work.
 -
 - Otherwise, this was a fairly straightforward process of making each
 - CommandStart compile again. Hopefully other behavior changes were mostly
 - avoided.
 -
 - In a few cases, a command had a CommandStart that called a CommandPerform
 - that then called showStart multiple times. I have collapsed those
 - down to a single start action. The main command to perhaps suffer from it
 - is Command.Direct, which used to show a start for each file, and no
 - longer does.
 -
 - Another minor behavior change is that some commands used showStart
 - before, but had an associated file and a Key available, so were changed
 - to ShowStart with an ActionItemAssociatedFile. That will not change the
 - normal output or behavior, but --json output will now include the key.
 - This should not break it for anyone using a real json parser.
 -}
startSrcDest :: [FilePath] -> CommandStart
startSrcDest (src:dest:[])
    | src == dest = stop
    | otherwise = notAnnexed src $ ifAnnexed (toRawFilePath dest) go stop
  where
    go key = starting "reinject" (ActionItemOther (Just src)) $
        ifM (verifyKeyContent RetrievalAllKeysSecure DefaultVerify UnVerified key src)
            ( perform src key
            , giveup $ src ++ " does not have expected content of " ++ dest
            )
startSrcDest _ = giveup "specify a src file and a dest file"

startKnown :: FilePath -> CommandStart
startKnown src = notAnnexed src $
    starting "reinject" (ActionItemOther (Just src)) $ do
        mkb <- genKey (KeySource src src Nothing) nullMeterUpdate Nothing
        case mkb of
            Nothing -> error "Failed to generate key"
            Just (key, _) -> ifM (isKnownKey key)
                ( perform src key
                , do
                    warning "Not known content; skipping"
                    next $ return True
                )

{- Refuses to use an annexed file as the source. The check is skipped in
 - a local bare repository. -}
notAnnexed :: FilePath -> CommandStart -> CommandStart
notAnnexed src a =
    ifM (fromRepo Git.repoIsLocalBare)
        ( a
        , ifAnnexed (toRawFilePath src)
            (giveup $ "cannot use annexed file as src: " ++ src)
            a
        )

{- Commit note (2015-10-01 19:54:37 +00:00):
 - Do verification of checksums of annex objects downloaded from remotes.
 -
 - * When annex objects are received into git repositories, their checksums
 -   are verified then too.
 - * To get the old, faster, behavior of not verifying checksums, set
 -   annex.verify=false, or remote.<name>.annex-verify=false.
 - * setkey, rekey: These commands also now verify that the provided file
 -   matches the key, unless annex.verify=false.
 - * reinject: Already verified content; this can now be disabled by
 -   setting annex.verify=false.
 -
 - recvkey and reinject already did verification, so removed now duplicate
 - code from them. fsck still does its own verification, which is ok since
 - it does not use getViaTmp, so verification doesn't happen twice when
 - using fsck --from.
 -
 - Commit note (2017-02-27 17:01:32 +00:00):
 - annex.securehashesonly
 -
 - Cryptographically secure hashes can be forced to be used in a repository,
 - by setting annex.securehashesonly. This does not prevent the git repository
 - from containing files with insecure hashes, but it does prevent the content
 - of such files from being pulled into .git/annex/objects from another
 - repository.
 -
 - We want to make sure that at no point does git-annex accept content into
 - .git/annex/objects that is hashed with an insecure key. Here's how it
 - was done:
 -
 - * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be
 -   written to it normally
 - * So every place that writes content must call thawContent or
 -   modifyContent. We can audit for these, and be sure we've considered
 -   all cases.
 - * The main functions are moveAnnex, and linkToAnnex; these were made to
 -   check annex.securehashesonly, and are the main security boundary
 -   for annex.securehashesonly.
 - * Most other calls to modifyContent deal with other files in the KEY
 -   directory (inode cache etc). The other ones that mess with the content
 -   are:
 -   - Annex.Direct.toDirectGen, in which content already in the
 -     annex directory is moved to the direct mode file, so not relevant.
 -   - fix and lock, which don't add new content
 -   - Command.ReKey.linkKey, which manually unlocks it to make a
 -     copy.
 - * All other calls to thawContent appear safe.
 -
 - Made moveAnnex return a Bool, so checked all callsites and made them
 - deal with a failure in appropriate ways.
 -
 - linkToAnnex simply returns LinkAnnexFailed; all callsites already deal
 - with it failing in appropriate ways.
 -
 - This commit was sponsored by Riku Voipio.
 -}
perform :: FilePath -> Key -> CommandPerform
perform src key = ifM move
    ( next $ cleanup key
    , error "failed"
    )
  where
    move = checkDiskSpaceToGet key False $
        moveAnnex key src

{- Records in the location log that the key's content is present. -}
cleanup :: Key -> CommandCleanup
cleanup key = do
    logStatus key InfoPresent
    return True