Commit graph

149 commits

Author SHA1 Message Date
Joey Hess
53526136e8
move commandAction out of CmdLine.Seek
This is groundwork for nested seek loops, eg seeking over all files and
then performing commandActions on a list of remotes, which can be done
concurrently.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2018-10-01 14:12:06 -04:00
Joey Hess
1d1054faa6
added -z
Added -z option to git-annex commands that use --batch, useful for
supporting filenames containing newlines.

It only controls input to --batch, the output will still be line delimited
unless --json or etc is used to get some other output. While git often
makes -z affect both input and output, I don't like trying them together,
and making it affect output would have been a significant complication,
and also git-annex output is generally not intended to be machine parsed,
unless using --json or a format option.

Commands that take pairs like "file key" still separate them with a space
in --batch mode. All such commands take care to support filenames with
spaces when parsing that, so there was no need to change it, and it would
have needed significant changes to the batch machinery to separate tose
with a null.

To make fromkey and registerurl support -z, I had to give them a --batch
option. The implicit batch mode they enter when not provided with input
parameters does not support -z as that would have complicated option
parsing. Seemed better to move these toward using the same --batch as
everything else, though the implicit batch mode can still be used.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-09-20 16:11:47 -04:00
Joey Hess
12460fcea6
make --batch honor matching options
When --batch is used with matching options like --in, --metadata, etc, only
operate on the provided files when they match those options. Otherwise, a
blank line is output in the batch protocol.

Affected commands: find, add, whereis, drop, copy, move, get

In the case of find, the documentation for --batch already said it honored
the matching options. The docs for the rest didn't, but it makes sense to
have them honor them. While this is a behavior change, why specify the
matching options with --batch if you didn't want them to apply?

Note that the batch output for all of the affected commands could
already output a blank line in other cases, so batch users should
already be prepared to deal with it.

git-annex metadata didn't seem worth making support the matching options,
since all it does is output metadata or set metadata, the use cases for
using it in combination with the martching options seem small. Made it
refuse to run when they're combined, leaving open the possibility for later
support if a use case develops.

This commit was sponsored by Brett Eisenberg on Patreon.
2018-08-08 12:07:06 -04:00
Joey Hess
b657242f5d
enforce retrievalSecurityPolicy
Leveraged the existing verification code by making it also check the
retrievalSecurityPolicy.

Also, prevented getViaTmp from running the download action at all when the
retrievalSecurityPolicy is going to prevent verifying and so storing it.

Added annex.security.allow-unverified-downloads. A per-remote version
would be nice to have too, but would need more plumbing, so KISS.
(Bill the Cat reference not too over the top I hope. The point is to
make this something the user reads the documentation for before using.)

A few calls to verifyKeyContent and getViaTmp, that don't
involve downloads from remotes, have RetrievalAllKeysSecure hard-coded.
It was also hard-coded for P2P.Annex and Command.RecvKey,
to match the values of the corresponding remotes.

A few things use retrieveKeyFile/retrieveKeyFileCheap without going
through getViaTmp.
* Command.Fsck when downloading content from a remote to verify it.
  That content does not get into the annex, so this is ok.
* Command.AddUrl when using a remote to download an url; this is new
  content being added, so this is ok.

This commit was sponsored by Fernando Jimenez on Patreon.
2018-06-21 13:37:01 -04:00
Joey Hess
2fabd7cdb5
remove the older move --force, which never behaved as documented and seems useless
* move: --force was accidentially enabling two unrelated behaviors
  since 6.20180427. The older behavior, which has never been well
  documented and seems almost entirely useless, has been removed.
* copy: --force no longer does anything.

This commit was sponsored by Øyvind Andersen Holm.
2018-05-21 13:21:19 -04:00
Joey Hess
f56594af9e
finish fixing inverted Ord for TrustLevel
Flipped all comparisons. When a TrustLevel list was wanted from Trusted
downwards, used Down to compare it in that order.

This commit was sponsored by mo on Patreon.
2018-04-13 15:17:54 -04:00
Joey Hess
a0e4b9678b
fix inverted Ord for TrustLevel (intermediate commit)
This commit removes the Ord and Enum instances, commenting out all code
that depends on them, to make sure that all code effected by the
inversion fix has been identified.

(Assuming no ifdefs involve TrustLevel.)

The next commit will fix up all the identified code.
2018-04-13 14:50:14 -04:00
Joey Hess
64980db7d9
move: Avoid drops that make bad situations worse, but otherwise allow
See the big comment at the bottom of Command.Drop for the full details.

(The --safe/--unsafe options were never released.)

This commit was sponsored by Jake Vosloo on Patreon.
2018-04-13 14:36:43 -04:00
Joey Hess
af8546990d
move: --safe/--unsafe and potential drop race fix
move: Added --safe option, which makes move honor numcopies settings.
Also --unsafe enables the default behavior, anticipating that the
default may one day change.

This commit was sponsored by Ethan Aubin.
2018-04-09 16:20:10 -04:00
Joey Hess
ae530f043e
disentagle copy and move option parsing 2018-04-09 14:38:46 -04:00
Joey Hess
0106752db2
refactor FromToHereOptions 2018-04-09 14:29:28 -04:00
Joey Hess
46d4316954
implement annex.retry et al
Added annex.retry, annex.retry-delay, and per-remote versions to configure
transfer retries.

This commit was supported by the NSF-funded DataLad project.
2018-03-29 13:04:07 -04:00
Joey Hess
6583448bab
add --json-error-messages (not yet implemented)
Added --json-error-messages option, which includes error messages in the
json output, rather than outputting them to stderr.

The actual rediretion of errors is not implemented yet, this is only
the docs and option plumbing.

This commit was supported by the NSF-funded DataLad project.
2018-02-19 14:32:15 -04:00
Joey Hess
4781ca297b
showStart variant for when there's no worktree file
Clean up some uses of showStart with "" for the file,
or in some cases, a non-filename description string. That would
generate bad json, although none of the commands doing that
supported --json.

Using "" for the file resulted in output like "foo  rest";
now the extra space is eliminated.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-11-28 15:14:16 -04:00
Joey Hess
e1ac299ad0
better dup key with -J fix
This avoids all the complication about redundant work discussed in
the previous try at fixing this. At the expense of needing each command
that could have the problem to be patched to simply wrap the action in
onlyActionOn once the key is known. But there do not seem to be many
such commands.

onlyActionOn' should not be used with a CommandStart (or CommandPerform),
although the types do allow it. onlyActionOn handles running the whole
CommandStart chain. I couldn't immediately see a way to avoid mistken
use of onlyActionOn'.

This commit was supported by the NSF-funded DataLad project.
2017-10-17 18:48:53 -04:00
Joey Hess
68a49adcda
Improve behavior when -J transfers multiple files that point to the same key
After a false start, I found a fairly non-intrusive way to deal with it.
Although it only handles transfers -- there may be issues with eg
concurrent dropping of the same key, or other operations.

There is no added overhead when -J is not used, other than an added
inAnnex check. When -J is used, it has to maintain and check a small
Set, which should be negligible overhead.

It could output some message saying that the transfer is being done by
another thread. Or it could even display the same progress info for both
files that are being downloaded since they have the same content. But I
opted to keep it simple, since this is rather an edge case, so it just
doesn't say anything about the transfer of the file until the other
thread finishes.

Since the deferred transfer action still runs, actions that do more than
transfer content will still get a chance to do their other work. (An
example of something that needs to do such other work is P2P.Annex,
where the download always needs to receive the content from the peer.)
And, if the first thread fails to complete a transfer, the second thread
can resume it.

But, this unfortunately means that there's a risk of redundant work
being done to transfer a key that just got transferred.
That's not ideal, but should never cause breakage; the same
thing can occur when running two separate git-annex processes.

The get/move/copy/mirror --from commands had extra inAnnex checks added,
inside the download actions. Without those checks, the first thread
downloaded the content, and then the second thread woke up and
downloaded the same content redundantly.

move/copy/mirror --to is left doing redundant uploads for now. It
would need a second checkPresent of the remote inside the upload
to avoid them, which would be expensive. A better way to avoid
redundant work needs to be found..

This commit was supported by the NSF-funded DataLad project.
2017-10-17 17:10:50 -04:00
Joey Hess
85ed38a574
Avoid repeated checking that files passed on the command line exist.
git annex add, git annex lock etc make multiple seek passes,
and each seek pass checked that files existed. That was unncessary
redundant work.

Fixed by adding a new WorkTreeItem type, make seek actions use it,
and check that the files exist when constructing it.

This commit was supported by the NSF-funded DataLad project.
2017-10-16 14:10:20 -04:00
Joey Hess
f403c23bc6
copy, move: Behave same with --fast when sending to remotes located on a local disk as when sending to other remotes.
Let --fast override use of hasKey even when hasKeyCheap.
2017-09-29 16:30:43 -04:00
Joey Hess
2eb6309d3e
move, copy: Support --batch. 2017-08-15 12:39:10 -04:00
Joey Hess
02df5c5932
support --to=. as shorthand for --to=here 2017-06-01 13:12:42 -04:00
Joey Hess
bb18026b2c
move --to=here
* move --to=here moves from all reachable remotes to the local repository.

The output of move --from remote is changed slightly, when the remote and
local both have the content. It used to say:
move foo ok
Now:
move foo (from theremote...) ok

That was done so that, when move --to=here is used and the content is
locally present and also in several remotes, it's clear which remotes the
content gets dropped from.

Note that move --to=here will report an error if a non-reachable remote
contains the file, even if the local repository also contains the file. I
think that's reasonable; the user may be intending to move all other copies
of the file from remotes.

OTOH, if a copy of the file is believed to be present in some repository
that is not a configured remote, move --to=here does not report an error.
So a little bit inconsistent, but erroring in this case feels wrong.

copy --to=here came along for free, but it's basically the same behavior as
git-annex get, and probably with not as good messages in edge cases
(especially on failure), so I've not documented it.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-05-31 17:00:18 -04:00
Joey Hess
5ee6912cf3
support parsing options like --to=here
Reworked remote name parsing to allow things like that. Command.Move
uses it for --to=here, although there's not yet an implementation of
that option.

This commit was sponsored by Ignacio on Patreon.
2017-05-31 16:49:28 -04:00
Joey Hess
c8e1e3dada
AssociatedFile newtype
To prevent any further mistakes like 301aff34c4

This commit was sponsored by Francois Marier on Patreon.
2017-03-10 13:35:31 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
8dcf79694d
enable forwardRetry for command-line transfers
If a transfer fails for some reason, but some data managed to be sent, the
transfer will be retried. (The assistant already did this.)

Possible impacts:

* More ssh prompts if ssh needs to prompt for a password to connect to a
  host, or is prompting about some other problem like a ssh key mismatch.

* More data transfer due to retrying, epecially when a remote does not
  support resuming a transfer.

  In the worst case, a lot of data will be transferred but it fails before
  the end, and then all that data gets transferred again plus one byte more;
  repeat until it manages to get the whole file.
2016-10-26 15:38:27 -04:00
Joey Hess
3e22d60549
copy, move, mirror: Support --json and --json-progress. 2016-09-09 16:24:26 -04:00
Joey Hess
10ddf2c3bd
remove TransferObserver
unused after last commit
2016-08-03 13:46:20 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00
Joey Hess
d13194b230
--branch, stage 2
Show branch:file that is being operated on.

I had to make ActionItem a type and not a type class because
withKeyOptions' passed two different types of values when using the type
class, and I could not get the type checker to accept that.
2016-07-20 15:23:43 -04:00
Joey Hess
74e01a2d01
move --to: Better behavior when system is completely out of disk space; drop content from disk before writing location log.
I noticed move --to failing when there was no disk space. The file was sent
to the remote, but it crashed before it could be dropped locally. This
could fix that.
2016-06-05 13:51:22 -04:00
Joey Hess
737e45156e
remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Joey Hess
c0c595345c
arrange for regional output manager to run when -J is enabled
Commands that want to use it have to run their seek action inside
allowConcurrentOutput. Which seems reasonable; perhaps some future command
will want to support the -J flag but not use regions.

The region state moved from Annex to MessageState. This makes sense
organizationally, and note that some uses of onLocal use a different Annex
state, but pass the MessageState into it, which is what is needed.
2015-11-04 16:22:43 -04:00
Joey Hess
e392ec112f
also generate a drop safety proof for move --from remote 2015-10-09 16:16:03 -04:00
Joey Hess
6a72045707
fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal 2015-10-09 15:48:02 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
2fb3722ce9 Do verification of checksums of annex objects downloaded from remotes.
* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
2015-10-01 15:56:39 -04:00
Joey Hess
6a4f2087be finished converting all the main options 2015-07-10 13:23:06 -04:00
Joey Hess
a7f58634b8 wip 2015-07-09 16:05:45 -04:00
Joey Hess
8ad927dbc6 converted copy and move
Got a little tricky..
2015-07-09 15:23:14 -04:00
Joey Hess
6e5c1f8db3 convert all commands to work with optparse-applicative
Still no options though.
2015-07-08 15:08:02 -04:00
Joey Hess
a2ba701056 started converting to use optparse-applicative
This is a work in progress. It compiles and is able to do basic command
dispatch, including git autocorrection, while using optparse-applicative
for the core commandline parsing.

* Many commands are temporarily disabled before conversion.
* Options are not wired in yet.
* cmdnorepo actions don't work yet.

Also, removed the [Command] list, which was only used in one place.
2015-07-08 13:36:25 -04:00
Joey Hess
61ccf95004 Avoid accumulating transfer failure log files unless the assistant is being used.
Only the assistant uses these, and only the assistant cleans them up, so
make only git annex transferkeys write them,

There is one behavior change from this. If glacier is being used, and a
manual git annex get --from glacier fails because the file isn't available
yet, the assistant will no longer later see that failed transfer file and
retry the get. Hope no-one depended on that old behavior.
2015-05-12 15:53:38 -04:00
Joey Hess
8077ccbd54 get, move, copy, mirror: Concurrent downloads and uploads are now supported!
This works, and seems fairly robust. Clean get of 20 files at -J3. At -J10,
there are some messages about ssh multiplexing, probably due to a race
spinning up the ssh connection cacher. But, it manages to get all the files
ok regardless.

The progress bars are a scrambled mess though, due to bugs in
ascii-progress, which I've already filed. Particularly this one:
https://github.com/yamadapc/haskell-ascii-progress/issues/8
2015-04-10 17:08:07 -04:00
Joey Hess
cd6b62f35e --auto is no longer a global option; only get, drop, and copy accept it.
Not a behavior change unless you were passing it to a command that ignored it.
2015-03-25 17:06:14 -04:00
Joey Hess
8066a1c3cc The file matching options are now only accepted by commands that can actually use them. 2015-02-06 17:16:41 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
59f88558d5 doh't use "def" for command definitions, it conflicts with Data.Default.def 2014-10-14 14:20:10 -04:00
Joey Hess
b61c6bc2ff hlint 2014-10-09 15:46:05 -04:00
Joey Hess
aebcc395ff use types to enforce that removeAnnex can only be called inside lockContent
This fixed one bug where it needed to be and wasn't (in Assistant.Unused).
And also found one place where lockContent was used unnecessarily (by
drop --from remote).

A few other places like uninit probably don't really need to lockContent,
but it doesn't hurt to do call it anyway.

This commit was sponsored by David Wagner.
2014-08-20 20:13:47 -04:00
Joey Hess
e386e26ef2 avoid trying to create a content file in order to lock it
The nice refactoring in ec7dd0446a
highlighted a bug in lockContent -- when the content is not present,
this incorrectly created an empty lock file, using the same filename
as the content file.

This seems like it could result in empty objects, which fsck would detect
and complain about. Both drop and move --to call lockContent, as does
Remote.Git.dropKey -- I think we got lucky and this bug didn't show up
because both all of those only operate on files that are present. So
this bug could only manifest if there was a race, and a file's content
was dropped at just the wrong time, just as another process was about to
drop it. (And then only if the other process's dropping failed, otherwise
it'd delete the empty object file.)

Hmm, move --from also called lockContent. Unnecessarily, since the content
is not being removed from the local annex. In this case, the combination of
the 2 bugs could result in an empty lock file being written, and then if
the download of the content failed, left in the object directory as the
content.

This commit also optimises lockContent, avoiding an unncessary
doesFileExist test and instead just catching the exception that's thrown
when the file doesn't exist.

This commit was sponsored by Justine Lam.
2014-08-20 17:25:30 -04:00