Commit graph

84 commits

Author SHA1 Message Date
Joey Hess
d266a41f8d
prevent numcopies or mincopies being configured to 0
Ignore annex.numcopies set to 0 in gitattributes or git config, or by
git-annex numcopies or by --numcopies, since that configuration would make
git-annex easily lose data. Same for mincopies.

This is a continuation of the work to make data only be able to be lost
when --force is used. It earlier led to the --trust option being disabled,
and similar reasoning applies here.

Most numcopies configs had docs that strongly discouraged setting it to 0
anyway. And I can't imagine a use case for setting to 0. Not that there
might not be one, but it's just so far from the intended use case of
git-annex, of managing and storing your data, that it does not seem like
it makes sense to cater to such a hypothetical use case, where any
git-annex drop can lose your data at any time.

Using a smart constructor makes sure every place avoids 0. Note that this
does mean that NumCopies is for the configured desired values, and not the
actual existing number of copies, which of course can be 0. The name
configuredNumCopies is used to make that clear.

Sponsored-by: Brock Spratlen on Patreon
2022-03-28 15:20:34 -04:00
Joey Hess
771a122c9e
add --size-limit option
When this option is not used, there should be effectively no added
overhead, thanks to the optimisation in
b3cd0cc6ba.

When an action fails on a file, the size of the file still counts toward
the size limit. This was necessary to support concurrency, but also
generally seems like the right choice.

Most commands that operate on annexed files support the option.
export and import do not, and I don't know if it would make sense for
export to.. Why would you want an incomplete export? sync doesn't, and
while it would be easy to make it support it for transferring files,
it's not clear if dropping files should also take the size limit into
account. Commands like add that don't operate on annexed files don't
support the option either.

Exiting 101 not yet implemented.

Sponsored-by: Denis Dzyubenko on Patreon
2021-06-04 16:16:53 -04:00
Joey Hess
b5f5475ed6
New matching options --excludesamecontent and --includesamecontent
The normalisation of filenames turns out to be the tricky part here,
because the associated files coming out of the keys db may look like
"./foo/bar" or "../bar". For the former to match a glob like "foo/*",
it needs to be normalised.

Note that, on windows, normalise "./foo/bar" = "foo\\bar"
which a glob like "foo/*" won't match. So the glob is matched a second
time, on the toInternalGitPath, so allowing the user to provide a glob
with the slashes in either direction. However, this still won't support
some wacky edge cases like the user providing a glob of "foo/bar\\*"

Sponsored-by: Dartmouth College's Datalad project
2021-05-25 13:08:18 -04:00
Joey Hess
0e830b6bb5
make remoteKeyToRemoteName safer
If it's passed a ConfigKey such as annex.version, avoid returning
an empty remote name and return Nothing instead. Also, foo.bar.baz is
not treated as a remote named "bar".
2021-04-23 13:29:21 -04:00
Joey Hess
d16d739ce2
implement fastDebug
Most of the changes here involve global option parsing: GlobalSetter
changed so it can both run an Annex action to set state, but can also
change the AnnexRead value, which is immutable once the Annex monad is
running.

That allowed a debugselector value to be added to AnnexRead, seeded
from the git config. The --debugfilter option's GlobalSetter then updates
the AnnexRead.

This improved GlobalSetter can later be used to move more stuff to
AnnexRead. Things that don't involve a git config will be easier to
move, and probably a *lot* of things can be moved eventually.

fastDebug, while implemented, is not used anywhere yet. But it should be
fast..
2021-04-06 15:24:28 -04:00
Joey Hess
c8b1fa67b4
Behavior change: --trust-glacier option no longer overrides trust
Since that can lead to data loss, which should never be enabled by an
option other than --force.

This commit was sponsored by Jake Vosloo on Patreon.
2021-01-07 10:37:43 -04:00
Joey Hess
2bf34fc17f
Behavior change: --trust option no longer overrides trust
Since that can lead to data loss, which should never be enabled by an
option other than --force.

I suppose that using --trust was in some situation, safer than --force,
because it doesn't entirely disable checking for data loss, but only
disables checking involving data that is on the specified repository.
But it seems better to be able to say that data loss only happens with
--force.

This commit was sponsored by Graham Spencer on Patreon.
2021-01-07 10:34:57 -04:00
Joey Hess
cc89699457
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.

Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.

This commit was sponsored by Denis Dzyubenko on Patreon.
2021-01-06 14:15:19 -04:00
Joey Hess
a3a19518d8
fix --time-limit
It got broken in several ways by the streaming seeking optimisations
around version 8.20201007.

Moved time limit checking out of the matcher, which was a hack in the
first place. So everywhere that uses Limit.getMatcher needs to check
time limit. Well, almost everywhere. Command.Info uses it, but it does
not make sense to time limit getting info. And Command.MultiCast uses it
just to build up a list of files that then get passed to a command, so
it would never have hit the timeout in a useful way.

This implementation is a little more expensive when at time limit than
necessary, since it continues seeking only to discard everything after the
time limit. I did try making it close the file handles to force a faster
shutdown, but that didn't work and hung. Could certianly be improved
somehow, but seeking is probably not the expensive bit when a time limit
is hit, so this seems acceptable for now.
2021-01-04 15:57:11 -04:00
Joey Hess
7036d0a4c1
add, import: Fix a reversion in 7.20191009 that broke handling of --largerthan and --smallerthan
This commit was sponsored by Jochen Bartl on Patreon.
2020-10-19 15:36:18 -04:00
Joey Hess
77c42782d0
differentiate between concurrency enabled at command line and by git config
The latter should not affect --batch mode.
2020-09-16 11:47:12 -04:00
Joey Hess
4c58433c48
avoid using MonadFail in ParseDuration
There's no instance for Either String, so that makes it not as useful as
it could be, so instead just return an Either String.
2020-08-15 15:53:35 -04:00
Joey Hess
f75be32166
external backends wip
It's able to start them up, the only thing not implemented is generating
and verifying keys. And, the key translation for HasExt.
2020-07-29 15:23:18 -04:00
Joey Hess
f912f8e5fd
refix bug in a better way
Always run Git.Config.store, so when the git config gets reloaded,
the override gets re-added to it, and changeGitRepo then calls extractGitConfig
on it and sees the annex.* settings from the override.

Remove any prior occurance of -c v and add it to the end. This way,
-c foo=1 -c foo=2 -c foo=1 will pass -c foo=1 to git, rather than -c foo=2

Note that, if git had some multiline config that got built up by
multiple -c's, this would not work still. But it never worked because
before the bug got fixed in the first place, the -c value was repeated
many times, so the multivalue thing would have been wrong. I don't think
-c can be used with multiline configs anyway, though git-config does
talk about them?
2020-07-02 13:32:33 -04:00
Joey Hess
ec0f8a6e74
Fix reversion that broke passing git configs with -c
Reverting commit c8fec6ab0
2020-07-02 12:42:13 -04:00
Joey Hess
cee6b344b4
cat-file resource pool
Avoid running a large number of git cat-file child processes when run with
a large -J value.

This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
2020-04-20 15:19:31 -04:00
Joey Hess
ca9c6c5f60
Fix a potential failure to parse git config
Git has an obnoxious special case in git config, a line "foo" is the same
as "foo = true". That means there is no way to examine the output of
git config and tell if it was run with --null or not, since a "foo"
in the first line could be such a boolean, or could be followed by its
value on the next line if --null were used.

So, rather than trying to do such a detection, track the style of config
at all the points where it's generated.
2020-04-13 13:05:41 -04:00
Joey Hess
c8fec6ab03
Fix a minor bug that caused options provided with -c to be passed multiple times to git. 2020-03-16 13:06:44 -04:00
Joey Hess
f6d629e483
changelog and minor style 2020-02-28 12:57:55 -04:00
Peter Simons
73cf523a4b
Fix build with ghc-8.8.x.
The 'fail' method has been moved to the 'MonadFail' class. I made the changes
so that the code still compiles with previous versions of 'base' that don't
have the new MonadFail class exported by Prelude yet.
2020-02-28 12:54:20 -04:00
Joey Hess
7d9dff5b05
Merge branch 'master' into bs
and update changelog
2019-12-18 15:13:30 -04:00
Joey Hess
7fd5376334
inprogress: Support --key 2019-12-18 14:14:16 -04:00
Joey Hess
d7833def66
use ByteString for git config
The parser and looking up config keys in the map should both be faster
due to using ByteString.

I had hoped this would speed up startup time, but any improvement to
that was too small to measure. Seems worth keeping though.

Note that the parser breaks up the ByteString, but a config map ends up
pointing to the config as read, which is retained in memory until every
value from it is no longer used. This can change memory usage
patterns marginally, but won't affect git-annex.
2019-11-27 17:40:09 -04:00
Joey Hess
61b384d2b7
add --sameas option, not yet used 2019-10-01 12:36:25 -04:00
Joey Hess
2b55a2b882
remotedaemon: Don't list --stop in help since it's not supported.
Also, move out of plumbing section. When using tor, the remotedaemon is
part of the user's workflow, as it runs the tor hidden service.
2019-09-30 14:40:46 -04:00
Joey Hess
b13a350556
added --unlocked and --locked 2019-09-19 12:33:13 -04:00
Joey Hess
fda1bdd679
Added --mimetype and --mimeencoding file matching options.
Already had these for largefiles matching, but I forgot to add them as
command-line options.
2019-09-19 12:09:59 -04:00
Joey Hess
9a5ddda511
remove many old version ifdefs
Drop support for building with ghc older than 8.4.4, and with older
versions of serveral haskell libraries than will be included in Debian 10.

The only remaining version ifdefs in the entire code base are now a couple
for aws!

This commit should only be merged after the Debian 10 release.
And perhaps it will need to wait longer than that; it would make
backporting new versions of  git-annex to Debian 9 (stretch) which
has been actively happening as recently as this year.

This commit was sponsored by Ilya Shlyakhter.
2019-07-05 15:09:37 -04:00
Joey Hess
aa7710982b
avoid list lookup by parseToken
Minor optimisation to parsing of a preferred content expression.
2019-05-14 13:11:29 -04:00
Joey Hess
fa070df373
fix usage 2019-05-10 14:52:52 -04:00
Joey Hess
82186ca58f
annex.jobs=cpus etc
Added the ability to run one job per CPU (core), by setting annex.jobs=cpus,
or using option --jobs=cpus or -Jcpus.

Built with future expansion in mind, including not defaulting matching on
Concurrency so more constructors can later be added, and using "cpu"
instead of "0".
2019-05-10 13:27:08 -04:00
Joey Hess
40ecf58d4b
update licenses from GPL to AGPL
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.

Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.

(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
2019-03-13 15:48:14 -04:00
Joey Hess
303e828b7c
rest of the deserializeKey renameing 2019-01-14 13:17:47 -04:00
Joey Hess
727767e1e2
make everything build again after ByteString Key changes 2019-01-11 16:39:46 -04:00
Joey Hess
904be4e6be
add --branch option to git-annex find and mildly deprecate findref in favor of it
No deprecation warning at run time, just one on the man page.

One thing findref remains able to do that find cannot is to run in a bare
repo. Find was made to refuse to run in a bare repo because it seemed
confusing for it to not list any files ever in that situation. It would be
better for find --branch to work in a bare repo but not without --branch
but I don't currently have a way to do that.

Probably a better solution would be to make git-annex in a bare repo
default to --branch master or something like that instead of --all.

This commit was sponsored by Denis Dzyubenko on Patreon.
2018-12-09 14:10:37 -04:00
Joey Hess
029ae8d4db
support findred and --branch with file matching options
* findref: Support file matching options: --include, --exclude,
  --want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin
* Commands supporting --branch now apply file matching options --include,
  --exclude, --want-get, --want-drop to filenames from the branch.
  Previously, combining --branch with those would fail to match anything.
* add, import, findref: Support --time-limit.

This commit was sponsored by Jake Vosloo on Patreon.
2018-12-09 13:38:35 -04:00
Joey Hess
6ba3dea566
annex.jobs
Added annex.jobs setting, which is like using the -J option.

Of course, -J overrides annex.jobs.

This commit was sponsored by Trenton Cronholm on Patreon.
2018-10-04 12:47:27 -04:00
Joey Hess
6e6c9cc6d3
Added --accessedwithin matching option.
Useful for dropping old objects from cache repositories.

But also, quite a genrally useful thing to have..

Rather than imitiating find's -atime and other options, all of which are
pretty horrible to use, I made this match files accessed within a time
period, using the same duration format used by git-annex schedule and
--limit-time

In passing, changed the --limit-time option parser to parse the
duration, instead of having it later throw an error.

This commit was supported by the NSF-funded DataLad project.
2018-08-01 15:34:03 -04:00
Joey Hess
f5a5886307
squash build warning with optparse-applicative-0.14.1
It exported some stuff that used to be only in .Internal, IIRC done at
my request..
2018-04-22 13:41:24 -04:00
Joey Hess
0106752db2
refactor FromToHereOptions 2018-04-09 14:29:28 -04:00
Joey Hess
6583448bab
add --json-error-messages (not yet implemented)
Added --json-error-messages option, which includes error messages in the
json output, rather than outputting them to stderr.

The actual rediretion of errors is not implemented yet, this is only
the docs and option plumbing.

This commit was supported by the NSF-funded DataLad project.
2018-02-19 14:32:15 -04:00
Joey Hess
fa65f1d240
fix --json-progress --json to be same as --json --json-progress
Fix behavior of --json-progress followed by --json, in which
the latter option disabled the former.

This commit was supported by the NSF-funded DataLad project.
2018-02-19 14:12:15 -04:00
Joey Hess
2b66492d6e
Improve startup time for commands that do not operate on remotes
And for tab completion, by not unnessessarily statting paths to remotes,
which used to cause eg, spin-up of removable drives.

Got rid of the remotes member of Git.Repo. This was a bit painful.

Remote.Git modifies the list of remotes as it reads their configs,
so still need a persistent list of remotes. So, put it in as
Annex.gitremotes. It's only populated by getGitRemotes, so commands
like examinekey that don't care about remotes won't do so.

This commit was sponsored by Jake Vosloo on Patreon.
2018-01-09 16:22:07 -04:00
Joey Hess
5cf7216774
zsh and fish completions
optparse-applicative-0.14.0.0 adds support for these, so have the
Makefile install their scripts when built with it.

CmdLine/GitAnnex/Options.hs now uses action "file" in cmdParams,
which affects the bash and zsh completions, letting them complete
filenames for subcommands that use that. This is not needed for
bash, since bash-completion.bash enables -o bashdefault, which
lets it complete filenames too. But it does not seem to break the bash
completions. It is needed for zsh; the zsh completion otherwise
does not complete filenames. The fish completion will always complete
filenames no matter what. Messy.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-06-09 11:38:20 -04:00
Joey Hess
5ee6912cf3
support parsing options like --to=here
Reworked remote name parsing to allow things like that. Command.Move
uses it for --to=here, although there's not yet an implementation of
that option.

This commit was sponsored by Ignacio on Patreon.
2017-05-31 16:49:28 -04:00
Joey Hess
49114cf4ea
securehash matching
Added --securehash option to match files using a secure hash function, and
corresponding securehash preferred content expression.

This commit was sponsored by Ethan Aubin.
2017-02-27 15:02:44 -04:00
Joey Hess
9c4650358c
add KeyVariety type
Where before the "name" of a key and a backend was a string, this makes
it a concrete data type.

This is groundwork for allowing some varieties of keys to be disabled
in file2key, so git-annex won't use them at all.

Benchmarks ran in my big repo:

old git-annex info:

real	0m3.338s
user	0m3.124s
sys	0m0.244s

new git-annex info:

real	0m3.216s
user	0m3.024s
sys	0m0.220s

new git-annex find:

real	0m7.138s
user	0m6.924s
sys	0m0.252s

old git-annex find:

real	0m7.433s
user	0m7.240s
sys	0m0.232s

Surprising result; I'd have expected it to be slower since it now parses
all the key varieties. But, the parser is very simple and perhaps
sharing KeyVarieties uses less memory or something like that.

This commit was supported by the NSF-funded DataLad project.
2017-02-24 15:16:56 -04:00
Joey Hess
280442ca2c
Remove -j short option for --json-progress; that option was already taken for --json.
This commit was sponsored by Trenton Cronholm.
2017-01-30 12:46:42 -04:00
Joey Hess
05d4438383
addurl, get: Added --json-progress option, which adds progress objects to the json output.
This doesn't work right when used with -J yet, and there is some really
ugly hand-crafting of part of the json output.
2016-09-09 15:06:54 -04:00
Joey Hess
8ef494a833
disentangle concurrency and message type
This makes -Jn work with --json and --quiet, where before
setting -Jn disabled those options.

Concurrent json output is currently a mess though since threads output
chunks over top of one-another.
2016-09-09 12:57:42 -04:00