Commit graph

402 commits

Author SHA1 Message Date
Joey Hess
114a2d7504
Fix display when run with -J1
Commit b6642dde8a broke it by enabling
non-concurrent display mode while leaving concurrency set in the config
and having already started concurrency earlier.

(I don't actually know if that commit was a good idea.)

Sponsored-By: Brett Eisenberg on Patreon
2023-06-15 10:07:54 -04:00
Joey Hess
c4ad9b1446
Fix bug in -z handling of trailing NUL in input
The obvious way to fix this would be to adapt lines to split on null.

However, it's actually nontrivial to rewrite lines. In particular it has a
weird implementation to avoid a space leak. See:
https://gitlab.haskell.org/ghc/ghc/-/issues/4334

Also, while that is a small amount of code, it's covered by a rather
complex copyright and I'd have to include that copyright in git-annex.

So, I opted to filter out the trailing empty string instead.

Sponsored-by: Dartmouth College's Datalad project
2023-05-19 14:34:02 -04:00
Joey Hess
e955912ad0
git-annex assist
assist: New command, which is the same as git-annex sync but with
new files added and content transferred by default.

(Also this fixes another reversion in git-annex sync,
--commit --no-commit, and --message were not enabled, oops.)

See added comment for why git-annex assist does commit staged
changes elsewhere in the work tree, but only adds files under
the cwd.

Note that it does not support --no-commit, --no-push, --no-pull
like sync does. My thinking is, why should it? If you want that
level of control, use git commit, git annex push, git annex pull.
Sync only got those options because pull and push were not split
out.

Sponsored-by: k0ld on Patreon
2023-05-18 14:37:43 -04:00
Joey Hess
31d21578c3
improve description of --debugfilter 2023-05-17 12:38:22 -04:00
Joey Hess
5df89d58c7
git-annex pull and push
Split out two new commands, git-annex pull and git-annex push. Those plus a
git commit are equivilant to git-annex sync.

In a sense, git-annex sync conflates 3 things, and it would have been
better to have push and pull from the beginning and not sync. Although
note that git-annex sync --content is faster than a pull followed by a
push, because it only has to walk the tree once, look at preferred
content once, etc. So there is some value in git-annex sync in speed, as
well as user convenience.

And it would be hard to split out pull and push from sync, as far as the
implementaton goes. The implementation inside sync was easy, just adjust
SyncOptions so it does the right thing.

Note that the new commands default to syncing content, unless
annex.synccontent is explicitly set to false. I'd like sync to also do
that, but that's a hard transition to make. As a start to that
transition, I added a note to git-annex-sync.mdwn that it may start to
do so in a future version of git-annex. But a real transition would
necessarily involve displaying warnings when sync is used without
--content, and time.

Sponsored-by: Kevin Mueller on Patreon
2023-05-16 16:51:07 -04:00
Joey Hess
271f3b1ab4
uninit: Support --json and --json-error-messages
Had to convert uninit to do everything that can error out inside a
CommandStart. This was harder than feels nice.

(Also, in passing, converted CommandCheck to use a data type, not a
weird number that it was not clear how it managed to be unique.)

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-05-11 13:43:02 -04:00
Joey Hess
be36e208c2
json object for FileNotFound
When a nonexistant file is passed to a command and  --json-error-messages
is enabled, output a JSON object indicating the problem.

(But git ls-files --error-unmatch still displays errors about such files in
some situations.)

I don't like the duplication of the name of the command introduced by this,
but I can't see a great way around it. One way would be to pass the Command
instead.

When json is not enabled, the stderr is unchanged. This is necessary
because some commands like find have custom output. So dislaying
"find foo not found" would be wrong. So had to complicate things with
toplevelFileProblem having different output with and without json.

When not using --json-error-messages but still using --json, it displays
the error to stderr, but does display a json object without the error. It
does have an errorid though. Unsure how useful that behavior is.

Sponsored-by: Dartmouth College's Datalad project
2023-04-25 19:26:20 -04:00
Joey Hess
91ba0cc7fd
Revert "--json-exceptions"
This reverts commit a325524454.

Turns out this was predicated on an incorrect belief that json output
didn't already sometimes lack the "key" field. Since json output already
can when `giveup` was used, it seems unncessary to add a whole new
option for this.
2023-04-25 17:37:34 -04:00
Joey Hess
a325524454
--json-exceptions
Added a --json-exceptions option, which makes some exceptions be output in json.

The distinction is that --json-error-messages is for messages relating
to a particular ActionItem, while --json-exceptions is for messages that
are not, eg ones for a file that does not exist.

It's unfortunate that we need two switches with such a fine distinction
between them, but I'm worried about maintaining backwards compatability
in the json output, to avoid breaking anything that parses it, and this was
the way to make sure I didn't.

toplevelWarning is generally used for the latter kind of message. And
the other calls to toplevelWarning could be converted to showException. The
only possible gotcha is that if toplevelWarning is ever called after
starting acting on a file, it will add to the --json-error-messages of the
json displayed for that file and converting to showException would be a
behavior change. That seems unlikely, but I didn't convery everything to
avoid needing to satisfy myself it was not a concern.

Sponsored-by: Dartmouth College's Datalad project
2023-04-25 17:05:33 -04:00
Joey Hess
9155ed1072
configremote
New command, currently limited to changing autoenable= setting of a special remote.

It will probably never be used for more than that given the limitations on
it.

Sponsored-by: Brock Spratlen on Patreon
2023-04-18 15:30:49 -04:00
Joey Hess
3290a09a70
filter out control characters in warning messages
Converted warning and similar to use StringContainingQuotedPath. Most
warnings are static strings, some do refer to filepaths that need to be
quoted, and others don't need quoting.

Note that, since quote filters out control characters of even
UnquotedString, this makes all warnings safe, even when an attacker
sneaks in a control character in some other way.

When json is being output, no quoting is done, since json gets its own
quoting.

This does, as a side effect, make warning messages in json output not
be indented. The indentation is only needed to offset warning messages
underneath the display of the file they apply to, so that's ok.

Sponsored-by: Brett Eisenberg on Patreon
2023-04-10 15:55:44 -04:00
Joey Hess
cd544e548b
filter out control characters in error messages
giveup changed to filter out control characters. (It is too low level to
make it use StringContainingQuotedPath.)

error still does not, but it should only be used for internal errors,
where the message is not attacker-controlled.

Changed a lot of existing error to giveup when it is not strictly an
internal error.

Of course, other exceptions can still be thrown, either by code in
git-annex, or a library, that include some attacker-controlled value.
This does not guard against those.

Sponsored-by: Noam Kremen on Patreon
2023-04-10 13:50:51 -04:00
Joey Hess
2b940f7725
registerurl, unregisterurl: Added --remote option
This serves two purposes. --remote=web bypasses other special remotes that
claim the url, same as addurl --raw. And, specifying some other remote
allows making sure that an url is claimed by the remote you expect,
which makes then using setpresentkey not be fragile.

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-04-05 15:54:41 -04:00
Yaroslav Halchenko
84b0a3707a
Apply codespell -w throughout 2023-03-17 15:14:58 -04:00
Joey Hess
54ad1b4cfb
Windows: Support long filenames in more (possibly all) of the code
Works around this bug in unix-compat:
https://github.com/jacobstanley/unix-compat/issues/56
getFileStatus and other FilePath using functions in unix-compat do not do
UNC conversion on Windows.

Made Utility.RawFilePath use convertToWindowsNativeNamespace to do the
necessary conversion on windows to support long filenames.

Audited all imports of System.PosixCompat.Files to make sure that no
functions that operate on FilePath were imported from it. Instead, use
the equvilants from Utility.RawFilePath. In particular the
re-export of that module in Common had to be removed, which led to lots
of other changes throughout the code.

The changes to Build.Configure, Build.DesktopFile, and Build.TestConfig
make Utility.Directory not be needed to build setup. And so let it use
Utility.RawFilePath, which depends on unix, which cannot be in
setup-depends.

Sponsored-by: Dartmouth College's Datalad project
2023-03-01 15:55:58 -04:00
Joey Hess
c209e0f643
add FIELD?=GLOB to git-annex view usage
And also to vadd usage.

Also added some other things to the usage that were omitted before to
save space.

Adding even FIELD?=GLOB made the git-annex --help list of commands grow
too wide for an 80 column display. So, removed the description of
parameters from that list of commands.

Sponsored-By: Brock Spratlen on Patreon
2023-02-07 18:09:10 -04:00
Joey Hess
a6c1d9752b
move/copy: option parsing for --from with --to
Allowing --from and --to as an alternative to --from or --to
is hard to do with optparse-applicative!

The obvious approach of (pfrom <|> pto <|> pfromandto) does not work
when pfromandto uses the same option names as pfrom and pto do.
It compiles but the generated parser does not work for all desired
combinations.

Instead, have to parse optionally from and optionally to. When neither
is provided, the parser succeeds, but it's a result that can't be
handled. So, have to giveup after option parsing. There does not seem to
be a way to make an optparse-applicative Parser give up internally
either.

Also, need seek' because I first tried making fto be a where binding,
but that resulted in a hang when git-annex move was run without --from
or --to. I think because startConcurrency was not expecting the stages
value to contain an exception and so ended up blocking.

Sponsored-by: Dartmouth College's DANDI project
2023-01-18 14:42:39 -04:00
Joey Hess
f8bc208e89
findkeys: New command, very similar to git-annex find but operating on keys
I've long been asked for `git-annex find --all` or something like that,
but pushed back on it because I feel that the command is analagous to
find(1) and so it would be surprising for it to list keys rather than
files. So instead, add a new findkeys subcommand.

Note that the use of withKeyOptions is rather strange because usually
that is used to fall back to --all rather than listing files, but here
it's made to default to --all like behavior and never list files.

A performance thing that could be improved is that withKeyOptions
always reads and caches location logs. But findkeys with no options does
not need them, so it could be made faster. That caching does speed up
options like --in though. This is really just a subset of a more general
performance thing that --all reads location logs sometimes unncessarily.
Anyway, it needs to read the location log in order to checkDead,
and it seems good that findkeys does skip dead keys.

Also, cleaned up comments on git-annex-find man page asking for --all
option.

Sponsored-by: Dartmouth College's DANDI project
2023-01-17 14:51:57 -04:00
Joey Hess
a522a41a42
fix misleading helper function name
noworktreeitems was false for NoWorkTreeItems which was hard to understand.

Sponsored-by: Dartmouth College's DANDI project
2023-01-17 14:25:38 -04:00
Joey Hess
0b2dd374d8
--anything and --nothing
Added --anything (and --nothing). Eg, git-annex find --anything will list
all annexed files whether or not the content is present. This is slightly
faster and clearer than --include=* or --exclude=*

While I can't imagine how --nothing will be used, preferred content
expressions already had anything and nothing, so might as well support both
as matching options as well.

Sponsored-by: Dartmouth College's Datalad project
2022-12-20 15:44:09 -04:00
Joey Hess
731e806c96
use lookupKeyStaged in --batch code paths
Make --batch mode handle unstaged annexed files consistently whether the
file is unlocked or not. Before this, a unstaged locked file
would have the symlink on disk examined and operated on in --batch mode,
while an unstaged unlocked file would be skipped.

Note that, when not in batch mode, unstaged files are skipped over too.
That is actually somewhat new behavior; as late as 7.20191114 a
command like `git-annex whereis .` would operate on unstaged locked
files and skip over unstaged unlocked files. That changed during
optimisation of CmdLine.Seek with apparently little fanfare or notice.

Turns out that rmurl still behaved that way when given an unstaged file
on the command line. It was changed to use lookupKeyStaged to
handle its --batch mode. That also affected its non-batch mode, but
since that's just catching up to the change earlier made to most
other commands, I have not mentioed that in the changelog.

It may be that other uses of lookupKey should also change to
lookupKeyStaged. But it may also be that would slow down some things,
or lead to unwanted behavior changes, so I've kept the changes minimal
for now.

An example of a place where the use of lookupKey is better than
lookupKeyStaged is in Command.AddUrl, where it looks to see if the file
already exists, and adds the url to the file when so. It does not matter
there whether the file is staged or not (when it's locked). The use of
lookupKey in Command.Unused likewise seems good (and faster).

Sponsored-by: Nicholas Golder-Manning on Patreon
2022-10-26 14:43:06 -04:00
Joey Hess
b2ee2496ee
remove whenAnnexed and ifAnnexed
In preparation for adding a new variation on lookupKey.

Sponsored-by: Max Thoursie on Patreon
2022-10-26 14:06:32 -04:00
Joey Hess
ba7ecbc6a9
avoid flushing keys db queue after each Annex action
The flush was only done Annex.run' to make sure that the queue was flushed
before git-annex exits. But, doing it there means that as soon as one
change gets queued, it gets flushed soon after, which contributes to
excessive writes to the database, slowing git-annex down.
(This does not yet speed git-annex up, but it is a stepping stone to
doing so.)

Database queues do not autoflush when garbage collected, so have to
be flushed explicitly. I don't think it's possible to make them
autoflush (except perhaps if git-annex sqitched to using ResourceT..).
The comment in Database.Keys.closeDb used to be accurate, since the
automatic flushing did mean that all writes reached the database even
when closeDb was not called. But now, closeDb or flushDb needs to be
called before stopping using an Annex state. So, removed that comment.

In Remote.Git, change to using quiesce everywhere that it used to use
stopCoProcesses. This means that uses on onLocal in there are just as
slow as before. I considered only calling closeDb on the local git remotes
when git-annex exits. But, the reason that Remote.Git calls stopCoProcesses
in each onLocal is so as not to leave git processes running that have files
open on the remote repo, when it's on removable media. So, it seemed to make
sense to also closeDb after each one, since sqlite may also keep files
open. Although that has not seemed to cause problems with removable
media so far. It was also just easier to quiesce in each onLocal than
once at the end. This does likely leave performance on the floor, so
could be revisited.

In Annex.Content.saveState, there was no reason to close the db,
flushing it is enough.

The rest of the changes are from auditing for Annex.new, and making
sure that quiesce is called, after any action that might possibly need
it.

After that audit, I'm pretty sure that the change to Annex.run' is
safe. The only concern might be that this does let more changes get
queued for write to the db, and if git-annex is interrupted, those will be
lost. But interrupting git-annex can obviously already prevent it from
writing the most recent change to the db, so it must recover from such
lost data... right?

Sponsored-by: Dartmouth College's Datalad project
2022-10-12 14:12:23 -04:00
Joey Hess
85dbc21c1c
fix typo 2022-10-07 12:30:07 -04:00
Joey Hess
70d2ece381
improve usage
These commands operate on not only remotes, but any way a repository can
be specified, including "here" etc.

Sponsored-by: Graham Spencer on Patreon
2022-10-03 13:49:42 -04:00
Joey Hess
2478e9e03a
restage: New git-annex command, handles restaging unlocked files
This is much easier and less failure-prone than having the user run
git update-index --refresh themselves.

Sponsored-by: Dartmouth College's DANDI project
2022-09-23 16:29:59 -04:00
Joey Hess
66bd4f80b3
Improved handling of --time-limit when combined with -J
When concurrency is enabled, there can be worker threads still running
when the time limit is checked. Exiting right there does not
give those threads time to finish what they're doing. Instead, the seeking
is wrapped up, and git-annex then shuts down cleanly.

The whole point of --time-limit existing, rather than using timeout(1)
when running git-annex is to let git-annex finish the action(s) it is
working on when the time limit is reached, and shut down cleanly.

I noticed this problem when investigating why restagePointerFile might
not have run after get/drop of an unlocked file. With --time-limit -J,
a worker thread may have finished updating a work tree file, and be killed
by the time limit check before it can run restagePointerFile. So despite
--time-limit running the shutdown actions, the work tree file didn't get
restaged.

Sponsored-by: Dartmouth College's DANDI project
2022-09-22 12:54:52 -04:00
Joey Hess
eefc026370
fix reversion on skipping dead keys in --all/bare
Fix a reversion that made dead keys not be skipped when operating on all
keys via --all or in a bare repo. (Introduced in version 8.20200720)

Also, improved the documentation of git-annex-dead, it does not only apply
to fsck --all.

Also, made git-annex fsck, when run on a file whose key is dead, display
that. Before, it displayed that only when run with --all, but with this
fix, it skips dead keys with --all. But it can still be run on a file that
uses a dead key, and displaying "This key is dead" explains to the user
why it does not consider missing content for it to be a problem.

Sponsored-by: k0ld on Patreon
2022-09-13 14:38:13 -04:00
Joey Hess
8a4cfd4f2d
use getSymbolicLinkStatus not getFileStatus to avoid crash on broken symlink
Fix crash importing from a directory special remote that contains a broken
symlink.

The crash was in listImportableContentsM but some other places in
Remote.Directory also seemed like they could have the same problem.

Also audited for other places that have such a problem. Not all calls
to getFileStatus are bad, in some cases it's better to crash on something
unexpected. For example, `git-annex import path` when the path is a broken
symlink should crash, the same as when it does not exist. Many of the
getFileStatus calls are like that, particularly when they involve
.git/annex/objects which should never have a broken symlink in it.

Fixed a few other possible cases of the problem.

Sponsored-by: Lawrence Brogan on Patreon
2022-09-05 13:46:32 -04:00
Joey Hess
ed39979ac8
import: Avoid following symbolic links inside directories being imported
Too big a footgun.

This does not prevent attackers who can write to the directory being
imported from racing the check. But they can cause anything to be imported
anyway, so would be limited to getting the legacy import to follow into a
directory they do not write to, and move files out of it into the annex.
(The directory special remote does not have that problem since it does not
move files.)

Sponsored-by: Jack Hill on Patreon
2022-08-19 13:31:16 -04:00
Joey Hess
3a513cfe73
add --dry-run: New option
This is intended for users who want to see what it would output in order to
eg, check if a file would be added to git or the annex. It is not intended
as a way for scripts to get information.

Sponsored-by: Dartmouth College's Datalad project
2022-08-03 11:16:04 -04:00
Joey Hess
be19a68276
new matching options --want-get-by and --want-drop-by
Sponsored-by: Graham Spencer on Patreon
2022-07-28 13:26:03 -04:00
Joey Hess
b223988e22
remove --backend from global options
--backend is no longer a global option, and is only accepted by commands
that actually need it.

Three commands that used to support backend but don't any longer are
watch, webapp, and assistant. It would be possible to make them support it,
but I doubt anyone used the option with these. And in the case of webapp
and assistant, the option was handled inconsistently, only taking affect
when the command is run with an existing git-annex repo, not when it
creates a new one.

Also, renamed GlobalOption etc to AnnexOption. Because there are many
options of this type that are not actually global (any more) and get
added to commands that need them.

Sponsored-by: Kevin Mueller on Patreon
2022-06-29 13:33:25 -04:00
Joey Hess
8040ecf9b8
final readonly values moves to AnnexRead
At this point I've checked all AnnexState values and these were all that
remained that could move.

Pity that Annex.repo can't move, but it gets modified sometimes..

A couple of AnnexState values are set by options and could be AnnexRead,
but happen to use Annex when being set.

Sponsored-by: Max Thoursie on Patreon
2022-06-28 16:04:58 -04:00
Joey Hess
cb9cf30c48
move several readonly values to AnnexRead
This improves performance to a small extent in several places.

Sponsored-by: Tobias Ammann on Patreon
2022-06-28 15:40:19 -04:00
Joey Hess
d266a41f8d
prevent numcopies or mincopies being configured to 0
Ignore annex.numcopies set to 0 in gitattributes or git config, or by
git-annex numcopies or by --numcopies, since that configuration would make
git-annex easily lose data. Same for mincopies.

This is a continuation of the work to make data only be able to be lost
when --force is used. It earlier led to the --trust option being disabled,
and similar reasoning applies here.

Most numcopies configs had docs that strongly discouraged setting it to 0
anyway. And I can't imagine a use case for setting to 0. Not that there
might not be one, but it's just so far from the intended use case of
git-annex, of managing and storing your data, that it does not seem like
it makes sense to cater to such a hypothetical use case, where any
git-annex drop can lose your data at any time.

Using a smart constructor makes sure every place avoids 0. Note that this
does mean that NumCopies is for the configured desired values, and not the
actual existing number of copies, which of course can be 0. The name
configuredNumCopies is used to make that clear.

Sponsored-by: Brock Spratlen on Patreon
2022-03-28 15:20:34 -04:00
Joey Hess
ce91f10132
fix annex.skipunknown false error propagation
Propagate nonzero exit status from git ls-files when a specified file does
not exist, or a specified directory does not contain any files checked into
git.

The recent completion of the annex.skipunknown transition exposed this
bug, that has unfortunately been lurking all along.

It is also possible that git ls-files errors out for some other reason
-- perhaps a permission problem -- and this will also fix error propagation
in such situations.

Sponsored-by: Dartmouth College's Datalad project
2022-02-28 12:54:56 -04:00
Joey Hess
5b373a9dd2
read a consistent amount from pointer file
A few places were reading the max symlink size of a pointer file,
then passing tp parseLinkTargetOrPointer. Which is fine currently, but
to support pointer files with lines of data after the pointer, enough
has to be read that parseLinkTargetOrPointer can be assured of seeing
enough of that data to know if it's correctly formatted.

Sponsored-by: Dartmouth College's Datalad project
2022-02-23 12:52:34 -04:00
Joey Hess
835c50966a
reject batch options combined with non-batch options
Reject combinations of --batch (or --batch-keys) with options like --all or
--key or with filenames.

Most commands ignored the non-batch items when batch mode was enabled.

For some reason, addurl and dropkey both processed first the specified
non-batch items, followed by entering batch mode. Changed them to also
error out, for consistency.

Sponsored-by: Dartmouth College's Datalad project
2022-01-26 13:00:19 -04:00
Joey Hess
7f6b2ca49c
handle overBranchFileContents with read-only unmerged git-annex branches
This makes --all error out in that situation. Which is better than
ignoring information from the branches.

To really handle the branches right, overBranchFileContents would need
to both query all the branches and union merge file contents
(or perhaps not provide any file content), as well as diffing between
branches to find files that are only present in the unmerged branches.
And also, it would need to handle transitions..

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 14:30:51 -04:00
Joey Hess
d9d0fe5fa4
disable precaching git-annex branch when there are unmerged branches in a read-only repo
The way precaching works, it can't merge in information from those
branches efficiently, so just disable it and fall back to
Annex.Branch.get in order to get the correct information.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 14:08:50 -04:00
Joey Hess
68257e9076
add git-annex filter-process
filter-process: New command that can make git add/checkout faster when
there are a lot of unlocked annexed files or non-annexed files, but that
also makes git add of large annexed files slower.

Use it by running: git
config filter.annex.process 'git-annex filter-process'

Fully tested and working, but I have not benchmarked it at all.
And, incremental hashing is not done when git add uses it, so extra work is
done in that case.

Sponsored-by: Mark Reidenbach on Patreon
2021-11-04 15:02:36 -04:00
Joey Hess
afe5d9e137
fix duplicated commands in git-annex-shell usage display 2021-10-11 15:42:27 -04:00
Joey Hess
7bdc7350a5
remove git-annex-shell compat code
* Removed support for accessing git remotes that use versions of
  git-annex older than 6.20180312.
* git-annex-shell: Removed several commands that were only needed to
  support git-annex versions older than 6.20180312.
  (lockcontent, recvkey, sendkey, transferinfo, commit)

The P2P protocol was added in that version, and used ever since, so
this code was only needed for interop with older versions.

"git-annex-shell commit" is used by newer git-annex versions, though
unnecessarily so, because the p2pstdio command makes a single commit at
shutdown. Luckily, it was run with stderr and stdout sent to /dev/null,
and non-zero exit status or other exceptions are caught and ignored. So,
that was able to be removed from git-annex-shell too.

git-annex-shell inannex, recvkey, sendkey, and dropkey are still used by
gcrypt special remotes accessed over ssh, so those had to be kept.
It would probably be possible to convert that to using the P2P protocol,
but it would be another multi-year transition.

Some git-annex-shell fields were able to be removed. I hoped to remove
all of them, and the very concept of them, but unfortunately autoinit
is used by git-annex sync, and gcrypt uses remoteuuid.

The main win here is really in Remote.Git, removing piles of hairy fallback
code.

Sponsored-by: Luke Shumaker
2021-10-11 15:36:51 -04:00
Joey Hess
ab7b5a492c
--batch-keys
New --batch-keys option added to these commands:  get, drop, move, copy, whereis

git-annex-matching-options had to be reworded since some of its options
can be used to match on keys, not only files.

Sponsored-by: Luke Shumaker on Patreon
2021-08-25 14:21:12 -04:00
Joey Hess
fa62c98910
simplify and speed up Utility.FileSystemEncoding
This eliminates the distinction between decodeBS and decodeBS', encodeBS
and encodeBS', etc. The old implementation truncated at NUL, and the
primed versions had to do extra work to avoid that problem. The new
implementation does not truncate at NUL, and is also a lot faster.
(Benchmarked at 2x faster for decodeBS and 3x for encodeBS; more for the
primed versions.)

Note that filepath-bytestring 1.4.2.1.8 contains the same optimisation,
and upgrading to it will speed up to/fromRawFilePath.

AFAIK, nothing relied on the old behavior of truncating at NUL. Some
code used the faster versions in places where I was sure there would not
be a NUL. So this change is unlikely to break anything.

Also, moved s2w8 and w82s out of the module, as they do not involve
filesystem encoding really.

Sponsored-by: Shae Erisson on Patreon
2021-08-11 12:13:31 -04:00
Joey Hess
47d3dccf19
whereused implemented
except --historical

Sponsored-by: Jack Hill on Patreon
2021-07-14 14:27:21 -04:00
Joey Hess
8a13bbedd6
--size-limit exit 101
Sponsored-by: Mark Reidenbach on Patreon
2021-06-04 16:43:47 -04:00
Joey Hess
771a122c9e
add --size-limit option
When this option is not used, there should be effectively no added
overhead, thanks to the optimisation in
b3cd0cc6ba.

When an action fails on a file, the size of the file still counts toward
the size limit. This was necessary to support concurrency, but also
generally seems like the right choice.

Most commands that operate on annexed files support the option.
export and import do not, and I don't know if it would make sense for
export to.. Why would you want an incomplete export? sync doesn't, and
while it would be easy to make it support it for transferring files,
it's not clear if dropping files should also take the size limit into
account. Commands like add that don't operate on annexed files don't
support the option either.

Exiting 101 not yet implemented.

Sponsored-by: Denis Dzyubenko on Patreon
2021-06-04 16:16:53 -04:00
Joey Hess
b3cd0cc6ba
minor optimisation
Avoid a second mvar access.

Sponsored-by: Jochen Bartl on Patreon
2021-06-04 14:57:19 -04:00