Commit graph

1012 commits

Author SHA1 Message Date
Joey Hess
922621301a
Serialize use of C magic library, which is not thread safe.
This fixes failures uploading to S3 when using -J.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-17 17:27:42 -04:00
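The serialization described in this commit can be illustrated with a minimal sketch, assuming a single process-wide lock is enough to protect the library. The function `unsafe_detect` is a hypothetical stand-in for a non-thread-safe library call; it is not the binding git-annex uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One lock shared by every caller, so the library is never entered twice at once.
_magic_lock = threading.Lock()

def unsafe_detect(path):
    """Hypothetical stand-in for a library call that is not thread safe."""
    return "text/plain" if path.endswith(".txt") else "application/octet-stream"

def detect(path):
    # Concurrent workers queue up here instead of racing inside the library.
    with _magic_lock:
        return unsafe_detect(path)

def detect_many(paths, jobs=4):
    # The surrounding work stays concurrent; only the library call is serialized.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(detect, paths))
```

The trade-off is that the protected call becomes a bottleneck, which is acceptable when it is short compared to the rest of each job.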
Joey Hess
83df401d93
Merge branch 'batchasync' into master 2020-09-16 13:02:58 -04:00
Joey Hess
877ef84a1b
support --batch -J
--batch combined with -J now runs batch requests concurrently for many
commands. Before, the combination was accepted, but did not enable
concurrency. Since the output of batch requests can be in any order, --json
with the new "input" field is recommended to be used, to determine which
batch request each response corresponds to.

If --json is not used, batch mode still runs concurrently, using the usual
concurrent-output. That will not be very useful for most batch mode users,
probably, but who knows.

If a program was using --batch -J before, and was parsing non-json output,
this could break it. But, it was relying on git-annex not supporting
concurrency despite it being enabled, so it should have expected concurrent
output. So, I think that's ok.

annex.jobs does not enable concurrency in --batch mode, because that would
confuse programs that use --batch but don't expect concurrency.
2020-09-16 12:10:37 -04:00
Joey Hess
fcf5d11c63
add "input" field to json output
The use case of this field is mostly to support -J combined with --json.
When that is implemented, a user will be able to look at the field to
determine which of the requests they have sent it corresponds to.

The field typically has a single value in its list, but in some cases
multiple values (eg 2 command-line params) are combined together and the
list will have more.

Note that json parsing was already non-strict, so old git-annex metadata
--json --batch can be fed json produced by the new git-annex and will
not stumble over the new field.
2020-09-15 16:22:44 -04:00
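A consumer of concurrent batch output might use the "input" field roughly as follows. This is a sketch under assumptions: each response is taken to be one JSON object per line, and every field other than "input" in the test data is hypothetical. Unknown fields are ignored, mirroring the non-strict parsing mentioned above.

```python
import json

def match_responses(requests, response_lines):
    """Map each request string to the decoded response that names it."""
    pending = set(requests)
    matched = {}
    for line in response_lines:
        response = json.loads(line)
        # "input" is a list: usually one value, sometimes several combined.
        for value in response.get("input", []):
            if value in pending:
                matched[value] = response
                pending.discard(value)
    return matched
```

Because responses are keyed by the request they name, the order in which they arrive does not matter.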
Joey Hess
3a05d53761
add SeekInput (not yet used)
No behavior changes (hopefully), just adding SeekInput and plumbing it
through to the JSON display code for later use.

Over the course of 2 grueling days.

withFilesNotInGit reimplemented in terms of seekHelper
should be the only possible behavior change. It seems to test as
behaving the same.

Note that seekHelper dummies up the SeekInput in the case where
segmentPaths' gives up on sorting the expanded paths because there are
too many input paths. When SeekInput later gets exposed as a json field,
that will result in it being a little bit wrong in the case where
100 or more paths are passed to a git-annex command. I think this is a
subtle enough problem to not matter. If it does turn out to be a
problem, fixing it would require splitting up the input
parameters into groups of < 100, which would make git ls-files run
perhaps more than is necessary. May want to revisit this, because that
fix seems fairly low-impact.
2020-09-15 15:41:13 -04:00
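The possible fix mentioned in this commit, splitting the input parameters into groups of fewer than 100, could look like the sketch below. The helper name `split_params` and its `limit` parameter are hypothetical and do not come from the git-annex source.

```python
def split_params(params, limit=100):
    """Split params into consecutive groups, each with fewer than `limit` items."""
    size = limit - 1
    return [params[i:i + size] for i in range(0, len(params), size)]
```

Each group would then need its own git ls-files run, which is the extra cost the commit message weighs against the accuracy gain.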
Joey Hess
5844a54869
aws-0.22 improved its support for setting etags, which improves support for versioned S3 buckets.
Remove placeholder version number I used when implementing the feature in
aws.

This commit was sponsored by Ethan Aubin.
2020-09-14 18:37:49 -04:00
Joey Hess
1a785d05c0
releasing package git-annex version 8.20200908 2020-09-08 14:20:47 -04:00
Joey Hess
dcaa1c1cc9
reorder 2020-09-08 12:54:17 -04:00
Joey Hess
6ea511beb4
Removed the S3 and WebDAV build flags
So these special remotes are always supported.

IIRC these build flags were added because the dep chains were a bit too
long, or perhaps because the libraries were not available in Debian stable,
or something like that. That was long ago, those reasons no longer apply,
and users get confused when builtin special remotes are not available, so
it seems best to remove the build flags now.

If this does cause a problem it can be reverted of course.

This commit was sponsored by Jochen Bartl on Patreon.
2020-09-08 12:42:59 -04:00
Joey Hess
62372ee052
resolvemerge: Improve cleanup of cruft left in the working tree by a conflicted merge
This commit was sponsored by Jake Vosloo on Patreon.
2020-09-07 16:50:27 -04:00
Joey Hess
d120c73302
sync, assistant: When merge.directoryRenames is not set, default it to "false"
Works better with automatic merge conflict resolution than git's usual
default of "conflict".

This is not done when automatic merge conflict resolution is disabled.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-09-07 13:50:58 -04:00
Joey Hess
69053a93a2
resolvemerge: Improve cleanup of files that were deleted by one side of a conflicted merge, and modified by the other side
This case was handled by cleanConflictCruft, but only when the annexed
file's object was present. When not present, it left the annexed file
with the original name, not checked into git, while adding the variant
file. So, add an explicit deletion of the deleted file in this case.

My specific case where this happened actually involves
merge.directoryRenames=conflict. After a merge involving that,
the situation was that the file appears as "added by them", because that
caused the file that they added to be moved into a directory we renamed.

That case is the same as them adding a modified version of the file,
while we deleted it. (Except for the history of the file, since it's a
new file, but this doesn't look at history.)

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-09-07 12:25:57 -04:00
Joey Hess
e36bae74da
Exposed annex.forward-retry git config
One reason is, 5 is an arbitrary number so ought to be configurable.

The real reason though, is I wanted to make the man page explain when
forward retry can override annex.retry, and having a config made the
man page easier to write.
2020-09-04 15:16:40 -04:00
Joey Hess
2bb933eb60
import: Retry downloads that fail
Also, using the transfer machinery for this makes eg, git-annex info show
in-progress imports, and makes --notify-start/finish work.
2020-09-04 13:54:05 -04:00
Joey Hess
46eb48d7c0
Retry transfers to exporttree=yes remotes same as for other remotes
The comment about noRetry is not well-justified, because transfers to many
remotes cannot be resumed, but retries are still allowed for those.
2020-09-04 13:24:08 -04:00
Joey Hess
1d244bafbd
Limit retrying of failed transfers when forward progress is being made to 5
To avoid some unusual edge cases where too much retrying could result in
far more data transfer than makes sense.
2020-09-04 12:46:37 -04:00
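One way to picture a cap on forward-progress retries is the sketch below. It is an illustration only: the way the two limits interact here is an assumption, and the names `attempt`, `retries`, and `forward_retries` are hypothetical rather than taken from git-annex.

```python
def transfer_with_retries(attempt, retries=0, forward_retries=5):
    """Run attempt() until it succeeds or both retry budgets are spent.

    attempt() returns (ok, bytes_done), where bytes_done is cumulative progress.
    """
    plain_left = retries
    forward_left = forward_retries
    best = 0
    while True:
        ok, done = attempt()
        if ok:
            return True
        if done > best and forward_left > 0:
            # The failed attempt still moved forward, so retry without
            # spending the ordinary budget, but only a limited number of times.
            best = done
            forward_left -= 1
        elif plain_left > 0:
            best = max(best, done)
            plain_left -= 1
        else:
            return False
```

With the defaults shown, a transfer that always advances but never completes is attempted six times in total: the first try plus five forward retries.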
Joey Hess
6e9a4f50f3
make viaTmp honor umask
Fixed several cases where files were created without file mode bits that
the umask would usually set. This included exports to the directory special
remote, torrent files used by the bittorrent special remote, hooks written
by git-annex init, and some log files in .git/annex/

Audited all calls, looking for ones that didn't want the umask bits to be
set. All such turned out to already set the specific restrictive file mode
they wanted.
2020-09-02 14:54:07 -04:00
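The general problem can be sketched as follows, assuming a POSIX system: a temp file is typically created with a restrictive mode, so a write-then-rename helper has to widen the mode to what the umask permits. `write_via_tmp` and `current_umask` are hypothetical names, not the Haskell viaTmp implementation.

```python
import os
import tempfile

def current_umask():
    # os.umask sets a new mask and returns the old one, so restore it at once.
    # This read is not thread safe; a real program would cache the value.
    mask = os.umask(0)
    os.umask(mask)
    return mask

def write_via_tmp(dest, data):
    """Write data to a temp file beside dest, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # mkstemp creates the file with mode 0600; apply the usual umask bits.
        os.chmod(tmp, 0o666 & ~current_umask())
        os.replace(tmp, dest)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Callers that want a specific restrictive mode would set it explicitly after the write, which matches the audit result described above.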
Joey Hess
8656afd3e1
rename http special remote to httpalso
"http" was too generic and easy to confuse with web. The new name makes
clear it's used in addition to some other remote. And other protocols
can use the same naming scheme.
2020-09-02 10:41:53 -04:00
Joey Hess
571ec900ac
Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https.
With automatic layout learning!
2020-09-01 15:16:35 -04:00
Joey Hess
41ebed3941
Support git remotes where .git is a file, not a directory
Eg when --separate-git-dir was used, and core.symlinks=false.

This commit was sponsored by Brock Spratlen on Patreon.
2020-08-28 15:08:14 -04:00
Joey Hess
cde3e5eb0c
test: Stop gpg-agent daemons that are started for the test framework's gpg key
They normally shutdown when the GNUPGHOME directory is deleted, but on
NFS they keep the directory from being deleted. And also, this avoids
a number of them piling up while the test suite is running.
2020-08-28 14:28:42 -04:00
Joey Hess
b68f214312
Display a message when git-annex has to wait for a pid lock file held by another process 2020-08-26 13:05:34 -04:00
Joey Hess
7bdb0cdc0d
add gitAnnexChildProcess and use instead of incorrect use of runsGitAnnexChildProcess
Fixes reversion in 8.20200617 that made annex.pidlock being enabled result
in some commands stalling, particularly those needing to autoinit.

Renamed runsGitAnnexChildProcess to make clearer where it should be
used.

Arguably, it would be better to have a way to make any process git-annex
runs have the env var set. But then it would need to take the pid lock
when running any and all processes, and that would be a problem when
git-annex runs two processes concurrently. So, I'm left doing it ad-hoc
in places where git-annex really does run a child process, directly
or indirectly via a particular git command.
2020-08-25 14:57:49 -04:00
Joey Hess
6b0532e532
wording 2020-08-25 14:47:17 -04:00
Joey Hess
2ca1ff62dc
addurl --file youtube-dl reversion fix
addurl: Fix reversion in 7.20190322 that made --file not be honored when
youtube-dl was used to download media.

8758f9c561 was on the right track, but missed that | otherwise prevented
the code it added from being used.

Also, refactored out a common function.

This commit was sponsored by Graham Spencer on Patreon.
2020-08-25 12:56:45 -04:00
Joey Hess
27329f0bb1
stack.yaml: Updated to lts-16.10
Needs stack version 2.3 to build, which has only recently made it into
debian unstable.

This commit was sponsored by Jake Vosloo on Patreon.
2020-08-24 14:11:37 -04:00
Joey Hess
f241a3cd3d
Display warning when external special remote does not start up properly, or is not usable
I'm sure this used to work, but somewhere along the line something or
things (getCost and getAvailability I think, probably others)
started catching the exception and not displaying it. So, show warnings.
2020-08-14 15:38:31 -04:00
Joey Hess
05b2b46a82
async extension done 2020-08-14 15:24:34 -04:00
Joey Hess
020e588262
reorder 2020-08-10 16:18:35 -04:00
Joey Hess
bcbdada8bf
fixed 2020-08-10 13:12:55 -04:00
Joey Hess
506ffea5e6
stop symlink check once the top of the working tree is reached
Avoid complaining that a file "is beyond a symbolic link" when the
filepath is absolute and the symlink in question is not actually inside the
git repository.

This assumes that inodes remain stable while the command is running.
I think they always will, the filesystems where they are unstable change
them across mounts. (If inodes were not stable, it would just complain about
symlinks in the path that are not inside the working tree.)

(On windows, I don't want to assume anything about inodes, they could be
random numbers for all I know. But if they were, this would still be ok, as
long as windows doesn't have symlinks that are detected by isSymbolicLink.
Which seems a fair bet.)
2020-08-06 20:14:30 -04:00
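The idea of walking up from the file and stopping at the top of the working tree, identified by inode, can be sketched as below. This assumes POSIX semantics and that every directory on the path exists; `beyond_symlink` and `same_inode` are hypothetical helper names.

```python
import os

def same_inode(a, b):
    # Compare device and inode, following symlinks, to see if two paths
    # refer to the same directory.
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)

def beyond_symlink(path, top):
    """True if a directory between `top` and `path` is a symbolic link."""
    current = os.path.dirname(os.path.abspath(path))
    while True:
        if same_inode(current, top):
            return False  # reached the top of the working tree, so stop
        if os.path.islink(current):
            return True
        parent = os.path.dirname(current)
        if parent == current:
            return False  # reached the filesystem root without finding top
        current = parent
```

Symlinks above the top of the working tree are never inspected, so an absolute path that reaches the repository through an outside symlink is not reported.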
Joey Hess
283d2f85d1
importfeed: Fix reversion that caused some '.' in filenames to be replaced with '_'
sanitizeFilePath was changed to sanitize leading '.', but ImportFeed was
running it on parts of the template. So eg the leading '.' in the extension
got sanitized.

Note the added case for sanitizeLeadingFilePathCharacter ('/':_)
-- this was added because, if the template is title/episode and the title
is not set, it would expand to "/episode". So this is another potential
security fix.
2020-08-05 11:35:00 -04:00
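The difference between sanitizing template parts and sanitizing the expanded name can be shown with a small sketch. The replacement rule and the function names here are illustrative assumptions, not the actual sanitizeFilePath logic.

```python
def sanitize_leading(name):
    """Replace a leading '.' or '/' with '_' (illustrative rule only)."""
    if name[:1] in (".", "/"):
        return "_" + name[1:]
    return name

def expand_per_part(parts):
    # Problematic shape: sanitizing each piece mangles the extension's '.'.
    return "".join(sanitize_leading(p) for p in parts)

def expand_whole(parts):
    # Safer shape: join first, then sanitize only the true leading character.
    return sanitize_leading("".join(parts))
```

With parts `["episode", ".mp3"]` the per-part version yields `episode_mp3`, while the whole-name version keeps `episode.mp3`; an expansion such as `/episode` from an empty title still has its leading '/' replaced.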
Joey Hess
c4ec52b9ae
Slightly sped up the linux standalone bundle
Reduce the number of directories listed in libdirs, which makes the linker
check far fewer dead ends looking for directories.

Eliminated some directories that didn't really contain shared libraries,
or only contained the linker.

That left only 2, one in lib and one in usr/lib, so consolidate those two.

Doing it this way, rather than just consolidating all libs that might exist
into a single directory means that, if there are optimised versions of some
libs, eg in lib/subarch/foo.so, and lib/subarch2/foo.so, they don't get
moved around in a way that would make the linker pick the wrong one.
2020-07-31 14:42:03 -04:00
Joey Hess
049807dbba
external backends implemented 2020-07-29 17:24:34 -04:00
Joey Hess
00c5f04f20
Deal with unusual IFS settings in the shell scripts for linux standalone and OSX app.
Thanks, Yaroslav Halchenko
2020-07-24 14:46:50 -04:00
Joey Hess
79187a6eaf
Revert "Unset IFS in shell scripts in the linux standalone build and OSX app."
This reverts commit 24125e8dc4.

yoh has a better patch I see
2020-07-24 14:33:13 -04:00
Joey Hess
24125e8dc4
Unset IFS in shell scripts in the linux standalone build and OSX app. 2020-07-24 14:31:11 -04:00
Joey Hess
c5ea2e9d12
better benchmark for move/copy speedup 2020-07-24 13:34:12 -04:00
Joey Hess
18f1fb5841
drop performance improvements
Sped up seeking files to drop by 2x, and also some performance
improvements to checking numcopies.

Interestingly, the seek speedup is not due to precaching, but I think is
due to calling getParsed earlier.

Annex.Drop had to be changed to check inAnnex there, since it was removed
from Command.Drop. All other users of Command.Drop already checked inAnnex
themselves.

This commit was sponsored by Ryan Newton on Patreon.
2020-07-24 13:27:46 -04:00
Joey Hess
d732ef1a89
move, copy: Sped up seeking for annexed files to operate on by a factor of nearly 2x. 2020-07-24 12:56:02 -04:00
Joey Hess
00865cdae8
Fix a bug in find --branch in the previous version
inAnnex check was lost for that code path. To avoid more such mistakes,
made withKeyOptions check it when the AnnexedFileSeeker specifies.
2020-07-24 12:05:28 -04:00
Joey Hess
cb74cefde7
Fix a hang when using git-annex with an old openssh 7.2p2
Which had some weird inheriting of ssh FDs by sshd.

Bug was introduced in git-annex version 7.20200202.7.
2020-07-21 16:14:25 -04:00
Joey Hess
ac56a5c2a0
Fix a lock file descriptor leak that could occur when running commands like git-annex add with -J
Bug was introduced as part of a different FD leak fix in version 6.20160318.
2020-07-21 15:30:47 -04:00
Joey Hess
798fdad660
fix build with dlist-1.0
That removed the list function. This new implementation appears to
actually be more efficient anyway, since it avoids toList.
2020-07-21 12:58:51 -04:00
Joey Hess
1ccb6699a1
guidance on size and mtime fields 2020-07-20 19:56:47 -04:00
Joey Hess
abd56fb019
Fix a bug in find --batch in the previous version. 2020-07-20 19:50:53 -04:00
Joey Hess
af901d1366
releasing package git-annex version 8.20200720 2020-07-20 14:41:12 -04:00
Joey Hess
889603336a
fix reversion in skipping deleted files
And add a test case for that.

This certainly loses some of the 2x performance improvement in file
seeking that seekFilteredKeys led to, because now it has to stat the
worktree files again. Without benchmarking, I expect there will still be
a sizable improvement, and also the git-annex branch precaching that
seekFilteredKeys can do will still be a win of its approach.

Also worth noting that lookupKey, when the file does not exist, checks if it's in an
adjusted branch with hidden files, and if so, finds the key for the
file anyway. That was intended to make git-annex sync --content be able
to process those files, but a side effect was that, when a file was
deleted but the deletion not yet staged, git-annex commands used to
still list it. That was actually a bug. This commit fixes that bug too.
(git-annex sync --content on such a branch does not use seekFilteredKeys
so was not affected by the reversion or by this behavior change)

This commit was sponsored by Jake Vosloo on Patreon.
2020-07-19 21:25:01 -04:00
Joey Hess
7b2d236556
importfeed: stream metadata for 5% speedup
On top of the 10% speedup from streaming url logs.
2020-07-14 14:35:26 -04:00
Joey Hess
535cdc8d48
importfeed: Made checking known urls step around 10% faster.
This was a bit disappointing, I was hoping for a 2x speedup. But, I think
the metadata lookup is wasting a lot of time and also needs to be made to
stream.

The changes to catObjectStreamLsTree were benchmarked to not also speed
up --all around 3% more. Seems I managed to make it polymorphic after all.
2020-07-14 12:47:51 -04:00