Commit graph

38201 commits

Author SHA1 Message Date
Joey Hess
41271e4eb4
avoid git check-ignore overhead on importing known files
isKnownImportLocation does a database lookup and there's an index
to make that lookup fast, so it's probably faster than talking to git
check-ignore. Checking the matcher is faster still.

While before the gitignore check was added it did not need to always
check isknown, now it does, because it's that or the more expensive
notignored. But at least we can skip notignored when a file is known,
which will often be the common case: Importing from a remote that's been
exported to, and/or imported from before, only new files will not be
known, so only those will need to check notignored.

At first, I had this:
	(matches <&&> (isknown <||> notignored)) <||> isknown
Notice that checks isknown every time, whether it matches or not.

So, it's no slower to instead do this:
	isknown <||> (matches <&&> notignored)
That has the benefit that, when it's known, it doesn't need to run
matches, which while faster than isknown, is still going to use some CPU.

And it perhaps more clearly expresses the condition: Any known file is
wanted, otherwise it's down to what matches and is not ignored.

This commit was sponsored by Jack Hill on Patren.
2020-09-30 11:20:44 -04:00
Joey Hess
c56efbbdb6
import: Check gitignores when importing trees from special remotes
It seemed best to do this, for consistency with every other way files can
get into a git-annex repo. Although it's just a bit strange that a local
.gitignore file affects the pseudo-commits made for the remote that's
imported from.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-09-30 10:41:59 -04:00
Joey Hess
0033e08193
avoid a second traversal of the ImportableContents
Do all filtering in one pass.
2020-09-30 10:10:03 -04:00
Joey Hess
a9128d4b45
typo 2020-09-30 09:40:06 -04:00
Joey Hess
4c32499e82
Parse youtube-dl progress output
Which lets progress be displayed when doing concurrent downloads.
Amoung other things, like --json-progress etc.

The youtube-dl output is no longer displayed, except for any errors.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-29 17:53:48 -04:00
Joey Hess
4c7335caf3
add a pointer for git-annex repos over http
So users are less likely to try to use httpalso for them, which is not
its purpose.
2020-09-29 13:59:51 -04:00
Joey Hess
084b502c7a
httpalso: Support being used with special remotes that do not have encryption= in their config. 2020-09-29 13:56:27 -04:00
Joey Hess
59263d2c6f
add import 2020-09-29 13:51:51 -04:00
Joey Hess
b2cf284d2a
upgrade: Avoid an upgrade failure of a bare repo in unusual circumstances 2020-09-29 13:45:14 -04:00
Joey Hess
aa06c69a6f
close 2020-09-29 13:26:48 -04:00
Joey Hess
8615d4b4b3
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-29 13:25:31 -04:00
Joey Hess
1610d94776
addurl: Avoid a redundant git ignores check for speed
Ensure that checkCanAdd is used everywhere a file is added to git,
so git add is run with -f, presumably avoiding the work it would usually
do to check ignores.
2020-09-29 13:00:41 -04:00
rto@914936d87a43105f8c5df5ae4787140e7eb5d846
42bd4f1110 2020-09-29 16:51:49 +00:00
Joey Hess
d10cbaa084
comment 2020-09-29 12:25:40 -04:00
Joey Hess
c80d71fad1
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-29 12:12:25 -04:00
Joey Hess
dc274a6804
fix inverted logic in recent commit 2020-09-29 12:11:50 -04:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
a3083436dc removed 2020-09-29 08:36:37 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
7b37f22880 Added a comment 2020-09-29 08:36:19 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
ff0784dad9 Added a comment 2020-09-29 08:32:22 +00:00
mark@6b90344cdab3158eacb94a3944460d138afc9bef
26842956b5 Added a comment 2020-09-28 21:40:23 +00:00
mark@6b90344cdab3158eacb94a3944460d138afc9bef
a8e4539d15 Added a comment 2020-09-28 21:10:58 +00:00
mark@6b90344cdab3158eacb94a3944460d138afc9bef
32cd9a44ca 2020-09-28 21:09:27 +00:00
yarikoptic
67144b1c46 Added a comment 2020-09-28 19:44:08 +00:00
Joey Hess
e44a132085
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-28 15:30:23 -04:00
Joey Hess
658ea7ca3c
sync --no-content import from directory special remote
sync: When run without --content, import without copying from
importtree=yes directory special remotes. (Other special remotes may
support this later as well.)

This commit was sponsored by Svenne Krap on Patreon.
2020-09-28 15:29:08 -04:00
Joey Hess
3eaaec3113
consistently use importKey when available
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.

Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.

Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.

This commit was sponsored by Noam Kremen on Patreon.
2020-09-28 15:27:46 -04:00
Joey Hess
15c1ee16d9
import --no-content: Check annex.largefiles
Import small files into git, the same as is done when importing with content.
Which means, for small files, --no-content does download them.

If the largefiles expression needs the file content available
(due to mimetype or mimeencoding being used), the import will fail.

This commit was sponsored by Jake Vosloo on Patreon.
2020-09-28 13:28:57 -04:00
Joey Hess
8b74f01a26
split ProvidedInfo and UserProvidedInfo
The latter is for git-annex matchexpression and matching against it can
throw an exception. Splitting out the former reduces the potential for
mistakes and avoids needing to worry about matching against that
throwing an exception.

This is more groundwork for matching largefiles while importing,
without downloading content.

This commit was sponsored by Graham Spencer on Patreon.
2020-09-28 12:12:38 -04:00
Joey Hess
00dbe35fbc
allow matching on files whose content is not present
Anything that needs to examine the file content will fail to match,
or fall back to other available information. But the intent is that the
matcher be checked for matchNeedsFileContent and only be used if it does
not, so the exact behavior doesn't much matter as it should never
happen.

The real point of this is to not need to provide a dummy content file
when matching.

This commit was sponsored by Martin D on Patreon.
2020-09-28 11:17:46 -04:00
Joey Hess
9e676f062f
split out todo 2020-09-28 10:40:13 -04:00
Joey Hess
1aec0fc6b9
close as unreproducible 2020-09-28 10:13:15 -04:00
Joey Hess
933097b327
moreinfo 2020-09-28 10:11:24 -04:00
Joey Hess
f324cfa9e7
close 2020-09-28 10:04:38 -04:00
mhauru
5aa85fa092 2020-09-27 14:03:12 +00:00
yarikoptic
9651c1da96 Added a comment 2020-09-27 13:42:18 +00:00
mhauru
20e1afc804 Added a comment: NeuroDebian seems to have stopped updating 2020-09-27 12:40:57 +00:00
Joey Hess
6a41a615b9
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-25 13:51:20 -04:00
Joey Hess
13f9c88123
add todo 2020-09-25 13:51:04 -04:00
Lukey
12eb7a3ceb Added a comment 2020-09-25 16:33:42 +00:00
Joey Hess
3e577a6dd3
remove reapZombies
Believed to be no longer needed as I've squashed the last ones.

Note that, in Test.Framework, I can see no reason for the code to have
run it twice. It does not cause running processes to exit after all,
so any process that has leaked and is running and causing problems with
cleanup of the directory won't be helped by running it.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-09-25 11:50:38 -04:00
Joey Hess
f624876dc2
remove zombie process in file seeking
This was the last one marked as a zombie. There might be others I don't
know about, but except for in the hypothetical case of a thread dying
due to an async exception before it can wait on a process it started, I
don't know of any.

It would probably be safe to remove the reapZombies now, but let's wait
and so that in its own commit in case it turns out to cause problems.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-09-25 11:38:42 -04:00
Joey Hess
5117ae8aec
fix build warning 2020-09-25 11:07:41 -04:00
Joey Hess
ca454c47f2
explicitly wait for a git process
Eliminate a zombie that was only cleaned up by the later zombie cleanup
code.

This is still not ideal, it would be cleaner if it used conduit or
something, and if the thread gets killed before waiting, it won't stop
the process.

Only remaining zombies are in CmdLine.Seek
2020-09-25 11:03:12 -04:00
Joey Hess
b5b1aeacba
devblog (for yesterday, forgot to add) 2020-09-25 10:56:07 -04:00
Joey Hess
d81f549385
fix some compile warnings left in yesterday
at least 2 could have caused a crash in some circumstances

This commit was sponsored by Brett Eisenberg on Patreon.
2020-09-25 10:55:39 -04:00
Joey Hess
ace02f41b0
seek: defer matcher check until more info is known
Sped up seeking for files to operate on, when using options like --copies
or --in, by around 20%.

Benchmark showed an increase for --copies from 155 seconds to 121
seconds, and --in remote will be similar to that.

For --in here, the speedup was less, 5-10% or so.

(both warm cache)

This commit was sponsored by Jack Hill on Patreon.
2020-09-24 17:59:12 -04:00
Joey Hess
c2d1d4e16e
close this
phibs who was seeing hangs confirmed they're gone on irc
2020-09-24 16:51:30 -04:00
Joey Hess
b7afcda887
fix some matchNeedsFileName values
matchMagic: Always False for MatchingKey. Unsure why.. Could be a bug?

limitUnused: Behaves differently when there is a filename.

limitSize: When used with LimitDiskFiles, checks the size on disk of the
filename.
2020-09-24 16:08:47 -04:00
Joey Hess
051e16a945
remove debug print 2020-09-24 15:37:39 -04:00
Joey Hess
b3af8a40f3
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-24 15:13:33 -04:00