Ensure that checkCanAdd is used everywhere a file is added to git,
so git add is run with -f, presumably avoiding the work it would usually
do to check ignores.
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.
Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.
Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.
This commit was sponsored by Noam Kremen on Patreon.
Import small files into git, the same as is done when importing with content.
Which means, for small files, --no-content does download them.
If the largefiles expression needs the file content available
(due to mimetype or mimeencoding being used), the import will fail.
This commit was sponsored by Jake Vosloo on Patreon.
Sped up seeking for files to operate on, when using options like --copies
or --in, by around 20%.
Benchmark showed an increase for --copies from 155 seconds to 121
seconds, and --in remote will be similar to that.
For --in here, the speedup was less, 5-10% or so.
(both warm cache)
This commit was sponsored by Jack Hill on Patreon.
Sped up seeking to around twice as fast, by avoiding a pass over the
worktree files when preferred content expressions of the local repo and
remotes don't use include=/exclude=.
Thanks to Lukey for identifying the optimisation.
This commit was sponsored by Brock Spratlen on Patreon.
matchNeedsFileContent is not used yet, but shows how to add information
about terminals. That one would be needed for
https://git-annex.branchable.com/todo/sync_fast_import/
Note the tricky bit in Annex.FileMatcher.call where it folds over the
included matcher to propagate the information.
This commit was sponsored by Svenne Krap on Patreon.
getPid returns Nothing if the process has already been stopped, and in that
case, the pid will not be displayed. I think that would only happen if
waitForProcess or similar gets called more than once on the same process
handle though.
getPid on unix has an overhead of only a MVar read. On Windows it needs to
make a syscall, so will be probably more expensive. While the added expense
happens even when debug logging is disabled, it should be small enough
compared with the overhead of starting a process that it's not a problem.
(It does occur to me that a debugM that took an IO String could only run it
when debugging is really enabled, which would improve performance. It does
not seem possible to use the current hslogger interface to do that though;
it does not expose the information that would be needed.)
With some hints for the user for what to do.
Took care to avoid changing the json output. It would have been ok to add
the new separated lists to it, in addition to the old list, but I didn't
do that because I didn't see much point.
Also tested what happens if the other special remote has importtree=yes
and exporttree=yes, and in that case, download via httpalso works too,
without needing to implement any importtree methods here.
It might be possible to make it automatically set exporttree=yes if the
--sameas does. Didn't try, will probably be layering issues.
Or perhaps it should be inherited by sameas like some
other configs? But then, wouldn't it also make sense to inherit
importree=yes? But as shown here, it's not needed by this kind of
remote.
"http" was too generic and easy to confuse with web. The new name makes
clear it's used in addition to some other remote. And other protocols
can use the same naming scheme.