Commit graph

1552 commits

Author SHA1 Message Date
Joey Hess
631c8d3e5b
avoid redundant adjusted branch update in sync
sync still does update it if the config would otherwise not, since it
already did.
2020-11-16 15:13:48 -04:00
Joey Hess
805af01562
bug fix
really innefficient but it does solve dropping
2020-11-16 14:57:51 -04:00
Joey Hess
557a6e11a6
avoid spurious blank line when updating adjusted branch
git checkout run with --quiet should have no output
2020-11-16 14:41:38 -04:00
Joey Hess
0896038ba7
annex.adjustedbranchrefresh
Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to one file, then eg, `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey likewise (so sends over ssh from old versions
  of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
2020-11-16 14:27:28 -04:00
Joey Hess
af6af35228
split out Annex.Content.Presence
This will let a module that Annex.Content imports use inAnnex.
Unsure yet if I will need that, but this split still seems to make
sense, and Annex.Content was way too long so splitting it is good.
2020-11-16 11:24:57 -04:00
Joey Hess
ccfa9b2dc4
make sync update --unlock-present branch 2020-11-13 15:04:34 -04:00
Joey Hess
e66b7d2e1b
rename to --unlock-present and better reverse adjusting
An --unlock-present branch reverses back to a branch where
all files that get modified or renamed become locked, even if they were
originally unlocked. This is the same that reversing a --unlock branch
works, and the new name makes that commonality more clear.
2020-11-13 14:56:43 -04:00
Joey Hess
c8e49c5ef5
git-annex adjust --lock-missing
Like --hide-missing the branch does not get updated when content
availability changes.

Seems to basically work, but sync does not update it yet.

Also, when a file is present and so unlocked, git mv followed by
git-annex sync results in the basis branch being updated to contain the
file with the new name, unlocked. This seems different than what
happens in an adjusted unlocked branch, where the commit propigates back
locked. Probably the reverse adjustment code needs to be improved to
handle this case.
2020-11-13 13:39:44 -04:00
Joey Hess
b1eb47599a
move old direct mode stuff out of Annex.Locations 2020-11-12 12:40:35 -04:00
Joey Hess
92b7b1964d
add warning on add of annex link
Warn when adding a annex symlink or pointer file that uses a key that is
not known to the repository, to prevent confusion if the user has copied it
from some other repository.

This commit was sponsored by Jake Vosloo on Patreon.
2020-11-10 12:10:51 -04:00
Joey Hess
885974be99
add newtypes for QuickCheck to avoid LANG=C issues
All properties changed to use them, except for
prop_encode_c_decode_c_roundtrip, which already filtered to ascii
for other reasons.

A few modules had to be split out, because Setup does not build-depend
on QuickCheck.
2020-11-09 20:21:18 -04:00
Joey Hess
d032b0885d
use MatchingKey when a Key is known
This fixes a bug where a file that was not preferred content could be
transferred to a remote. This happened when the file got deleted after
the sync started running.

The only time checkMatcher is run without a Key is in calls to
checkFileMatcher, which are only done by add, addurl, import, and
smudge --clean. Those won't be affected by this kind of race. Anything
else that might be precaching and have a similar race as sync will also
be fixed, but I don't know if it actually affected anything other than
sync.

As well as fixing a bug, this also probably makes sync and --auto faster
by avoiding the redundant key lookup.

This commit was sponsored by Graham Spencer on Patreon.
2020-11-09 15:17:22 -04:00
Joey Hess
907a0bcad6
avoid providing filename with NUL to quickcheck properties
instance Arbitrary [Char] allows that, and it's not a legal part of a
filename so can break processing them.

Noticed when prop_view_roundtrips failed.

The instance Arbitrary AssociatedFile avoids this problem.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-11-06 15:15:33 -04:00
Joey Hess
1db49497e0
finished this stage of the RawFilePath conversion
This commit was sponsored by Denis Dzyubenko on Patreon.
2020-11-06 14:10:58 -04:00
Joey Hess
2c8cf06e75
more RawFilePath conversion
Converted file mode setting to it, and follow-on changes.

Compiles up through 369/646.

This commit was sponsored by Ethan Aubin.
2020-11-05 18:45:37 -04:00
Joey Hess
9b0dde834e
convert getFileSize to RawFilePath
Lots of nice wins from this in avoiding unncessary work, and I think
nothing got slower.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-11-05 11:32:57 -04:00
Joey Hess
f9fc26f05a
Merge branch 'master' into rawfilepath 2020-11-04 14:21:44 -04:00
Joey Hess
5a1e73617d
finished this stage of the RawFilePath conversion
Finally compiles again, and test suite passes.

This commit was sponsored by Brock Spratlen on Patreon.
2020-11-04 14:20:37 -04:00
Joey Hess
4bcb4030a5
more RawFilePath conversion
580/645

This commit was sponsored by Jack Hill on Patreon.
2020-11-03 18:34:27 -04:00
Joey Hess
78178d4c33
clean build warning 2020-11-03 11:36:48 -04:00
Joey Hess
eb42cd4d46
more RawFilePath conversion
535/645

This commit was sponsored by Brett Eisenberg on Patreon.
2020-11-03 10:11:04 -04:00
Joey Hess
9252f86b2e
view: Fix a reversion in 8.20200522 that broke entering or changing views.
Commit 2dc7b5186a messed up indentation.

This commit was sponsored by Noam Kremen on Patreon.
2020-11-02 14:47:08 -04:00
Joey Hess
c41be0d3bd
Merge branch 'master' into rawfilepath 2020-11-02 14:35:46 -04:00
Joey Hess
7245a9ed53
Improve shutdown process for external special remotes and external backends
Make sure to relay any remaining stderr from the process after it has
shut down, rather than closing stderr just before shutdown. This avoids
a situation where the process is still running and tries to write to
stderr, getting a SIGPIPE. And, it ensures that no stderr output is
lost.

This may fix a problem encountered by datalad on windows, where it hangs
during the external special remote shutdown.

Before commit a49d300545, it closed stdin
and stdout, but left stderr open, and never killed the stderr waiter
thread, which presumably exited on its own. For async exception
safety, do need to at make sure that thread gets waited on, as that
commit does, but it introduced this problem.

Note that, the process's stdout is closed before waiting on it. It's too
late for anything it writes to stdout to be processed, and since we're
not going to consume any such writes, this avoids the process getting
blocked writing to stdout due to us not reading what it's buffered. This
does mean that if the process writes to stdout too late, it will get a
SIGPIPE. (This was already the case before the above-mentioned commit.)
In practice, I think only the protocol's ERROR is allowed to be
sent at a point where this could happen.
2020-11-02 12:56:35 -04:00
Joey Hess
87f91ce563
more RawFilePath conversion
451/645
2020-10-30 15:55:59 -04:00
Joey Hess
b4b02e4c61
more RawFilePath conversion
412/645
2020-10-30 13:31:35 -04:00
Joey Hess
ca80c3154c
more RawFilePath conversion
removeFile changed to removeLink, because AFAICS it should be fine to
remove non-file things here. In particular, it's fine to remove a
symlink, since we're about to write a symlink. (removeLink does not
remove directories, so file, symlink, and unix socket are the only
possibilities.)
2020-10-30 13:07:41 -04:00
Joey Hess
681b44236a
more RawFilePath conversion
at 377/645

This commit was sponsored by Svenne Krap on Patreon.
2020-10-29 14:20:57 -04:00
Joey Hess
f45ad178cb
more RawFilePath conversion
At 318/645 after 4k lines of changes

This commit was sponsored by Jake Vosloo on Patreon.
2020-10-29 12:03:50 -04:00
Joey Hess
e505c03bcc
more RawFilePath conversion
nukeFile replaced with removeWhenExistsWith removeLink, which allows
using RawFilePath. Utility.Directory cannot use RawFilePath since setup
does not depend on posix.

This commit was sponsored by Graham Spencer on Patreon.
2020-10-29 10:50:29 -04:00
Joey Hess
8d66f7ba0f
more RawFilePath conversion
Added a RawFilePath createDirectory and kept making stuff build.

Up to 296/645

This commit was sponsored by Mark Reidenbach on Patreon.
2020-10-28 17:25:59 -04:00
Joey Hess
b8bd2e45e3
more RawFilePath conversion
Notable wins in Annex.Locations which was sometimes doing 6 conversions
in a single function call.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-10-28 16:24:14 -04:00
Joey Hess
6c29817748
RawFilePath version of getCurrentDirectory
This commit was sponsored by Jochen Bartl on Patreon
2020-10-28 16:03:45 -04:00
Joey Hess
64e7bac810
view: Avoid using ':' from metadata when generating a view
Because it's a special character on Windows ("c:").

Use same technique already used for '/' and '\'.

I didn't record how I generated their encoded forms before, so am sure
there was a better way, but the way I did it now is to look at

	ghci> encodeFilePath "∕"
	"\226\136\149"

And then the difference from that to "\56546\56456\56469"
is adding 56320 to each, to get up to the escaped code plane.

See comment for why I think handling ':' is ok, but that other illegal
windows filenames won't. Note that, this should be enough to make the
test suite always work. Other windows illegal filenames will fail at
checkout time when it tries to put the illegal filename on the
filesystem.
2020-10-26 15:38:08 -04:00
Joey Hess
0133b7e5a8
move: Improve resuming a move that was interrupted after the object was transferred
In cases where numcopies checks prevented the resumed move from dropping
the object from the source repository, it now relies on a log of recent
moves to replicate the behavior of the interrupted command.

Performance: Probably noticable impact, since it has to add to the log,
check the log, and remove from the log. Seems worth it to avoid this
annoying edge case. The log functions are pretty well optimised to avoid
unncessary work.

An performance improvement to make later would be to avoid cleanup doing
anything if it's not written to the log file, and has confirmed that the
log file does not contain the log line.

This commit was sponsored by Jake Vosloo on Patreon.
2020-10-21 10:31:56 -04:00
Joey Hess
62d630272e
improve name 2020-10-20 15:06:55 -04:00
Joey Hess
7036d0a4c1
add, import: Fix a reversion in 7.20191009 that broke handling of --largerthan and --smallerthan
This commit was sponsored by Jochen Bartl on Patreon.
2020-10-19 15:36:18 -04:00
Joey Hess
c3e5417c17
don't try to remove pre-commit-annex and post-update-annex-hooks
Those are not installed by git-annex but by the user, and so removal
will never find the default content, and so if the user did install
them, it would display a misleading message.

Seems better, since the user installed them, to let the user remove them
if they want to.
2020-10-19 13:13:49 -04:00
Joey Hess
20f86e43f7
Fix a build failure on Windows. 2020-10-07 12:04:54 -04:00
Joey Hess
41271e4eb4
avoid git check-ignore overhead on importing known files
isKnownImportLocation does a database lookup and there's an index
to make that lookup fast, so it's probably faster than talking to git
check-ignore. Checking the matcher is faster still.

While before the gitignore check was added it did not need to always
check isknown, now it does, because it's that or the more expensive
notignored. But at least we can skip notignored when a file is known,
which will often be the common case: Importing from a remote that's been
exported to, and/or imported from before, only new files will not be
known, so only those will need to check notignored.

At first, I had this:
	(matches <&&> (isknown <||> notignored)) <||> isknown
Notice that checks isknown every time, whether it matches or not.

So, it's no slower to instead do this:
	isknown <||> (matches <&&> notignored)
That has the benefit that, when it's known, it doesn't need to run
matches, which while faster than isknown, is still going to use some CPU.

And it perhaps more clearly expresses the condition: Any known file is
wanted, otherwise it's down to what matches and is not ignored.

This commit was sponsored by Jack Hill on Patren.
2020-09-30 11:20:44 -04:00
Joey Hess
c56efbbdb6
import: Check gitignores when importing trees from special remotes
It seemed best to do this, for consistency with every other way files can
get into a git-annex repo. Although it's just a bit strange that a local
.gitignore file affects the pseudo-commits made for the remote that's
imported from.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-09-30 10:41:59 -04:00
Joey Hess
0033e08193
avoid a second traversal of the ImportableContents
Do all filtering in one pass.
2020-09-30 10:10:03 -04:00
Joey Hess
4c32499e82
Parse youtube-dl progress output
Which lets progress be displayed when doing concurrent downloads.
Amoung other things, like --json-progress etc.

The youtube-dl output is no longer displayed, except for any errors.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-29 17:53:48 -04:00
Joey Hess
dc274a6804
fix inverted logic in recent commit 2020-09-29 12:11:50 -04:00
Joey Hess
658ea7ca3c
sync --no-content import from directory special remote
sync: When run without --content, import without copying from
importtree=yes directory special remotes. (Other special remotes may
support this later as well.)

This commit was sponsored by Svenne Krap on Patreon.
2020-09-28 15:29:08 -04:00
Joey Hess
3eaaec3113
consistently use importKey when available
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.

Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.

Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.

This commit was sponsored by Noam Kremen on Patreon.
2020-09-28 15:27:46 -04:00
Joey Hess
15c1ee16d9
import --no-content: Check annex.largefiles
Import small files into git, the same as is done when importing with content.
Which means, for small files, --no-content does download them.

If the largefiles expression needs the file content available
(due to mimetype or mimeencoding being used), the import will fail.

This commit was sponsored by Jake Vosloo on Patreon.
2020-09-28 13:28:57 -04:00
Joey Hess
8b74f01a26
split ProvidedInfo and UserProvidedInfo
The latter is for git-annex matchexpression and matching against it can
throw an exception. Splitting out the former reduces the potential for
mistakes and avoids needing to worry about matching against that
throwing an exception.

This is more groundwork for matching largefiles while importing,
without downloading content.

This commit was sponsored by Graham Spencer on Patreon.
2020-09-28 12:12:38 -04:00
Joey Hess
00dbe35fbc
allow matching on files whose content is not present
Anything that needs to examine the file content will fail to match,
or fall back to other available information. But the intent is that the
matcher be checked for matchNeedsFileContent and only be used if it does
not, so the exact behavior doesn't much matter as it should never
happen.

The real point of this is to not need to provide a dummy content file
when matching.

This commit was sponsored by Martin D on Patreon.
2020-09-28 11:17:46 -04:00
Joey Hess
3e577a6dd3
remove reapZombies
Believed to be no longer needed as I've squashed the last ones.

Note that, in Test.Framework, I can see no reason for the code to have
run it twice. It does not cause running processes to exit after all,
so any process that has leaked and is running and causing problems with
cleanup of the directory won't be helped by running it.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-09-25 11:50:38 -04:00