Avoid creating transfer info file before transfer lock is created and
locked.
The wrong order for one thing caused transfer info to be overwritten
when a transfer was already in progress.
But worse, it caused checkTransfer to see the transfer info,
and so lock the transfer lock in order to verify the transfer was not in
progress. Which in a concurrent situation, prevented the transferrer
from locking the transfer lock, so it failed with "transfer already in
progress".
Note that the transferinfo command does not lock the transfer lock
before creating the transfer info. But, that's only run after
recvkey is running, and recvkey does lock the transfer lock, so that
seems more or less ok. (Other than being a super complicated legacy mess
that the P2P code has mostly obsoleted now.)
This commit was supported by the NSF-funded DataLad project.
There are a lot of different variants and sizes, I suppose we might as well
export all the common ones.
Bump dep to cryptonite to 0.16, earlier versions lacked BLAKE2 support.
Even android has 0.16 or newer.
On Debian, Blake2bp_512 is buggy, so I have omitted it for now.
http://bugs.debian.org/892855
This commit was sponsored by andrea rota.
When resuming a download and not using a rolling checksummer like rsync,
the partial file we start with might contain garbage, in the case where a
file changed as it was being downloaded. So, disabling verification on
resumes risked a bad object being put into the annex.
Even downloads with rsync are currently affected. It didn't seem worth the
added complexity to special case those to prevent verification, especially
since git-annex is using rsync less often now.
This commit was sponsored by Brock Spratlen on Patreon.
When git-annex-shell p2pstdio fails with 255, it's because the ssh
server is not reachable. Avoid running the fallback action in this case,
since it would just try a second time to connect, and presumably fail.
Note that the closed P2PSshConnection will not be stored in the pool,
so the next request tries again to connect. This is just the right
behavior; when the remote becomes reachable again, the same git-annex
process will start using it.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Note that, due to not using rsync to transfer files to ssh remotes
any longer, permissions and other file metadata of annexed files
will no longer be preserved when copying them to ssh remotes.
Other remotes never supported preserving that information, so
this is not considered a regression. Added NEWS item about this.
Another significant side effect of this is that, even when rsync is run to
retrieve a file, its progress display will no longer be shown, and
instead the native git-annex progress display will appear. It would be
possible to use the rsync process display when rsync is used (old
git-annex-shell and also retrieval from a local repository), but it
would have complicated the code unncessarily, and been inconsistent
behavior.
(I'd been thinking for a while about eliminating the rsync progress
display, since it's got some annoying verbosities, including display of
the key and the "(xfr#1, to-chk=0/1)" bit and was already somewhat
inconsistent.)
retrieveKeyFileCheap still uses rsync, since that ensures that it gets
the actual file content from the remote. Using the P2P protocol would
use the local content, as long as the local and remote size are the
same.
This commit was sponsored by John Pellman on Patreon.
Remote/Git.hs now contains AGPL licensed code, thus the license
of git-annex as a whole is AGPL. This was already the case when git-annex
was built with the webapp enabled.
The AGPL license will apply to all code added to Remote/Git.hs in the
future, which is going to include support for using
`git-annex-shell p2pstdio`.
Not yet used by git-annex, but this will allow faster transfers etc than
using individual ssh connections and rsync.
Not called git-annex-shell p2p, because git-annex p2p does something
else and I don't want two subcommands with the same name between the two
for sanity reasons.
This commit was sponsored by Øyvind Andersen Holm.
lockContentShared had a screwy caveat that it didn't verify that the content
was present when locking it, but in the most common case, eg indirect mode,
it failed to lock when the content is not present.
That led to a few callers forgetting to check inAnnex when using it,
but the potential data loss was unlikely to be noticed because it only
affected direct mode I think.
Fix data loss bug when the local repository uses direct mode, and a
locally modified file is dropped from a remote repsitory. The bug
caused the modified file to be counted as a copy of the original file.
(This is not a severe bug because in such a situation, dropping
from the remote and then modifying the file is allowed and has the same
end result.)
And, in content locking over tor, when the remote repository is
in direct mode, it neglected to check that the content was actually
present when locking it. This could cause git annex drop to remove
the only copy of a file when it thought the tor remote had a copy.
So, make lockContentShared do its own inAnnex check. This could perhaps
be optimised for direct mode, to avoid the check then, since locking
the content necessarily verifies it exists there, but I have not bothered
with that.
This commit was sponsored by Jeff Goeke-Smith on Patreon.
Do not treat parts of the filename that contain punctuation or other
non-alphanumeric characters as extensions. Before, such characters were
filtered out.
Note that in 45308ec78b "foo.ba__________r"
was munged to ".bar" and so incorrectly treated as an extension. That was
fixed by changing the filter order, but not allowing punctuation seems a
better fix.
This assumes that extensions containing punctuation are rare. "_" seems the
most likely character; I used it in ikiwiki "._comment" files. But I can't
recall seeing it anywhere else. It certianly seems that no commonly used
extensions contain punctuation. If git-annex doesn't treat "._comment"
as an extension, it's not likely to break software that expects to see that
extension like some software expects to see .epub or .mp3.
This commit was sponsored by Jack Hill on Patreon.
Prevent ghc and llc from running out of memory when optimising some
files.
Sean Whitton reported that doing this only in Test.hs was insufficient,
the build still OOMed by the time it got to Test.hs. He had earlier found
the build worked when these options are applied globally.
See https://ghc.haskell.org/trac/ghc/ticket/14821 for why it needs -O1;
once that's fixed it may suffice to use "GHC-Options: -O2 -optlo-O2",
although it may also be that the -O1 prevents ghc from using/leaking
as much memory.
os(arm) should match armel, armhf, armeb, and arm.
It probably also matches arm64, somewhat unfortunately since arm64
systems probably tend to have more memory. See list of arches in
https://hackage.haskell.org/package/Cabal-1.22.2.0/docs/src/Distribution-System.html
This commit was sponsored by Henrik Riomar on Patreon.
Renaming is not supported; it might be possible to use --fuzzy to get rsync
to notice the file is being renamed, but that is a bit ..fuzzy.
On the other hand, interrupted transfers of an exported file are resumed,
since rsync is great at that. Had to adjust the exporttree docs, which
said interrupted transfers would restart.
Note that remove no longer makes the empty directory dummy, instead
sending the top-level empty directory. This works just as well and I
noticed the dummy was unncessary when refactoring it into removeGeneric.
Verified that behavior of remove is not changed, and git annex
testremote does pass.
This commit was sponsored by Brock Spratlen on Patreon.
Makefile: Remove chrpath workaround for bug in cabal, which is no longer
needed.
https://github.com/haskell/cabal/issues/2717 says it uses RUNPATH instead
of RPATH now, but I don't even see that for statically linked libraries;
the bug with that appears to be fixed.
cabal-install version 1.24.0.2
compiled using version 1.24.2.0 of the Cabal library
I left the rpath removal using otool on OSX because those straight up
broke the linker, and I don't know if the OSX autobuilder is updated to
a new enough cabal to not need it.
This commit was sponsored by Ewen McNeill on Patreon.
sync: Fix bug that prevented pulling changes into direct mode repositories
that were committed to remotes using git commit rather than git-annex sync.
This commit was supported by the NSF-funded DataLad project.
tips/automatically_adding_metadata/pre-commit-annex: Fix to not silently
skip filenames containing non-ascii characters.
git diff-index defaults to munging non-ascii characters. Using -z makes
it not do that, and then we just change the nulls to newlines.
This commit was sponsored by Jochen Bartl on Patreon.
Added annex.merge-annex-branches config setting which can be used to
disable automatic merge of git-annex branches.
I wonder if git-annex merge/sync/assistant should disable this
setting? Not sure yet, so have not done so. May be that users will not set
it in git config, but pass it via -c to commands that need it.
Checking the config setting adds a very small overhead, but it's
only checked once per command so should be insignificant.
This commit was supported by the NSF-funded DataLad project.
Noticed while running this (which a user posted in a comment they deleted
for some reason):
git-annex importfeed https://vimeo.com/logiingimars/videos/rss
The filename that youtube-dl suggests included a subdirectory,
which didn't exist, so renaming to it failed.
This commit was sponsored by mo on Patreon.
Repositories that are upgraded from before that version to this
one will not break, but will just not see the benefit of the mergedrefs log
speeding things up, until one new ref gets merged in.
Added --json-error-messages option, which includes error messages in the
json output, rather than outputting them to stderr.
The actual rediretion of errors is not implemented yet, this is only
the docs and option plumbing.
This commit was supported by the NSF-funded DataLad project.
Fix behavior of --json-progress followed by --json, in which
the latter option disabled the former.
This commit was supported by the NSF-funded DataLad project.
The ghc options were found by Sean Whitton; the debian arm autobuilders
need those to build w/o OOM, and it seems to involve llvm using too much
memory to optimize Test.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
--json: When there are multiple lines of notes about a file, make the note
field multiline, rather than the old behavior of only including the last
line.
Using newlines in the note is perhaps not ideal, but upgrading it to an
array in this case would be an annoying inconsistency to need to deal with.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Merged from Debian.
I think what this actually deals with is the case where gpg is installed,
but gpg-agent is not, since Utility.Gpg.stdParams enables --use-agent
when GPG_BATCH is set, and the test suite enables GPG_BATCH. So, test suite
will work with gpg not installed, or with both gpg and gpg-agent installed,
but not with only gpg.
For this reason, I've also put in an explicit dep on gnupg, although
dpkg-dev recommends it and all debian package builds tend to have it
available implicitly.
Allows using new special remote messages when git-annex supports them,
and avoiding using them when git-annex is too old. The new INFO is one
such message.
There's also the possibility, currently unused, for the special remote's
reply to include some kind of extensions of its own.
Merging this is blocked by https://github.com/datalad/datalad/issues/2124
since it seems it will break datalad. I checked all the other special
remotes and they will be ok.
This commit was supported by the NSF-funded DataLad project.
It's left up to the special remote to detect when git-annex is new enough
to support the message; an old git-annex will blow up.
This commit was supported by the NSF-funded DataLad project.
Added remote.<name>.annex-checkuuid config, which can be set to false to
disable the default checking of the uuid of remotes that point to
directories. This can be useful to avoid unncessary drive spin-ups and
automounting.
Note that the UUID check is still done before writing to the repository,
to avoid writing to the wrong repository if it got relocated. Check is
also done before checkPresent to avoid getting confused about what is in
which repo. This is effectively the same as the use of git-annex-shell
with a uuid to check that the remote repository is the expected one.
Did not bother with the check for retrieveKeyFile because it doesn't
matter if the wrong repo is used then.
This commit was sponsored by Trenton Cronholm on Patreon.
And for tab completion, by not unnessessarily statting paths to remotes,
which used to cause eg, spin-up of removable drives.
Got rid of the remotes member of Git.Repo. This was a bit painful.
Remote.Git modifies the list of remotes as it reads their configs,
so still need a persistent list of remotes. So, put it in as
Annex.gitremotes. It's only populated by getGitRemotes, so commands
like examinekey that don't care about remotes won't do so.
This commit was sponsored by Jake Vosloo on Patreon.
git grep writeFile finds some more that might also be problems, but
for now I've concentrated on .git/annex/ log files. There are certianly
cases where writeFile is not a problem too.
This commit was sponsored by mo on Patreon.
Fourth or fifth try at this and finally found a way to make it work.
Absurd amount of busy-work forced on me by change in cabal's behavior.
Split up Utility modules that need posix stuff out of ones used by
Setup. Various other hacks around inability for Setup to use anything
that ifdefs a use of unix.
Probably lost a full day of my life to this.
This is how build systems make their users hate them. Just saying.
And also now in non-fast mode, since it was just changed to query for the
filename separately.
And avoid processTranscript which mixed up stdout and stderr and could have
led to weirdness if there were warnings that didn't get suppressed.
addurl: When the file youtube-dl will download is already an annexed file,
don't download it again and fail to overwrite it, instead just do nothing,
like it used to when quvi was used.
This commit was sponsored by Anthony DeRobertis on Patreon.
This reverts commit 51228c2306.
No, still doesn't work when built with cabal. It did with stack; stack
must somehow make the unix package implicitly available.
With cabal, System.Posix.Process and System.Posix.Env are both missing.
Seems I had all the work in past commits to make this build, at least on
linux. I'm actually surprised it does, without a unix dep, Utility.Env
still builds ok somehow despite using System.Posix.Env.
This commit was sponsored by Fernando Jimenez on Patreon.
Chose to make this only handle files actively being downloaded, not temp
files for downloads that were interrupted or files that have been fully
downloaded.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Test suite is always included.
Building with this flag disabled has actually been broken for some time,
since Command.TestRemote uses tasty. Fewer build flags are better, so good
time to drop it.
This commit was sponsored by Thomas Hochstein on Patreon.