Added remote.name.annex-security-allow-unverified-downloads, a per-remote
setting for annex.security.allow-unverified-downloads.
This commit was sponsored by Brock Spratlen on Patreon.
* init: Improve generated post-receive hook, so it won't fail when
run on a system whose git-annex is too old to support git-annex post-receive
* init: Update the post-receive hook when re-run in an existing repository.
This commit was sponsored by Jack Hill on Patreon.
This reverts commit b18fb1e343.
That broke support for old git-annex-shell before p2pstdio was added.
The immediate problem is that postAuth had a fallthrough case
that sent an error back to the peer, but sending an error back when the
connection is closed is surely not going to work.
But thinking about it some more, making every function that uses receiveMessage
need to handle ProtocolEOF adds a lot of complication, so I don't want
to do that.
The commit only cleaned up the test suite output a tiny bit, so I'm just
gonna revert it for now.
download is documented as displaying an error when download fails, but
it didn't when the url was not valid at all. That leads to confusing
behavior.
Also, display the url with --debug
Thus the ill-specified "edit with care" at the bottom of this page,
which basically means you need to build the man pages and check
that the markdown used was converted ok.
Also, this doc wiki uses relative links, not absolute links.
Also reworded slightly.
Added annex.maxextensionlength for use cases where extensions longer than 4
characters are needed.
This commit was sponsored by Henrik Riomar on Patreon.
Untested, on FreeBSD but enough to fix the listed build errors.
Seems that System.Posix.Files must have used to export this stuff and it
was split.
This commit was sponsored by Peter on Patreon.
Added -z option to git-annex commands that use --batch, useful for
supporting filenames containing newlines.
It only controls input to --batch, the output will still be line delimited
unless --json or etc is used to get some other output. While git often
makes -z affect both input and output, I don't like trying them together,
and making it affect output would have been a significant complication,
and also git-annex output is generally not intended to be machine parsed,
unless using --json or a format option.
Commands that take pairs like "file key" still separate them with a space
in --batch mode. All such commands take care to support filenames with
spaces when parsing that, so there was no need to change it, and it would
have needed significant changes to the batch machinery to separate tose
with a null.
To make fromkey and registerurl support -z, I had to give them a --batch
option. The implicit batch mode they enter when not provided with input
parameters does not support -z as that would have complicated option
parsing. Seemed better to move these toward using the same --batch as
everything else, though the implicit batch mode can still be used.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Work around git cat-file --batch's protocol not supporting newlines by
running git cat-file not batched and passing the filename as a
parameter.
Of course this is quite a lot less efficient, especially because it
currently runs it multiple times to query for different pieces of
information.
Also, it has subtly different behavior when the batch process was
started and then some changes were made, in which case the batch process
sees the old index but this workaround sees the current index. Since
that batch behavior is mostly a problem that affects the assistant and has
to be worked around in it, I think I can get away with this difference.
I don't know of any other problems with newlines in filenames, everything
else in git I can think of supports -z. And git-annex's json output
supports newlines in filenames so downstream parsers from git-annex will be ok.
git-annex commands that use --batch themselves don't support newlines
in input filenames; using --json --batch is currently a way around that
problem.
This commit was sponsored by Ewen McNeill on Patreon.
v6: When a file is unlocked but has not been modified, and the unlocking is
only staged, git-annex add did not lock it. Now it will, for consistency
with how modified files are handled and with v5.
Note the removal of the sameInodeCache check. Otherwise it would see
that the unmodified file is unmodified and stop there. That check seems to have
been copied from the direct mode branch. But, direct mode had a specific
reason to check for unmodified content, that does not apply to v6.
The second pass means there is potential for a race, eg the unlocked
file could be modified in between the first and second passes.
No problem with that, since both passes do the same thing.
This commit was sponsored by Jake Vosloo on Patreon.
* Don't use GIT_PREFIX when GIT_WORK_TREE=. because it seems git
does not intend GIT_WORK_TREE to be relative to GIT_PREFIX in that
case, despite GIT_WORK_TREE=.. being relative to GIT_PREFIX.
* Don't use GIT_PREFIX to fix up a relative GIT_DIR, because
git 2.11 sets GIT_PREFIX set to a path it's not relative to.
and apparently GIT_DIR is never relative to GIT_PREFIX.
Commit e50ed4ba48 led us down this path
by working around a git bug by relying on the barely documented GIT_PREFIX.
This commit was sponsored by Trenton Cronholm on Patreon.
Makes git annex whereis display the versionId urls.
And, when a s3 remote is enabled without creds, git-annex will use the
versionId urls to access its contents.
This commit was sponsored by Fernando Jimenez on Patreon.
v6: Fix annex object file permissions when git-annex add is run on a
modified unlocked file, and in some related cases.
If a hard link is made, don't freeze it; annex.thin
uses writable object files.
Also: For some reason, linkToAnnex used to thawContent src. I can see no
reason why it needed to do that, so I eliminated that.
This commit was sponsored by Brock Spratlen on Patreon.
Since the same key can be stored in a versioned S3 bucket multiple times
with different version IDs, this allows tracking them all. Not currently
needed, but if we ever want to drop from a versioned S3 bucket, we'll
need to know them all.
This commit was supported by the NSF-funded DataLad project.
Actually very straightforward reuse of the metadata log file code.
Although I had to add a todo item as git-annex forget won't clean up
dead remote's metadata yet.
This would be worth adding to the external special remote interface
sometime. Have not opened a todo though, guess I'll wait until something
needs it.
This commit was supported by the NSF-funded DataLad project.
Have to store the S3 object along with the version ID, so retrieval can
use the same object.
This commit was supported by the NSF-funded DataLad project.
Only done when versioning=yes is configured. It could always do it when
S3 sends back a version id, but there may be buckets that have
versioning enabled by accident, so it seemed better to honor the
configuration.
S3's docs say version IDs are "randomly generated", so presumably
storing the same content twice gets two different ones not the same one.
So I considered storing a list of version IDs for a key. That would
allow removing the key completely. But.. The way Logs.RemoteState works,
when there are multiple writers, the last writer wins. So storing a list
would need a different log format that merges, which seemed overkill to support
removing a key from an append-only remote.
Note that Logs.RemoteState for S3 is now dedicated to version IDs.
If something else needs to be stored, a new log will be needed to do it.
This commit was supported by the NSF-funded DataLad project.
Make exporttree=yes remotes that are appendonly not be untrusted, and not force
verification of content, since the usual concerns about losing data when an
export is updated by someone else don't apply.
Note that all the remote operations on keys are left as usual for
appendonly export remotes, except for storing content.
This commit was supported by the NSF-funded DataLad project.
Make `git annex export` check appendonly when removing a file from an
export, and not update the location log, since the remote still contains
the content.
This commit was supported by the NSF-funded DataLad project.
Probably not noticed until now because the queue is large enough that two
threads each filling theirs at the same time and flushing is unlikely to
happen.
Also made explicit that each worker thread gets its own queue.
I think that was the case before, but if something was put in the queue
before worker threads were forked off, they could have each inherited the
same queue.
Could have gone with a single shared queue, but per-worker queues is more
efficient, because a worker can add lots of stuff to its own queue without
any locking.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Avoids annex.largefiles inconsitency and also avoids a lot of
unneccessary calls to the clean filter when a large repo's clone
is being initialized.
This commit was supported by the NSF-funded DataLad project.
v6: When annex.largefiles is not configured for a file, running git add or
git commit, or otherwise using git to stage a file will add it to the annex
if the file was in the annex before, and to git otherwise. This is to avoid
accidental conversion.
Note that git-annex add's behavior has not changed, for reasons explained
in the added comment.
Performance: No added overhead when annex.largefiles is configured.
When not configured, there is an added call to catObjectMetaData,
which involves a round trip through git cat-file --batch.
However, the earlier catKeyFile primes the cache for it.
This commit was supported by the NSF-funded DataLad project.
Last of the known v6 races.
This also makes git add of a pointer file populate it when its content
is present in the annex. Which makes sense to do, I think.
This commit was supported by the NSF-funded DataLad project.
Update pointer file next time reconcileStaged is run to recover from the
race.
Note that restagePointerFile causes git to run the clean filter,
and that will run reconcileStaged. So, normally by the time the git
annex get/drop command finishes, the race has already been dealt with.
It may be that, in some case, that won't happen and the race will be
dealt with at a later point. git-annex could run reconcileStaged at
shutdown if that becomes a problem.
This does not handle the situation where the git mv is committed before
git-annex gets a chance to run again. git commit does run the clean
filter, and that happens to re-inject the content if it was supposed to
be dropped but is still populated. But, the case where the file was
supposed to be gotten but is not populated is not handled yet.
This commit was supported by the NSF-funded DataLad project.