This cleans up after the bug that was fixed in commit 6a9e923c74.
Object files that were stored in the wrong location are rescued, and
after that, any wrong location logs will be fixed by the usual fsck.
Used protectedOutput to set up a umask that makes the socket only
accessible by the current user.
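To illustrate the idea, here is a minimal sketch of such a helper
(the real protectedOutput may well differ):

    import System.Posix.Files (setFileCreationMask)
    import Control.Exception (bracket)

    -- Run an action with a umask of 0o077, so any file or unix
    -- socket it creates is only accessible by the current user.
    -- The previous umask is restored afterwards.
    withProtectedUmask :: IO a -> IO a
    withProtectedUmask a = bracket
        (setFileCreationMask 0o077)
        setFileCreationMask
        (const a)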
Authentication is still needed when using this option unless it is combined
with --wideopen. It was just simpler to keep authentication separate from
this.
This allows for eg a dir/user/repo structure, but also other layouts.
It still does not look for repositories that are nested inside other
repositories.
The check for symlinks is mostly to avoid cycles that would prevent
findRepos from returning. Eg, foo/bar/baz being a symlink to foo/bar.
If the directory is writable by someone else, they can still race it and
get it to follow a symlink to some other directory. I don't think p2phttp
needs to worry about that kind of situation though, and I doubt it avoids
such problems when operating on files in a git-annex repository either.
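As a rough illustration (simplified, not the actual findRepos code),
the traversal can skip symlinks like this:

    import System.Directory
    import System.FilePath ((</>))
    import Control.Monad (filterM)

    -- Recursively list directories under top, skipping symlinks,
    -- so a cycle like foo/bar/baz -> foo/bar cannot cause the
    -- traversal to loop forever.
    findDirs :: FilePath -> IO [FilePath]
    findDirs top = do
        ps <- map (top </>) <$> listDirectory top
        ds <- filterM isRealDir ps
        subs <- concat <$> mapM findDirs ds
        return (ds ++ subs)
      where
        isRealDir p = do
            islink <- pathIsSymbolicLink p
            if islink
                then return False
                else doesDirectoryExist p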
As groundwork for making git-annex p2p support other P2P networks than
tor hidden services, when an AuthToken is not a TorAnnex value, but
something else (that will be added later), store the P2PAddress that it
will be used with along with the AuthToken. And in loadP2PAuthTokens,
only return AuthTokens for the specified P2PAddress.
See commit 2de27751d6 for some design work that led to this.
Also, git-annex p2p --gen-addresses is changed to generate a separate
AuthToken for every P2P address, rather than generating a single
AuthToken and using it for every one. When we have more than just tor,
this will be important for security, to avoid a compromise of one P2P
network exposing the AuthToken used for another network.
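Sketched with simplified stand-in types (not git-annex's real ones):

    newtype AuthToken = AuthToken String

    data P2PAddress
        = TorAnnex String Int -- onion address and port
        | OtherNetwork String -- placeholder for future P2P networks
        deriving (Eq)

    -- An AuthToken stored along with the P2PAddress it is used with.
    data P2PAuthToken = P2PAuthToken P2PAddress AuthToken

    -- Only return AuthTokens for the specified P2PAddress.
    loadP2PAuthTokensFor :: P2PAddress -> [P2PAuthToken] -> [AuthToken]
    loadP2PAuthTokensFor addr ts =
        [ t | P2PAuthToken a t <- ts, a == addr ]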
sync: Push the current branch first, rather than a synced branch, to better
support git forges (gitlab, gitea, forgejo, etc.), which use push-to-create
with the first pushed branch becoming the default branch.
With considerable complication to filter out the warning message about
receive.denyCurrentBranch when pushing to a non-bare repository. Localization
may break it in the future, but it seems like the best way to handle this. See
my comments for the gory details.
Also fixes it in the graphviz map in some cases, where there is no
description for a repository.
And in json, use the remote name, never the description, since the field
is "remote", which is intended to be the git remote name.
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
Avoid using "name" for what git-annex otherwise refers to as a
description.
(For the remotes in the map, the "remote" field should be the remote
name, but there is a bug preventing it from being that.)
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
It was treating remote paths of a remote repo as if they were local paths,
and so trying to expand git directories and so forth on them. That led to
bad results, including a path like "foo.git" getting turned into
"foo.git.git"
Sponsored-by: Dartmouth College's OpenNeuro project
Which is a per-remote version of the annex.web-options config.
Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a
special remote does not use curl, there was no need to do that, and I used
Nothing instead.
In the case of the addurl and importfeed commands, it seemed best to say
that running these commands is not using the web special remote per se,
so the config is not used for those commands.
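In rough outline, with stand-in types rather than the real ones:

    import Data.Maybe (fromMaybe)

    type WebOptions = [String] -- options passed to curl

    data RemoteGitConfig = RemoteGitConfig
        { remoteAnnexWebOptions :: Maybe WebOptions }

    -- A remote's annex.web-options, when set, overrides the global
    -- annex.web-options. Passing Nothing (eg, for a special remote
    -- that does not use curl, or for addurl and importfeed) falls
    -- back to the global config.
    getWebOptions :: Maybe RemoteGitConfig -> WebOptions -> WebOptions
    getWebOptions mrgc globalopts =
        fromMaybe globalopts (remoteAnnexWebOptions =<< mrgc)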
requiredContentMap does not exclude dead repos. Usually this is not a
problem because it is used when we are operating on a repository, and in
that case, the repository is not dead (or if it is, the required content
configurations should still be used). But in the case of fsck, this made an
old required content config for a dead repository be warned about in a
situation where it is not a problem.
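A minimal sketch of the fsck-side filtering, using simplified
stand-in types:

    import qualified Data.Map as M

    newtype UUID = UUID String deriving (Eq, Ord)

    data TrustLevel = Trusted | SemiTrusted | UnTrusted | DeadTrusted
        deriving (Eq)

    -- Exclude required content configs of dead repositories before
    -- warning about them.
    excludeDead :: M.Map UUID TrustLevel -> M.Map UUID a -> M.Map UUID a
    excludeDead trustmap = M.filterWithKey $ \u _ ->
        M.lookup u trustmap /= Just DeadTrusted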
updatecluster, updateproxy: When a remote that has no annex-uuid is
configured as annex-cluster-node, warn and avoid writing bad data to the
git-annex branch.
The proxy.log and cluster.log end up unparseable when a NoUUID gets written
to them.
When writing doc/tips/computing_annexed_files.mdwn, I noticed
that a recompute --reproducible followed by a drop and a re-get did not
actually test if the file could be reproducibly computed again.
Turns out that get and drop both operate on staged files. If there is an
unstaged modification in the work tree, that's ignored. Somewhat
surprisingly, other commands like info do operate on unstaged files. So
behavior is inconsistent, and fairly surprising really, when there are
unstaged modifications to files.
Probably this is rarely noticed because `git-annex add` is used to add a
new version of a file, and then it's staged. Or `git mv` is used to move
a file, rather than `mv` of a file over top of an existing file. So it's
uncommon to have an unstaged annexed file in a worktree.
It might be worth making things more consistent, but that's out of scope
for what I'm working on currently.
Also, I anticipate that supporting unlocked files with recompute will
require it to stage changes anyway.
So, make recompute stage the new version of the file.
I considered having recompute refuse to overwrite an existing staged
file. After all, whatever version was staged before will get lost when
the new version is staged over top of it. But, that's no different than
`git-annex addcomputed` being run with the name of an existing staged
file. Or `git-annex add` being run with new file content when there is
an existing staged file. Or, for that matter, `git add` being run with
new content when there is an existing staged file.
Used by git-annex-compute-singularity to make addcomputed --fast work.
Also, simplified git-annex-compute-singularity; there is no need to hard
link the container into place. singularity does not care about the
extension of the container, so it can just be passed the annex object file.
In this case, the compute program is run the same as if addcomputed --fast
were used, so it should succeed, without outputting a computed file.
computeInputsUnavailable is in ComputeState for simplicity, but it is
not serialized with the rest of the ComputeState.
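Something like this simplified sketch (field names are stand-ins,
not the real ComputeState):

    data ComputeState = ComputeState
        { computeParams :: [String]
        , computeInputsUnavailable :: Bool -- not serialized
        }

    -- Serialization covers only the rest of the state;
    -- computeInputsUnavailable is deliberately omitted.
    serializeComputeState :: ComputeState -> String
    serializeComputeState = unwords . computeParams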
Using GIT keys, like those used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.
Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.
Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
having addcomputed check if the file in git is a symlink and error out.
Or it could follow symlinks as long as the destination is a file in the
repository.
I've lost track of them all, but it includes:
* Using the same key backend as was used in the original computation.
* Fixing a bug that prevented updating the source file key in the
  compute state.
* Handling --reproducible and --unreproducible.
* recompute --original of a file using VURL, when the result is
  different but the key remains the same, now updates the object file
  with the new content.
* Detecting some other ways the program behavior can change, just for
completeness.
* Also adds --backend to addcomputed.
When a computed file has been renamed, a recompute needs to write to the
new filename.
I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.
Added temporary TODO-compute file
Implemented proper behavior without --others, and eliminated most of
the code duplication through refactoring.
Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.
The perform action of this still needs work to do the right thing.
In particular, it currently behaves as if --others was always set.
And it duplicates a lot of code from addcomputed.
This is limited because the remote config is a field/value map. So order
is not preserved, and when two parameters have the same field name, only
the last one will be passed.
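For example, with a Data.Map based field/value map (a simplification
of the real remote config representation):

    import qualified Data.Map as M

    -- fromList keeps only the last value for a duplicated field,
    -- and the original parameter ordering is lost.
    demo :: M.Map String String
    demo = M.fromList [("parameter", "first"), ("parameter", "second")]

    -- M.lookup "parameter" demo == Just "second"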