Commit graph

3050 commits

Author SHA1 Message Date
Joey Hess
9f4e956346
sync: push current branch first
sync: Push the current branch first, rather than a synced branch, to better
support git forges (gitlab, gitea, forgejo, etc.) which use push-to-create
with the first pushed branch becoming the default branch.

With considerable complication to filter out warning message about
receive.denyCurrentBranch when pushing to a non-bare repository. Localization
may break it in the future, but it seems like the best way to handle this. See
my comments for the gory details.
2025-06-04 12:06:00 -04:00
Joey Hess
f167e7f55b
adjust annex.synccontent transition warning
sync will also be changing to drop unwanted content by default, this
wording change avoids leaving the wrong impression
2025-05-30 14:30:01 -04:00
Joey Hess
f6eac67f0e
rename repoName to repoDesc
That's what the function mostly is, if it shows a remote name it's only
in an edge case, where that is the best description of it available.
2025-05-29 12:55:40 -04:00
Joey Hess
2fad57de44
fix display of remote name in json
Also fixes it in the graphviz map in some cases, where there is no
description for a repository.

And in json, use the remote name, never the description, since the field
is "remote" which is intended to be the git remote name.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2025-05-29 12:53:42 -04:00
Joey Hess
a44638ca73
adjust json field names
Avoid using "name" for what git-annex otherwise refers to as a
description.

(For the remotes in the map, the "remote" field should be the remote
name, but there is a bug preventing it from being that.)

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2025-05-29 12:42:53 -04:00
Joey Hess
52a8b5b117
map: Support --json option
Sponsored-by: Dartmouth College's OpenNeuro project
2025-05-28 14:17:28 -04:00
Joey Hess
286a681b57
remove dangling where 2025-05-20 09:37:33 -04:00
Joey Hess
e64e9d5fae
whereused: Fix bug that could find matches from grafts in remote git-annex branches
git log with --remotes= needs the preceeding --exclude=*/git-annex in order
to not look at git-annex branches of remotes.

Sponsored-by: mycroft
2025-05-05 14:32:25 -04:00
Joey Hess
2ee6c25c72
map: Fix buggy handling of remotes that are bare git repositories accessed via ssh
It was treating remote paths of a remote repo as if they were local paths,
and so trying to expand git directories and so forth on them. That led to
bad results, including a path like "foo.git" getting turned into
"foo.git.git"

Sponsored-by: Dartmouth College's OpenNeuro project
2025-04-22 15:21:01 -04:00
Joey Hess
7b3d7a8f78
fix message
also dead code removal
2025-04-22 13:36:54 -04:00
Joey Hess
7fb413189a
migrate: Fix --remove-size to work when a file is not present
5f74a45861 added this bug
2025-04-01 10:47:31 -04:00
Joey Hess
e81fd72018
Added remote.name.annex-web-options config
Which is a per-remote version of the annex.web-options config.

Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a
special remote does not use curl, there was no need to do that and I used
Nothing instead.

In the case of the addurl and importfeed commands, it seemed best to say
that running these commands is not using the web special remote per se,
so the config is not used for those commands.
2025-04-01 10:17:38 -04:00
Joey Hess
cc8f7e9776
fsck: Avoid complaining about required content of dead repositories
requiredContentMap does not exclude dead repos. Usually this is not a
problem because it is used when we are operating on a repository, and in
that case, the repository is not dead (or if it is, the required content
configurations should still be used). But in the case of fsck, this made a
old required content config for a dead repository be warned about in a
situation where it is not a problem.
2025-03-26 10:30:33 -04:00
Joey Hess
d0b5a09b0e
deal with NoUUID in checkCanProxy
updatecluster, updateproxy: When a remote that has no annex-uuid is
configured as annex-cluster-node, warn and avoid writing bad data to the
git-annex branch.

The proxy.log and cluster.log end up unparseable when a NoUUID gets written
to them.
2025-03-21 12:29:44 -04:00
Joey Hess
74457b6b93
findcompute --inputs
Useful for eg, generating dependency graphs.
2025-03-19 15:39:05 -04:00
Joey Hess
bcfd554a0f
findcomputed: New command, displays information about computed files. 2025-03-18 12:55:48 -04:00
Joey Hess
d74d2d5d91
--json for addcomputed and recompute
Not very useful, but it does work.
2025-03-17 15:51:43 -04:00
Joey Hess
2d60ce4803
record fscked files in fsck db by default
Remember the files that are checked, so a later run with --more will
skip them, without needing to use --incremental.
2025-03-17 15:34:08 -04:00
Joey Hess
23538ea17b
annex.addunlocked support for git-annex compute
And for git-annex recompute, add the file unlocked when the original is
unlocked.
2025-03-17 14:26:09 -04:00
Joey Hess
a673fc7cfd
recompute: stage new version of file in git
When writing doc/tips/computing_annexed_files.mdwn, I noticed
that a recompute --reproducible followed by a drop and a re-get did not
actually test if the file could be reproducible computed again.

Turns out that get and drop both operate on staged files. If there is an
unstaged modification in the work tree, that's ignored. Somewhat
surprisingly, other commands like info do operate on staged files. So
behavior is inconsistent, and fairly surprising really, when there are
unstaged modifications to files.

Probably this is rarely noticed because `git-annex add` is used to add a
new version of a file, and then it's staged. Or `git mv` is used to move
a file, rather than `mv` of a file over top of an existing file. So it's
uncommon to have an unstaged annexed file in a worktree.

It might be worth making things more consistent, but that's out of scope
for what I'm working on currently.

Also, I anticipate that supporting unlocked files with recompute will
require it to stage changes anyway.

So, make recompute stage the new version of the file.

I considered having recompute refuse to overwrite an existing staged
file. After all, whatever version was staged before will get lost when
the new version is staged over top of it. But, that's no different than
`git-annex addcomputed` being run with the name of an existing staged
file. Or `git-annex add` being run with a new file content when there is
an existing staged file. Or, for that matter, `git add` being ran with a
new content when there is an existing staged file.
2025-03-12 13:42:00 -04:00
Joey Hess
0712ae020c
fix recompute --reproducible run on a VURL key
This avoids "Cannot generate a key for backend VURL", and makes it use
the usual hashing backend.
2025-03-12 11:48:29 -04:00
Joey Hess
0477a8d098
add INPUT-REQUIRED
Used by git-annex-compute-singularity to make addcomputed --fast work.

Also, simplified git-annex-compute-singularity; there is no need to hard
link the container into place. singularity does not care about the
extension of the container, so can just pass it the annex object file.
2025-03-11 11:46:31 -04:00
Joey Hess
c6c6e2632d
avoid unncessary git-annex branch changes for recompute and addcomputed 2025-03-06 12:41:30 -04:00
Joey Hess
ccc454a791
computation progress display 2025-03-05 13:46:06 -04:00
Joey Hess
51538fa0a8
improve error message when unable to get an input file
In this case, the compute program is run the same as if addcomputed --fast
were used, so it should succeed, without outputting a computed file.

computeInputsUnavailable is in ComputeState for simplicity, but it is
not serialized with the rest of the ComputeState.
2025-03-04 13:13:18 -04:00
Joey Hess
b395bd4f56
move showOutput into compute remote 2025-03-04 10:02:33 -04:00
Joey Hess
89bfeada87
recompute: display one of the changed files 2025-03-03 15:12:19 -04:00
Joey Hess
b01a0d2323
avoid recomputing every time on git inputs 2025-03-03 14:56:49 -04:00
Joey Hess
a0d6a6ea2a
support git files as input to computations
Using GIT keys, like are used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.

Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.

Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
addcomputed checking if the file in git is a symlink and erroring out.
Or it could follow symlinks as long as the destination is a file in the
repisitory.
2025-03-03 12:09:25 -04:00
Joey Hess
6ebab7fb00
factor out Annex.GitShaKey 2025-03-03 11:09:28 -04:00
Joey Hess
63d73d8d1b
record VURL key hashes in addcomputed and recompute 2025-03-03 10:57:56 -04:00
Joey Hess
b813549b2d
fix build 2025-02-27 16:18:04 -04:00
Joey Hess
e6ae5e8d56
many recompute improvements
I've lost track of them all, but it includes:

* Using the same key backend as was used in the original computation.
* Fixing bug that prevented updating the source file key in the compute
  state
* Handling --reproducible and --unreproducible.
* recompute --original of a file using VURL, when the result is
  different, but the key remains the same, makes the object file
  be updated with the new content
* Detecting some other ways the program behavior can change, just for
  completeness.
* Also adds --backend to addcomputed.
2025-02-27 15:18:27 -04:00
Joey Hess
9c2c3002a6
fix recompute of renamed files
When a computed file has been renamed, a recompute needs to write to the
new filename.

I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.

Added temporary TODO-compute file
2025-02-27 11:27:26 -04:00
Joey Hess
5d2a608a56
todo 2025-02-26 15:59:47 -04:00
Joey Hess
d6a010a615
recompute closer to working properly
Proper behavior without --others implemented.

And eliminated most of the code duplication through refactoring.

Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.
2025-02-26 15:52:52 -04:00
Joey Hess
53d107ca47
refactor 2025-02-26 14:05:37 -04:00
Joey Hess
3bec89a3c3
started git-annex recompute
The perform action of this still needs work to do the right thing.
In particular, it currently behaves as if --others was always set.
And, it duplicates a lot of code from addcomputed.
2025-02-26 11:54:09 -04:00
Joey Hess
d49f371acc
showOutput
when the compute program eg displays usage, it needs to start on its own
line
2025-02-26 09:47:56 -04:00
Joey Hess
eed522a0f8
addcomputed inherits extra initremote parameters
This is limited because the remote config is a field/value map. So order
is not preserved, and when 2 parameters have the same field name, only
the last one will be passed.
2025-02-26 09:45:35 -04:00
Joey Hess
a5b53fa98a
todo 2025-02-25 18:45:55 -04:00
Joey Hess
e702cb94ff
add compute remote uuid to compute state url
Otherwise, two different compute remotes that happen to take the same
input would use the same compute state url. Which seems wrong.
2025-02-25 18:44:40 -04:00
Joey Hess
71e92a509a
use compute program REPRODUCIBLE by default 2025-02-25 17:10:41 -04:00
Joey Hess
233a6954b9
ingest when --unreproducible is used without --fast 2025-02-25 17:04:19 -04:00
Joey Hess
16f529c05f
addcomputed --fast and --unreproducible working
For these, use VURL and URL keys, with an "annex-compute:" URI prefix.

These URL keys will look something like this:

	URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462

Generally it's too long so most of it gets md5summed. It's a little
ugly, but it's what fell out of the existing URL key generation
machinery. I did consider special casing to eg
"URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at
least possibly useful that the name of the file that was computed is
visible and perhaps one or two words of the git-annex compute command
parameters.

Note that two different output files from the same computation will get
the same URL key. And these keys should remain stable.
2025-02-25 16:43:15 -04:00
Joey Hess
a154e91513
add git-annex addcomputed
Working pretty well. Mostly. But:

* Does not yet support inputs that are non-annexed files checked into git
* --fast is currently broken (will need something like VURL keys)
* --unreproducible still uses a checksumming backend, so drop and get
  again will likely fail (needs probably to use an URL key or something
  like one)

The compute special remote seems to work pretty well too. Eg,
getting from it works, and dropping content that is present in it works.
2025-02-25 15:50:08 -04:00
Joey Hess
4f1eea9061
remove unused adjustedBranchRefresh associated file parameter 2025-02-21 14:51:02 -04:00
Joey Hess
f8bb9a8734
replace removeLink with removeFile
same reasoning as in commit 5cc8d9d03b
2025-02-11 13:41:26 -04:00
Joey Hess
3bbabd6778
replace R.doesPathExist with doesPathExist
Equivilant, just avoids some ugliness.
2025-02-11 12:46:54 -04:00
Joey Hess
2ff716be30
OsPath build flag no longer depends on filepath-bytestring
However, filepath-bytestring is still in Setup-Depends.
That's because Utility.OsPath uses it when not built with OsPath.
It would be maybe possible to make Utility.OsPath fall back to using
filepath, and eliminate that dependency too, but it would mean either
wrapping all of System.FilePath's functions, or using `type OsPath = FilePath`

Annex.Import uses ifdefs to avoid converting back to FilePath when not
on windows. On windows it's a bit slower due to that conversion.
Utility.Path.Windows.convertToWindowsNativeNamespace got a bit
slower too, but not really worth optimising I think.

Note that importing Utility.FileSystemEncoding at the same time as
System.Posix.ByteString will result in conflicting definitions for
RawFilePath. filepath-bytestring avoids that by importing RawFilePath
from System.Posix.ByteString, but that's not possible in
Utility.FileSystemEncoding, since Setup-Depends does not include unix.
This turned out not to affect any code in git-annex though.

Sponsored-by: Leon Schuermann
2025-02-10 16:39:55 -04:00