Commit graph

831 commits

Author SHA1 Message Date
Joey Hess
a0d6a6ea2a
support git files as input to computations
This uses GIT keys, like those used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.

Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.

Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
having addcomputed check whether the file in git is a symlink and error out.
Or it could follow symlinks as long as the destination is a file in the
repository.
2025-03-03 12:09:25 -04:00
Joey Hess
a149336a59
OsPath transition Windows build fixes
This gets it building on Windows again, with 1 test suite failure
(addurl).

Sponsored-by: Kevin Mueller
2025-02-11 15:40:53 -04:00
Joey Hess
9dc43396b3
fix comment 2025-02-11 14:07:01 -04:00
Joey Hess
27a0bacc49
improved OsPath conversion 2025-02-11 14:05:56 -04:00
Joey Hess
780a379ab1
remove unused functions from Utility.RawFilePath 2025-02-11 13:49:17 -04:00
Joey Hess
2ff716be30
OsPath build flag no longer depends on filepath-bytestring
However, filepath-bytestring is still in Setup-Depends.
That's because Utility.OsPath uses it when not built with OsPath.
It might be possible to make Utility.OsPath fall back to using
filepath, and eliminate that dependency too, but it would mean either
wrapping all of System.FilePath's functions, or using `type OsPath = FilePath`.

Annex.Import uses ifdefs to avoid converting back to FilePath when not
on windows. On windows it's a bit slower due to that conversion.
Utility.Path.Windows.convertToWindowsNativeNamespace got a bit
slower too, but not really worth optimising I think.

Note that importing Utility.FileSystemEncoding at the same time as
System.Posix.ByteString will result in conflicting definitions for
RawFilePath. filepath-bytestring avoids that by importing RawFilePath
from System.Posix.ByteString, but that's not possible in
Utility.FileSystemEncoding, since Setup-Depends does not include unix.
This turned out not to affect any code in git-annex though.

Sponsored-by: Leon Schuermann
2025-02-10 16:39:55 -04:00
Joey Hess
c74c75b352
more OsPath conversion (639/749)
Sponsored-by: k0ld
2025-02-07 16:07:05 -04:00
Joey Hess
a5d48edd94
more OsPath conversion (602/749)
Sponsored-by: Brock Spratlen
2025-02-07 14:46:11 -04:00
Joey Hess
0811531b59
more OsPath conversion (542/749)
Sponsored-by: Luke T. Shumaker
2025-02-06 11:38:14 -04:00
Joey Hess
b28433072c
more OsPath conversion (475/749)
Sponsored-by: Nicholas Golder-Manning
2025-02-05 12:14:56 -04:00
Joey Hess
4dc904bbad
more OsPath conversion
Sponsored-by: Leon Schuermann
2025-02-04 16:09:47 -04:00
Joey Hess
5cc8d9d03b
replace removeLink with removeFile
removeFile calls unlink so removes anything not a directory. So these
are replaceable in order to convert to OsPath.
2025-02-02 14:16:58 -04:00
Joey Hess
71195cce13
more OsPath conversion
Sponsored-by: k0ld
2025-02-01 14:06:38 -04:00
Joey Hess
474cf3bc8b
more OsPath conversion
Sponsored-by: Brock Spratlen
2025-02-01 11:54:19 -04:00
Joey Hess
c69e57aede
more OsPath conversion
Sponsored-by: Jack Hill
2025-01-30 15:46:32 -04:00
Joey Hess
97c83152d6
Merge branch 'master' into ospath 2025-01-29 18:48:02 -04:00
Joey Hess
998de2e7ce
remove temp debugging code 2025-01-29 17:19:01 -04:00
Joey Hess
415455883c
debug test suite crash on windows 2025-01-29 16:37:54 -04:00
Joey Hess
27305042f3
more OsPath conversion
Sponsored-by: Nicholas Golder-Manning
2025-01-29 11:53:20 -04:00
Joey Hess
f3539efc16
more OsPath conversion
Sponsored-by: Leon Schuermann
2025-01-24 16:31:14 -04:00
Joey Hess
aa0f3f31da
more OsPath conversion
Sponsored-by: Eve
2025-01-24 15:02:29 -04:00
Joey Hess
c412c59ecd
more OsPath conversion
About 1/10th done with this I think.
2025-01-24 13:40:44 -04:00
Joey Hess
ea775baccd
more OsPath conversion
Git.Types now uses it, as does TopFilePath, making for plenty of new
compile errors needing fixing.

Sponsored-by: Brock Spratlen
2025-01-23 16:15:00 -04:00
Joey Hess
c3c8870752
add System.FilePath to this conversion
It seems to make sense to convert both System.Directory and
System.FilePath uses to OsPath in one go. This will generally look like
replacing RawFilePath with OsPath in type signatures, and will be driven
by the now absolutely massive pile of compile errors.

Got a few modules building in this new regime.

Sponsored-by: Jack Hill
2025-01-23 11:07:29 -04:00
Joey Hess
77e9781ae2
parsePOSIXTime ByteString conversion
Some easy (though tiny) speed wins.

Sponsored-by: Luke T. Shumaker on Patreon
2025-01-22 16:42:09 -04:00
Joey Hess
6e27b0d4d1
convert from readFileStrict
This removes that function, using file-io readFile' instead.

Had to deal with newline conversion, which readFileStrict does on
Windows. In a few cases, that was pretty ugly to deal with.

Sponsored-by: Kevin Mueller
2025-01-22 16:20:36 -04:00
Joey Hess
9b79f0f43d
use file-io for readFile/writeFile/appendFile on ByteStrings
These are all straightforward, and easy small performance wins.

Sponsored-by: Nicholas Golder-Manning
2025-01-22 14:30:25 -04:00
Joey Hess
793ddecd4b
use openTempFile from file-io
And follow-on changes.

Note that relatedTemplate was changed to operate on a RawFilePath, and
so when it counts the length, it is now the number of bytes, not the
number of code points. This will just make it truncate shorter strings
in some cases; the truncation is still unicode aware.

When not building with the OsPath flag, toOsPath . fromRawFilePath and
fromRawFilePath . fromOsPath do extra conversions back and forth between
String and ByteString. That overhead could be avoided, but that's the
non-optimised build mode, so didn't bother.

Sponsored-by: unqueued
2025-01-22 11:41:43 -04:00
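
The byte-versus-code-point note in 793ddecd4b can be illustrated with a small Haskell sketch. This is not git-annex's relatedTemplate; utf8Size, utf8Bytes and truncateToBytes are hypothetical names, and the sketch assumes UTF-8 encoding. It shows why a limit counted in bytes cuts non-ASCII names shorter than the same limit counted in code points, while still never splitting a character.

```haskell
import Data.Char (ord)

-- Number of bytes a single character takes when encoded as UTF-8.
utf8Size :: Char -> Int
utf8Size c
  | n < 0x80 = 1
  | n < 0x800 = 2
  | n < 0x10000 = 3
  | otherwise = 4
  where
    n = ord c

-- Byte length of a whole String in UTF-8 (hypothetical helper).
utf8Bytes :: String -> Int
utf8Bytes = sum . map utf8Size

-- Hypothetical truncation to a byte budget: keep whole characters
-- for as long as their encoded size still fits in the budget.
truncateToBytes :: Int -> String -> String
truncateToBytes _ [] = []
truncateToBytes budget (c : cs)
  | size <= budget = c : truncateToBytes (budget - size) cs
  | otherwise = []
  where
    size = utf8Size c
```

For an ASCII-only name the two ways of counting agree, so only names containing multi-byte characters are truncated earlier.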
Joey Hess
1ceece3108
RawFilePath conversion of System.Directory
By using System.Directory.OsPath, which takes and returns OsString,
which is a ShortByteString. So, things like dirContents currently have the
overhead of copying that to a ByteString, but that should be less than
the overhead of using Strings which often in turn were converted to
RawFilePaths.

Added Utility.OsString and the OsString build flag. That flag is turned
on in the stack.yaml, and will be turned on automatically by cabal when
built with new enough libraries. The stack.yaml change is a bit ugly,
and that could be reverted for now if it causes any problems.

Note that Utility.OsString.toOsString on windows is avoiding only a
check of encoding that is documented as being unlikely to fail. I don't
think it can fail in git-annex; if it could, git-annex didn't contain
such an encoding check before, so at worst that should be a wash.
2025-01-20 19:17:33 -04:00
Joey Hess
104ca5e09e
Support help.autocorrect settings "never" and "immediate" 2025-01-20 11:01:07 -04:00
Joey Hess
b0ef04f0b7
Support help.autocorrect=prompt 2025-01-20 10:56:12 -04:00
Joey Hess
a73fa77417
added hooks corresponding to annex.*-command
* Added freezecontent-annex and thawcontent-annex hooks that
  correspond to the git configs annex.freezecontent-command and
  annex.thawcontent-command.
* Added secure-erase-annex hook that corresponds to the git config
  annex.secure-erase-command.
* Added commitmessage-annex hook that corresponds to the git config
  annex.commitmessage-command.
* Added http-headers-annex hook that corresponds to the git config
  annex.http-headers-command.
  that correspond to the post-update-annex and pre-commit-annex hooks.

The use case for these is eg, setting up a git repository that is run in a
container, where the easiest way to provide a script is by putting it in
.git/hooks/, rather than copying it into the container in a way that puts
it in PATH.

These are all the ones that make sense to add for annex.*-command git configs.
annex.youtube-dl-command is not a hook, it's telling git-annex what command
to run. So is annex.shared-sop-command. So omitted those.

May later also want to add hooks corresponding to
`remote.<name>.annex-cost-command` etc.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2025-01-10 14:54:42 -04:00
Joey Hess
a19a3076b5
ssh exit status 255 is a connection problem
Previously, when the git config was unable to be read from a ssh remote,
it would try to git fetch from it to determine if the remote was
otherwise accessible. That was unnecessary work, since exit status 255
indicates a connection problem.

As well as avoiding the extra work of the fetch, this also improves
things when a ssh remote cannot be connected to due to a problem with
the git-annex ssh control socket. In that situation, ssh will also exit 255.
Before, the git fetch was tried in that situation, and would succeed, since
it does not use the git-annex ssh control socket. git-annex would conclude
that git-annex-shell was not installed on the remote, which could be wrong.

I suppose it also used to be possible for the user to need to enter a
ssh password on each connection to the remote. If they entered the wrong
password for the git-annex-shell call, but then the right password for
the git fetch, it would also incorrectly set annex-ignore, and that
situation is also now fixed.
2025-01-03 14:46:16 -04:00
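
The exit status rule described in a19a3076b5 can be sketched in Haskell as follows. The SshResult type and classifySshExit function are hypothetical names for illustration, not git-annex's own code; the only fact relied on is that ssh exits with 255 when it fails itself, and otherwise passes through the remote command's exit status.

```haskell
import System.Exit (ExitCode(..))

-- Hypothetical classification of what an ssh invocation's exit code means.
data SshResult
  = Connected               -- ssh connected and the remote command succeeded
  | RemoteCommandFailed Int -- ssh connected, but the remote command failed
  | ConnectionProblem       -- ssh itself failed (exit status 255)
  deriving (Eq, Show)

classifySshExit :: ExitCode -> SshResult
classifySshExit ExitSuccess = Connected
classifySshExit (ExitFailure 255) = ConnectionProblem
classifySshExit (ExitFailure n) = RemoteCommandFailed n
```

With a classification like this, a caller would only go on to probe the remote further in the RemoteCommandFailed case, since ConnectionProblem already says the remote could not be reached.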
Joey Hess
dd052dcba1
annexInsteadOf config
Added config `url.<base>.annexInsteadOf` corresponding to git's
`url.<base>.pushInsteadOf`, to configure the urls to use for accessing the
git-annex repositories on a server without needing to configure
remote.name.annexUrl in each repository.

While one use case for this would be rewriting urls to use annex+http,
I decided not to add any kind of special case for that. So while
git-annex p2phttp, when serving multiple repositories, needs an url
of eg "annex+http://example.com/git-annex/" for each of them, rewriting an
url like "https://example.com/git/foo/bar" with this config set to
"https://example.com/git/" will result in eg
"annex+http://example.com/git-annex/foo/bar", which p2phttp does not
support.

That seems better dealt with in either git-annex p2phttp or a http
middleware, rather than complicating the config with a special case for
annex+http.

Anyway, there are other use cases for this that don't involve annex+http.
2024-12-03 14:39:07 -04:00
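
The rewriting described in dd052dcba1 can be sketched in Haskell as a prefix substitution. This is an illustration only: rewriteUrl is a hypothetical name, and the longest-match rule is assumed to carry over from git's insteadOf handling rather than taken from git-annex's code.

```haskell
import Data.List (isPrefixOf, sortOn)
import Data.Ord (Down(..))

-- Each rule is (base, prefix), standing for a config entry
--   url.<base>.annexInsteadOf = <prefix>
-- A url that starts with <prefix> has that prefix replaced by <base>.
-- Assumption: when several prefixes match, the longest one wins.
rewriteUrl :: [(String, String)] -> String -> String
rewriteUrl rules url =
  case sortOn (Down . length . snd) matching of
    (base, prefix) : _ -> base ++ drop (length prefix) url
    [] -> url
  where
    matching = filter ((`isPrefixOf` url) . snd) rules
```

Applied to the example from the commit message, the rule ("annex+http://example.com/git-annex/", "https://example.com/git/") turns "https://example.com/git/foo/bar" into "annex+http://example.com/git-annex/foo/bar", which is the url that p2phttp does not support.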
Joey Hess
0c08ff3d2c
deal with git's CRLF nonsense once again
Work around git hash-object --stdin-paths's odd stripping of carriage
return from the end of the line (some windows infection), avoiding crashing
when the repo contains a filename ending in a carriage return.
2024-12-02 13:47:51 -04:00
Joey Hess
31a38f8468
git-remote-annex: Require git version 2.31 or newer
Since old ones had a buggy git bundle command.

In particular, git 2.30.2 has a git bundle that supports --stdin, but does
not read from it, and so fails to create a bundle.

While not using --stdin would perhaps work, it limits the number of revs
that get included in the bundle to the command line length limit.

But the real kicker is that at the same time --stdin got fixed, a bug also
got fixed that made git bundle skip including refs when they had the same
sha as other refs it included. Which would lead to data loss. So best to
avoid that buggy thing.
2024-11-20 15:00:17 -04:00
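
The version requirement from 31a38f8468 comes down to a comparison like the one below. This is a minimal Haskell sketch, not how git-annex actually checks the git version; parseVersion and newEnoughForBundle are hypothetical names.

```haskell
import Data.Char (isDigit)

-- Parse the leading dotted-numeric part of a version string, so that
-- "2.30.2" gives [2,30,2]. Parsing stops at the first component that
-- does not start with a digit.
parseVersion :: String -> [Int]
parseVersion s = case span isDigit s of
  ("", _) -> []
  (digits, '.' : rest) -> read digits : parseVersion rest
  (digits, _) -> [read digits]

-- True when the version is 2.31 or newer. Lists compare
-- lexicographically, so [2,30,2] sorts below [2,31].
newEnoughForBundle :: String -> Bool
newEnoughForBundle v = parseVersion v >= [2, 31]
```

Comparing lists of numbers rather than strings avoids ordering "2.9" above "2.31".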
Joey Hess
75b3f0eb75
fix build with old base
i386ancient has a base too old for NE.singleton
2024-09-30 11:02:08 -04:00
Joey Hess
4ca3d1d584
remove read of the heads
and one tail

Removed head from Utility.PartialPrelude in order to avoid the build
warning with recent ghc versions as well.
2024-09-26 18:43:59 -04:00
Joey Hess
43f31121a5
Git: use NonEmpty in fullconfig
This is a nice win. Avoids partial functions, by encoding at the type
level the fact that fullconfig is never an empty list.
2024-09-26 17:54:36 -04:00
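
The idea behind 43f31121a5 can be sketched in Haskell like this. The FullConfig type and both functions are hypothetical stand-ins with plain String keys and values, not the definitions in git-annex; the point is only that a NonEmpty value list makes "take the last value" a total operation.

```haskell
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NE
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for a parsed git config: each key maps to every
-- value that was set for it, in order. A key that is present always has
-- at least one value, and the type says so.
type FullConfig = M.Map String (NonEmpty String)

-- Record one more value for a key, after any earlier values.
addValue :: String -> String -> FullConfig -> FullConfig
addValue k v = M.insertWith (\new old -> old <> new) k (v :| [])

-- The most recently set value for a key. NE.last cannot fail,
-- unlike last on an ordinary list.
currentValue :: String -> FullConfig -> Maybe String
currentValue k cfg = NE.last <$> M.lookup k cfg
```

The only Maybe left is for a key that is absent altogether; the case of a present key with an empty value list cannot be constructed.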
Joey Hess
343c87db45
improve haddocks 2024-08-13 15:05:49 -04:00
Joey Hess
48657405c6
cache credentials for p2phttp in memory 2024-07-23 18:45:02 -04:00
Joey Hess
bbf261487d
add git-annex updatecluster command
Seems to work fine, making the right changes to the git-annex branch.
2024-06-14 15:02:01 -04:00
Joey Hess
0ffb0a4d25
dash is legal in git remote names 2024-06-12 13:24:31 -04:00
Joey Hess
317786d219
remove dead code 2024-06-10 14:28:58 -04:00
Joey Hess
b32c4c2e98
atomic git-annex branch update when regrafting in transition
Fix a bug where interrupting git-annex while it is updating the git-annex
branch could lead to git fsck complaining about missing tree objects.

Interrupting git-annex while regraftexports is running in a transition
that is forgetting git-annex branch history would leave the
repository with a git-annex branch that did not contain the tree shas
listed in export.log. That lets those trees be garbage collected.

A subsequent run of the same transition then regrafts the trees listed
in export.log into the git-annex branch. But those trees have been lost.

Note that both sides of `if neednewlocalbranch` are atomic now. I had
thought only the True side needed to be, but I do think there may be
cases where the False side needs to be as well.

Sponsored-by: Dartmouth College's OpenNeuro project
2024-06-07 16:34:10 -04:00
Joey Hess
04a256a0f8
work around git "defense in depth" breakage with git clone checking for hooks
This git bug also broke git-lfs, and I am confident it will be reverted
in the next release.

For now, cloning from an annex:: url wastes some bandwidth on the next
pull by not caching bundles locally.

If git doesn't fix this in the next version, I'd be tempted to rethink
whether bundle objects need to be cached locally. It would be possible to
instead remember which bundles have been seen and their heads, and
respond to the list command with the heads, and avoid unbundling them
again in fetch. This might even be a useful performance improvement in
the latter case. It would be quite a complication to a currently simple
implementation though.
2024-05-24 15:49:53 -04:00
Joey Hess
0ba2b89c71
avoid displaying error from git symbolic-ref -q HEAD
Usually this won't fail even if .git/HEAD is not set or points to a ref
that doesn't exist. However, early in clone, it contains
"ref: refs/heads/.invalid" which causes an error "fatal: No such ref: HEAD"

When cloning from a special remote, git-remote-annex output that once
per bundle.
2024-05-24 13:49:18 -04:00
Joey Hess
13a6a20716
fix --is-ancestor option 2024-05-13 13:52:58 -04:00
Joey Hess
ff5193c6ad
Merge branch 'master' into git-remote-annex 2024-05-10 14:20:36 -04:00
Joey Hess
3039331529
git-remote-annex: incremental pushing
Untested

Sponsored-by: Joshua Antonishen on Patreon
2024-05-10 13:32:37 -04:00