git-annex

Author	SHA1	Message	Date
Joey Hess	51538fa0a8	improve error message when unable to get an input file In this case, the compute program is run the same as if addcomputed --fast were used, so it should succeed, without outputting a computed file. computeInputsUnavailable is in ComputeState for simplicity, but it is not serialized with the rest of the ComputeState.	2025-03-04 13:13:18 -04:00
Joey Hess	b395bd4f56	move showOutput into compute remote	2025-03-04 10:02:33 -04:00
Joey Hess	89bfeada87	recompute: display one of the changed files	2025-03-03 15:12:19 -04:00
Joey Hess	b01a0d2323	avoid recomputing every time on git inputs	2025-03-03 14:56:49 -04:00
Joey Hess	a0d6a6ea2a	support git files as input to computations Using GIT keys, like are used when exporting git files to special remotes. Except here the GIT key refers to a file checked into the git repo. Note that, since the compute remote uses catObject to get the content, a symlink that is checked into git does not get followed. This is important for security, because following a symlink and adding the content to the repo as an annex object would allow exfiltrating content from outside the repository. Instead, the behavior with a symlink is to run the computation on the symlink target. This may turn out to be confusing, and it might be worth addcomputed checking if the file in git is a symlink and erroring out. Or it could follow symlinks as long as the destination is a file in the repository.	2025-03-03 12:09:25 -04:00
Joey Hess	6ebab7fb00	factor out Annex.GitShaKey	2025-03-03 11:09:28 -04:00
Joey Hess	63d73d8d1b	record VURL key hashes in addcomputed and recompute	2025-03-03 10:57:56 -04:00
Joey Hess	b813549b2d	fix build	2025-02-27 16:18:04 -04:00
Joey Hess	e6ae5e8d56	many recompute improvements I've lost track of them all, but it includes: * Using the same key backend as was used in the original computation. * Fixing bug that prevented updating the source file key in the compute state * Handling --reproducible and --unreproducible. * recompute --original of a file using VURL, when the result is different, but the key remains the same, makes the object file be updated with the new content * Detecting some other ways the program behavior can change, just for completeness. * Also adds --backend to addcomputed.	2025-02-27 15:18:27 -04:00
Joey Hess	9c2c3002a6	fix recompute of renamed files When a computed file has been renamed, a recompute needs to write to the new filename. I decided to remove --others because it's not clear what it should do in the face of renames. Should it update only other files that have not been renamed? Or update files that use the old key to the new key anywhere in the tree? Or write the other files to the cwd, ignoring renames? Since --others is just a way to save on compute time, adding this complexity at this point seems like a bad idea. May revisit later. Added temporary TODO-compute file	2025-02-27 11:27:26 -04:00
Joey Hess	5d2a608a56	todo	2025-02-26 15:59:47 -04:00
Joey Hess	d6a010a615	recompute closer to working properly Proper behavior without --others implemented. And eliminated most of the code duplication through refactoring. Also, changed it to not stage recomputed files. This way, git diff will show files that have differences.	2025-02-26 15:52:52 -04:00
Joey Hess	53d107ca47	refactor	2025-02-26 14:05:37 -04:00
Joey Hess	3bec89a3c3	started git-annex recompute The perform action of this still needs work to do the right thing. In particular, it currently behaves as if --others was always set. And, it duplicates a lot of code from addcomputed.	2025-02-26 11:54:09 -04:00
Joey Hess	d49f371acc	showOutput when the compute program eg displays usage, it needs to start on its own line	2025-02-26 09:47:56 -04:00
Joey Hess	eed522a0f8	addcomputed inherits extra initremote parameters This is limited because the remote config is a field/value map. So order is not preserved, and when 2 parameters have the same field name, only the last one will be passed.	2025-02-26 09:45:35 -04:00
Joey Hess	a5b53fa98a	todo	2025-02-25 18:45:55 -04:00
Joey Hess	e702cb94ff	add compute remote uuid to compute state url Otherwise, two different compute remotes that happen to take the same input would use the same compute state url. Which seems wrong.	2025-02-25 18:44:40 -04:00
Joey Hess	71e92a509a	use compute program REPRODUCIBLE by default	2025-02-25 17:10:41 -04:00
Joey Hess	233a6954b9	ingest when --unreproducible is used without --fast	2025-02-25 17:04:19 -04:00
Joey Hess	16f529c05f	addcomputed --fast and --unreproducible working For these, use VURL and URL keys, with an "annex-compute:" URI prefix. These URL keys will look something like this: URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462 Generally it's too long so most of it gets md5summed. It's a little ugly, but it's what fell out of the existing URL key generation machinery. I did consider special casing to eg "URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at least possibly useful that the name of the file that was computed is visible and perhaps one or two words of the git-annex compute command parameters. Note that two different output files from the same computation will get the same URL key. And these keys should remain stable.	2025-02-25 16:43:15 -04:00
Joey Hess	a154e91513	add git-annex addcomputed Working pretty well. Mostly. But: * Does not yet support inputs that are non-annexed files checked into git * --fast is currently broken (will need something like VURL keys) * --unreproducible still uses a checksumming backend, so drop and get again will likely fail (needs probably to use an URL key or something like one) The compute special remote seems to work pretty well too. Eg, getting from it works, and dropping content that is present in it works.	2025-02-25 15:50:08 -04:00
Joey Hess	4f1eea9061	remove unused adjustedBranchRefresh associated file parameter	2025-02-21 14:51:02 -04:00
Joey Hess	f8bb9a8734	replace removeLink with removeFile same reasoning as in commit `5cc8d9d03b`	2025-02-11 13:41:26 -04:00
Joey Hess	3bbabd6778	replace R.doesPathExist with doesPathExist Equivalent, just avoids some ugliness.	2025-02-11 12:46:54 -04:00
Joey Hess	2ff716be30	OsPath build flag no longer depends on filepath-bytestring However, filepath-bytestring is still in Setup-Depends. That's because Utility.OsPath uses it when not built with OsPath. It would be maybe possible to make Utility.OsPath fall back to using filepath, and eliminate that dependency too, but it would mean either wrapping all of System.FilePath's functions, or using `type OsPath = FilePath` Annex.Import uses ifdefs to avoid converting back to FilePath when not on windows. On windows it's a bit slower due to that conversion. Utility.Path.Windows.convertToWindowsNativeNamespace got a bit slower too, but not really worth optimising I think. Note that importing Utility.FileSystemEncoding at the same time as System.Posix.ByteString will result in conflicting definitions for RawFilePath. filepath-bytestring avoids that by importing RawFilePath from System.Posix.ByteString, but that's not possible in Utility.FileSystemEncoding, since Setup-Depends does not include unix. This turned out not to affect any code in git-annex though. Sponsored-by: Leon Schuermann	2025-02-10 16:39:55 -04:00
Joey Hess	c730d00b6e	more OsPath conversion (749/749) Builds with and without OsPath build flag. Unfortunately, the test suite fails. Sponsored-by: unqueued on Patreon	2025-02-10 14:59:20 -04:00
Joey Hess	2d224e0d28	more OsPath conversion (658/749) At this point the test suite builds, and mostly the assistant is left. Sponsored-by: unqueued	2025-02-08 15:27:44 -04:00
Joey Hess	5eef09a3cc	more OsPath conversion (650/749) Sponsored-by: Nicholas Golder-Manning	2025-02-07 17:03:31 -04:00
Joey Hess	c74c75b352	more OsPath conversion (639/749) Sponsored-by: k0ld	2025-02-07 16:07:05 -04:00
Joey Hess	a5d48edd94	more OsPath conversion (602/749) Sponsored-by: Brock Spratlen	2025-02-07 14:46:11 -04:00
Joey Hess	2d1db7986c	more OsPath conversion (572/749) Sponsored-by: Jack Hill	2025-02-06 16:18:52 -04:00
Joey Hess	0811531b59	more OsPath conversion (542/749) Sponsored-by: Luke T. Shumaker	2025-02-06 11:38:14 -04:00
Joey Hess	77e9781ae2	parsePOSIXTime ByteString conversion Some easy (though tiny) speed wins. Sponsored-by: Luke T. Shumaker on Patreon	2025-01-22 16:42:09 -04:00
Joey Hess	6e27b0d4d1	convert from readFileStrict This removes that function, using file-io readFile' instead. Had to deal with newline conversion, which readFileStrict does on Windows. In a few cases, that was pretty ugly to deal with. Sponsored-by: Kevin Mueller	2025-01-22 16:20:36 -04:00
Joey Hess	9b79f0f43d	use file-io for readFile/writeFile/appendFile on ByteStrings These are all straightforward, and easy small performance wins. Sponsored-by: Nicholas Golder-Manning	2025-01-22 14:30:25 -04:00
Joey Hess	90cd3aad37	RawFilePath conversion for replaceFile Sponsored-by: Joshua Antonishen	2025-01-22 13:37:26 -04:00
Joey Hess	f17ec601c4	optimize truncateFilePath Often the filepath will be all ascii, or mostly so, and this optimisation makes a file that has an ascii suffix of sufficient length be roundtrip converted between String and ByteString only once, rather than once per character. Sponsored-by: Graham Spencer	2025-01-22 13:09:15 -04:00
Joey Hess	793ddecd4b	use openTempFile from file-io And follow-on changes. Note that relatedTemplate was changed to operate on a RawFilePath, and so when it counts the length, it is now the number of bytes, not the number of code points. This will just make it truncate shorter strings in some cases, the truncation is still unicode aware. When not building with the OsPath flag, toOsPath . fromRawFilePath and fromRawFilePath . fromOsPath do extra conversions back and forth between String and ByteString. That overhead could be avoided, but that's the non-optimised build mode, so didn't bother. Sponsored-by: unqueued	2025-01-22 11:41:43 -04:00
Joey Hess	1faa3af9cd	add file-io to build-depends when building with OsPath flag Partly converted code to use functions from it, though more remain unconverted. Most of withFile and openFile now use it.	2025-01-21 14:26:04 -04:00
Joey Hess	1ceece3108	RawFilePath conversion of System.Directory By using System.Directory.OsPath, which takes and returns OsString, which is a ShortByteString. So, things like dirContents currently have the overhead of copying that to a ByteString, but that should be less than the overhead of using Strings which often in turn were converted to RawFilePaths. Added Utility.OsString and the OsString build flag. That flag is turned on in the stack.yaml, and will be turned on automatically by cabal when built with new enough libraries. The stack.yaml change is a bit ugly, and that could be reverted for now if it causes any problems. Note that Utility.OsString.toOsString on windows is avoiding only a check of encoding that is documented as being unlikely to fail. I don't think it can fail in git-annex; if it could, git-annex didn't contain such an encoding check before, so at worst that should be a wash.	2025-01-20 19:17:33 -04:00
Joey Hess	9e4314de76	relax annex-tracking-branch to allow "/" Allow setting remote.foo.annex-tracking-branch to a branch name that contains "/", as long as it's not a remote tracking branch.	2025-01-20 11:31:18 -04:00
Joey Hess	5df1b2b36e	configs annex.post-update-command and annex.pre-commit-command Added git configs annex.post-update-command and annex.pre-commit-command that correspond to the git-annex hook scripts post-update-annex and pre-commit-annex. Note that the hook files take precedence over the git config, since the git config can include global config which should be overridden by local config. These new git configs are probably not super useful. Especially the pre-commit-annex hook is there to install scripts to instead of the pre-commit hook, since git-annex installs that hook itself. So why would someone want to use a git config for that? Only reason I can think of would be in a global git config. Or possibly because it's easier to set a git config than write a hook script, on an OS like Windows. The real reason I'm adding these is as groundwork for making other annex.-command git configs also be available as hook scripts. I want to avoid having some things available as only git hooks and others as both gitconfigs and git hooks. (It seems that some annex.-command configs don't translate to git hooks though.) In the man page, moved documentation of the hooks to be next to the documentation of the git configs. This is to avoid repetition.	2025-01-10 13:27:51 -04:00
Joey Hess	0815c82bb1	log: Support --key, as well as --branch and --unused --all remains a special case, since it is more efficient and displays in a nicer order. Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project	2025-01-03 15:45:42 -04:00
Joey Hess	da5e195597	remove i386ancient and need at least debian stable to build * Removed the i386ancient standalone tarball build for linux, which was increasingly unable to support new git-annex features. * Removed support for building with ghc older than 9.0.2, and with older versions of haskell libraries than are in current Debian stable. * stack.yaml: Update to lts-23.2. Note that i386ancient was targeting linux 2.6.32, which has been EOL for over 9 years now. Any old system still using such a kernel is certainly highly insecure. And I suspect i386ancient had its own insecurities due to haskell libraries and C libraries not having been updated.	2025-01-01 14:15:55 -04:00
Joey Hess	29b3c7c660	annex.addunlocked support for tree imports Honor annex.addunlocked configuration when importing a tree from a special remote. Note, in a --no-content import, the object file will not be populated (usually) and so expressions that match on mime type will not match. Tested this and it works ok, the file just ends up locked. Updated docs for the mime expressions to mention that they can't match when the file is not present. Note that in Command.Sync.pullThirdPartyPopulated, recordImportTree is called without a AddUnlockedMatcher. Since the tree generated here is not exposed to the user and does not contain usual filenames, there is no need of the overhead of checking it.	2024-12-19 11:43:51 -04:00
Joey Hess	7d8558548b	empty preferred content * Document that setting preferred content to "" is the same as the default unset behavior. * sync: Avoid misleading warning about future preferred content transition when preferred content is set to "".	2024-12-13 13:26:48 -04:00
Joey Hess	4c785c338a	p2phttp: notice when new repositories are added to --directory When a uuid is not known, rescan for new repositories. Easy. When a repository is removed, it will also get removed from the server state on the next scan. But until a new uuid is seen, there will not be a scan. This leaves the server trying to serve a uuid whose repository is gone. That seems buggy. While getting just fails, dropping fails the first time, but seems to leave the server in an unusable state, so the next drop attempt hangs. The server is still able to serve other uuids, only the one whose repository was removed has that problem.	2024-11-21 15:09:12 -04:00
Joey Hess	758ea89c74	skip over repositories in --directory that do not have annex.uuid set	2024-11-21 14:18:18 -04:00
Joey Hess	3c18398d5a	p2phttp support --jobs with --directory --jobs is usually an Annex option setter, but --directory runs in IO, so would not have that available. So instead moved the option parser into the command's Options.	2024-11-21 14:15:14 -04:00

3026 commits