Added annex.fastcopy and remote.name.annex-fastcopy config settings. When
set, this allows the copy_file_range syscall to be used, which can, eg,
allow server-side copies on NFS. (For fastest copying, also disable
annex.verify or remote.name.annex-verify.)
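For example (a sketch; "nfsremote" is a placeholder remote name, and the
boolean values assume these behave like other git-annex boolean configs):

    # allow copy_file_range for all remotes
    git config annex.fastcopy true
    # or enable it for a single remote
    git config remote.nfsremote.annex-fastcopy true
    # optionally, for fastest copying
    git config remote.nfsremote.annex-verify false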
This is a simple implementation that does not handle resuming as well as
it possibly could.
It can be used with both local git remotes (including on NFS), and
directory special remotes. Other types of remotes could in theory also
support it, so I've left the config documented as a general thing.
Added a per-remote version of the annex.web-options config.
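Assuming it follows the usual remote.<name>.annex-* naming, usage might look
like this (the remote name and curl option are illustrative only):

    # pass extra options to curl for just this remote
    git config remote.origin.annex-web-options "--max-time 60"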
Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a
special remote does not use curl, there was no need to do that and I used
Nothing instead.
In the case of the addurl and importfeed commands, it seemed best to say
that running these commands is not using the web special remote per se,
so the config is not used for those commands.
And require this for enable as well as autoenable.
It seemed like asking for trouble for `git-annex enable foo` to use whatever
compute program is stored in the git config, without verifying that the
user wants that program to be used.
Note that it would be good to allow `git-annex enable foo program=...`
to be used without the program being in the git config. Not implemented yet
though.
Added annex.security.autoenable-compute-programs and only allow
autoenabling special remotes that use compute programs on that list.
The reason this is needed is that a user might have some compute programs
that are less safe to use than others. They might want to use an unsafe
one only with one repository, where they are the only committer or the
other committers are trusted. They might be ok with other programs being
used by any repository, and if so they can add them to the list.
Another reason would be a user who has installed a compute program by
accident. Eg, it might be included with git-annex at some point, or
pulled in by some dependency. That user doesn't necessarily want that
compute program to be used in an autoenabled special remote.
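A sketch of what setting the list might look like (the program names and
the space-separated list syntax here are both assumptions):

    git config annex.security.autoenable-compute-programs "git-annex-compute-foo git-annex-compute-bar"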
However, filepath-bytestring is still in Setup-Depends.
That's because Utility.OsPath uses it when not built with OsPath.
It might be possible to make Utility.OsPath fall back to using
filepath, and eliminate that dependency too, but it would mean either
wrapping all of System.FilePath's functions, or using `type OsPath = FilePath`.
Annex.Import uses ifdefs to avoid converting back to FilePath when not
on Windows. On Windows it's a bit slower due to that conversion.
Utility.Path.Windows.convertToWindowsNativeNamespace got a bit
slower too, but not really worth optimising I think.
Note that importing Utility.FileSystemEncoding at the same time as
System.Posix.ByteString will result in conflicting definitions for
RawFilePath. filepath-bytestring avoids that by importing RawFilePath
from System.Posix.ByteString, but that's not possible in
Utility.FileSystemEncoding, since Setup-Depends does not include unix.
This turned out not to affect any code in git-annex though.
Sponsored-by: Leon Schuermann
keyFile has a nice improvement; since a Key is a ShortByteString, it can
be converted to an OsPath without needing the copy that was done before.
Unfortunately, fileKey has to convert from a ShortByteString to a
ByteString in order to use attoparsec, and then the results get
converted back to an OsPath, so there are now 2 copies.
Maybe attoparsec will eventually get a ShortByteString API,
see https://github.com/haskell/attoparsec/issues/225
Sponsored-by: Joshua Antonishen
Decent win in exportDirectories, since it operates on ShortByteString
end to end now without needing conversion. That made it worth
implementing an OsPath specific code path there.
And ExportLocation already being a ShortByteString is a good example of why
it's a good thing that OsPath uses that!
Sponsored-by: k0ld on Patreon
This removes readFileStrict, using file-io readFile' instead.
Had to deal with newline conversion, which readFileStrict does on
Windows. In a few cases, that was pretty ugly to deal with.
Sponsored-by: Kevin Mueller
By using System.Directory.OsPath, which takes and returns OsString,
which is a ShortByteString. So, things like dirContents currently have the
overhead of copying that to a ByteString, but that should be less than
the overhead of using Strings which often in turn were converted to
RawFilePaths.
Added Utility.OsString and the OsString build flag. That flag is turned
on in the stack.yaml, and will be turned on automatically by cabal when
built with new enough libraries. The stack.yaml change is a bit ugly,
and that could be reverted for now if it causes any problems.
Note that Utility.OsString.toOsString on Windows avoids only a
check of encoding that is documented as being unlikely to fail. I don't
think it can fail in git-annex; if it could, git-annex didn't contain
such an encoding check before, so at worst that should be a wash.
Added annex.pre-init-command git config and pre-init-annex hook that is run
before git-annex repository initialization.
This can block initialization. Or it can perform pre-initialization
configuration or tweaking.
I left stdio connected while it's running, so it could conceivably also be
used for interactive prompting, although that would probably want to use
/dev/tty in order to not pollute the stdout of a command when automatic
initialization is done.
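For illustration, a hypothetical setup (the script path is a placeholder,
and presumably a nonzero exit is what blocks initialization):

    # run a policy check before any initialization of this repository
    git config annex.pre-init-command /usr/local/bin/annex-init-policy
    # or, assuming it installs like the other git-annex hook scripts
    cp /usr/local/bin/annex-init-policy .git/hooks/pre-init-annex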
Sponsored-by: Dartmouth College's OpenNeuro project
Added git configs annex.post-update-command and annex.pre-commit-command
that correspond to the git-annex hook scripts post-update-annex and
pre-commit-annex.
Note that the hook files take precedence over the git config, since the git
config can include global config, which should be overridden by local config.
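Hypothetical examples (both script paths are placeholders):

    # corresponds to the post-update-annex hook script
    git config annex.post-update-command /path/to/notify-mirror
    # corresponds to the pre-commit-annex hook script
    git config annex.pre-commit-command /path/to/fixup-staged-files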
These new git configs are probably not super useful. The pre-commit-annex
hook especially exists so that scripts can be installed to it instead of to
the pre-commit hook, since git-annex installs that hook itself. So why would
someone want to use a git config for that? The only reasons I can think of
would be in a global git config, or because it's easier to set a git config
than to write a hook script on an OS like Windows.
The real reason I'm adding these is as groundwork for making other
annex.*-command git configs also be available as hook scripts. I want
to avoid having some things available as only git hooks and others as
both git configs and git hooks. (It seems that some annex.*-command configs
don't translate to git hooks though.)
In the man page, moved documentation of the hooks to be next to the
documentation of the git configs. This is to avoid repetition.
Added config `url.<base>.annexInsteadOf` corresponding to git's
`url.<base>.pushInsteadOf`, to configure the urls to use for accessing the
git-annex repositories on a server without needing to configure
remote.name.annexUrl in each repository.
While one use case for this would be rewriting urls to use annex+http,
I decided not to add any kind of special case for that. So while
git-annex p2phttp, when serving multiple repositories, needs a url
of eg "annex+http://example.com/git-annex/" for each of them, rewriting a
url like "https://example.com/git/foo/bar" with this config set to
"https://example.com/git/" will result in eg
"annex+http://example.com/git-annex/foo/bar", which p2phttp does not
support.
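Rendered as config, that rewriting example would look something like this
(and would produce exactly the unsupported url above):

    [url "annex+http://example.com/git-annex/"]
        annexInsteadOf = https://example.com/git/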
That seems better dealt with in either git-annex p2phttp or a http
middleware, rather than complicating the config with a special case for
annex+http.
Anyway, there are other use cases for this that don't involve annex+http.
I anticipate lots of external special remote programs will neglect
implementing this. Still, it's the right thing to do to assume that some
of them may write files out of order. Probably most external special
remotes will not be used with a proxy. When someone is using one with a
proxy, they can always get it fixed to send ORDERED.
The problem was that when the proxy requests a key be retrieved to its
own temp file, fileRetriever was retrieving it to the key's temp
location, and then moving it at the end, which broke streaming.
So, plumb through the path where the key is being retrieved to.
Had to add Read instances to Key and NumCopies and some other similar
types. I only expect to use those in serializing a sim. Of course, this
risks that implementation changes break reading old data. For a sim,
that would not be a big problem.
Not tested yet. This emulates the same checking that is done when
dropping. Note that when dropping from a special remote it is not able
to make a locked copy.
Detect when a preferred content expression contains "not present", which
would lead to repeatedly getting and then dropping files, and make it never
match. This also applies to "not balanced" and "not sizebalanced".
--explain will tell the user when this happens.
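For example (a sketch; "here" is the current repository):

    git-annex wanted here "not present"
    # --auto consults preferred content; --explain will report that the
    # expression contains "not present" and so never matches
    git-annex get --auto --explain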
Note that getMatcher calls matchMrun' and does not check for unstable
negated limits. While there is no --present anyway, if there were,
it would not make sense for --not --present to complain about
instability and fail to match.
Reorganized the reposize database directory, and split up a column.
checkStaleSizeChanges needs to run before needLiveUpdate,
otherwise the process won't be holding a lock on its pid file, and
another process could go in and expire the live update it records. It
just so happens that they do get called in the correct order, since
checking balanced preferred content calls getLiveRepoSizes before
needLiveUpdate.
The 1 minute delay between checks is arbitrary, but will avoid excess
work. The downside of it is that, if a process is dropping a file and
gets interrupted, for 1 minute another process can expect a repository
will soon be smaller than it is. And so a process might send data to a
repository when a file is not really going to be dropped from it. But
note that can already happen if a drop takes some time in eg locking and
then fails. So it seems possible that live updates should only be
allowed to increase, rather than decrease the size of a repository.
Only when the preferred content expression being matched uses balanced
preferred content is this overhead needed.
It might be possible to eliminate the locking entirely. Eg, check the
live changes before and after the action and re-run if they are not
stable. For now, this is good enough, it avoids existing preferred
content getting slow. If balanced preferred content turns out to be too
slow to check, that could be tried later.
Fixed successfullyFinishedLiveSizeChange to not update the rolling total
when a redundant change is in RecentChanges.
Made setRepoSizes clear RecentChanges that are no longer needed.
It might be possible to clear those earlier, this is only a convenient
point to do it.
The reason it's safe to clear RecentChanges here is that, in order for a
live update to call successfullyFinishedLiveSizeChange, a change must be
made to a location log. If a RecentChange gets cleared, and just after
that a new live update is started, making the same change, the location
log has already been changed (since the RecentChange exists), and
so when the live update succeeds, it won't call
successfullyFinishedLiveSizeChange. The reason it doesn't
clear RecentChanges when there is a redundant live update is that
I didn't want to think through whether or not all races are avoided in
that case.
The rolling total in SizeChanges is never cleared. Instead,
calcJournalledRepoSizes gets the initial value of it, and then
getLiveRepoSizes subtracts that initial value from the current value.
Since the rolling total can only be updated by updateRepoSize,
which is called with the journal locked, locking the journal in
calcJournalledRepoSizes ensures that the database does not change while
reading the journal.
When finishedLiveUpdate was run on a different key than expected, it
blocked forever waiting for an indication the database had been updated.
Since the journal is locked when finishedLiveUpdate runs, this could
also have caused other git-annex commands to hang.
When a live size change completes successfully, the same transaction
that removes it from the database updates the rolling total for its
repository.
The idea is that when RepoSizes is read, SizeChanges will be as
well, and cached locally. Any time a change is made, the local cache
will be updated. So by comparing the local cache with the current
SizeChanges, it can learn about size changes that were made by other
processes. Then read the LiveSizeChanges, and add that in to get a live
picture of the current sizes.
Also added a SizeChangeId. This allows 2 different threads, or
processes, to both record a live size change for the same repo and key,
and update their own information without stepping on one another's toes.
I've tested the behavior of the thread that waits for the LiveUpdate to
be finished, and it does get signaled and exit cleanly when the
LiveUpdate is GCed instead.
Made finishedLiveUpdate wait for the thread to finish updating the
database.
There is a case where GC doesn't happen in time and the database is left
with a live update recorded in it. This should not be a problem, as such
stale data can also happen when a process is interrupted, and will need to
be detected when loading the database.
Balanced preferred content expressions now call startLiveUpdate.
Each command that first checks preferred content (and/or required
content) and then does something that can change the sizes of
repositories needs to call prepareLiveUpdate, and plumb it through the
preferred content check and the location log update.
So far, only Command.Drop is done. Many other commands that don't need
to do this have been updated to keep working.
There may be some calls to NoLiveUpdate in places where that should be
done. All will need to be double checked.
Not currently in a compilable state.