git-annex

Author	SHA1	Message	Date
Joey Hess	2f20b939b7	LiveUpdate db updates working I've tested the behavior of the thread that waits for the LiveUpdate to be finished, and it does get signaled and exit cleanly when the LiveUpdate is GCed instead. Made finishedLiveUpdate wait for the thread to finish updating the database. There is a case where GC doesn't happen in time and the database is left with a live update recorded in it. This should not be a problem as such stale data can also happen when interrupted and will need to be detected when loading the database. Balanced preferred content expressions now call startLiveUpdate.	2024-08-24 11:49:58 -04:00
Joey Hess	84d1bb746b	LiveUpdate for clusters	2024-08-24 10:20:12 -04:00
Joey Hess	18cd8bf43a	punt on LiveUpdate plumbing through assistant for now	2024-08-24 09:37:24 -04:00
yarikoptic	efdee386c0	initial report on desire to handle pathspecs	2024-08-24 01:35:31 +00:00
yarikoptic	c3877f648c	initial idea on another ability for get	2024-08-24 01:23:04 +00:00
Joey Hess	c3d40b9ec3	plumb in LiveUpdate (WIP) Each command that first checks preferred content (and/or required content) and then does something that can change the sizes of repositories needs to call prepareLiveUpdate, and plumb it through the preferred content check and the location log update. So far, only Command.Drop is done. Many other commands that don't need to do this have been updated to keep working. There may be some calls to NoLiveUpdate in places where that should be done. All will need to be double checked. Not currently in a compilable state.	2024-08-23 16:35:12 -04:00
Joey Hess	4885073377	add live size changes to RepoSize database Not yet used.	2024-08-23 12:51:00 -04:00
Joey Hess	dad1fb150f	update	2024-08-23 11:45:36 -04:00
Joey Hess	d0ab1550ec	possible design to address reposizes concurrency issues	2024-08-23 11:19:38 -04:00
gauss@055c9051f507c97fa5612f46c74ce636f5ecde10	d71ca87bc9	Added a comment: No root privileges server - annex-shell replaced by git-annex-shell	2024-08-23 01:51:49 +00:00
Joey Hess	8ade3fc5d6	improve docs	2024-08-22 08:09:10 -04:00
Joey Hess	abdd49d8c1	update	2024-08-22 07:53:56 -04:00
Joey Hess	173500872f	update	2024-08-22 07:17:04 -04:00
Joey Hess	70e2fca257	Added the annex.fullybalancedthreshhold git config.	2024-08-22 07:15:55 -04:00
Joey Hess	3fe67744b1	display new empty repos in maxsize table A new repo that has no location log info yet, but has an entry in uuid.log has 0 size, so make RepoSize aware of that. Note that a new repo that does not yet appear in uuid.log will still not be displayed. When a remote is added but not synced with yet, it has no uuid.log entry. If git-annex maxsize is used to configure that remote, it needs to appear in the maxsize table, and the change to Command.MaxSize takes care of that.	2024-08-22 07:03:22 -04:00
Spencer	acaa8e9cd5	Added a comment: Precise Workflow	2024-08-22 00:18:28 +00:00
Joey Hess	76ece2a699	make --rebalance of balanced use fullysizebalanced when useful When the specified number of copies is > 1, and some repositories are too full, it can be better to move content from them to other less full repositories, in order to make space for new content. annex.fullybalancedthreshhold is documented, but not implemented yet. This is not tested very well yet, and is known to sometimes take several runs to stabilize.	2024-08-21 17:59:08 -04:00
Joey Hess	9e87061de2	Support "sizebalanced=" and "fullysizebalanced=" too Might want to make --rebalance turn balanced=group:N where N > 1 to fullysizebalanced=group:N. Have not yet determined if that will improve situations enough to be worth the extra work.	2024-08-21 15:01:54 -04:00
Joey Hess	4e1dcc0372	bug	2024-08-21 12:18:31 -04:00
Joey Hess	476d223bce	implement fullbalanced=group:N Rebalancing this when it gets into a suboptimal situation will need further work.	2024-08-20 13:51:02 -04:00
Matthew	4a9e637d36	Added a comment: Help with .nfsXXXX files	2024-08-19 21:20:59 +00:00
matrss	9cfdae4c3b	Added a comment	2024-08-19 10:25:13 +00:00
Joey Hess	68a99a8f48	size based rebalancing design	2024-08-18 16:25:12 -04:00
Joey Hess	99514f9d18	maxsize overview display and --json support	2024-08-18 12:08:13 -04:00
xentac	74b953cded	Added a comment	2024-08-18 03:17:12 +00:00
Joey Hess	f985c58d8e	consistently don't show sizes of empty repositories This used to be the case, and when matching options are used, that code path still omits them, so also omit them in the getRepoSize code path.	2024-08-17 15:09:16 -04:00
Joey Hess	b62b58b50b	git-annex info speed up using getRepoSizes	2024-08-17 14:54:31 -04:00
Joey Hess	d09a005f2b	update RepoSize database from git-annex branch incrementally The use of catObjectStream is optimally fast. Although it might be possible to combine this with git-annex branch merge to avoid some redundant work. Benchmarking, a git-annex branch that had 100000 files changed took less than 1.88 seconds to run through this.	2024-08-17 13:35:00 -04:00
Spencer	40b49e2ddd	Added a comment: Remote Helper?	2024-08-17 05:33:01 +00:00
matrss	bcf876e3a0		2024-08-16 15:52:32 +00:00
matrss	f057010086	Added a comment	2024-08-16 15:45:45 +00:00
Joey Hess	61d95627f3	fix Annex.repoSize sharing between threads	2024-08-16 10:56:51 -04:00
Joey Hess	e361b9ea3c	todo	2024-08-15 16:15:48 -04:00
Joey Hess	63ccf6ffa7	todo	2024-08-15 13:50:50 -04:00
Joey Hess	4a0c7e2b2c	update	2024-08-15 13:41:47 -04:00
Joey Hess	a2da9c526b	RepoSize concurrency fix When loading the journalled repo sizes, make sure that the current process is prevented from making changes to the journal in another thread.	2024-08-15 13:37:41 -04:00
Joey Hess	06064f897c	update Annex.reposizes when changing location logs The live update is only needed when Annex.reposizes has already been populated.	2024-08-15 13:27:14 -04:00
Joey Hess	c376b1bd7e	show message when doing possibly expensive from scratch reposize calculation	2024-08-15 12:42:36 -04:00
Joey Hess	c200523bac	implement getRepoSizes At this point the RepoSize database is getting populated, and it all seems to be working correctly. Incremental updates still need to be done to make it performant.	2024-08-15 12:31:56 -04:00
Joey Hess	eac4e9391b	finalize RepoSize database Including locking on creation, handling of permissions errors, and setting repo sizes. I'm confident that locking is not needed while using this database, since writes happen in a single transaction. When there are two writers that are recording sizes based on different git-annex branch commits, one will overwrite what the other one recorded. Which is fine, it's only necessary that the database stays consistent with the content of a git-annex branch commit.	2024-08-15 12:29:34 -04:00
Atemu	e8997d8899	Added a comment	2024-08-15 15:40:20 +00:00
Joey Hess	3e6eb2a58d	implement journalledRepoSizes Plan is to run this when populating Annex.reposizes on demand. So Annex.reposizes will be up-to-date with the journal, including crucially journal entries for private repositories. But also anything that has been written to the journal by another process, especially if the process was run with annex.alwayscommit=false. From there, Annex.reposizes can be kept up to date with changes made by the running process.	2024-08-14 13:53:24 -04:00
pedro-lopes-de-azevedo	c75ecc5350	Added a comment: parameter --from not accepted	2024-08-14 14:27:54 +00:00
bvaa	11eb2ae6ec	Added a comment	2024-08-14 07:18:26 +00:00
Joey Hess	90a79a6c1e	plan	2024-08-13 15:13:30 -04:00
Joey Hess	a979d8da41	update	2024-08-13 14:14:47 -04:00
Joey Hess	10d8b3cc63	fixed --rebalance stability on drop Was checking the wrong uuid, oops	2024-08-13 13:32:11 -04:00
Joey Hess	745bc5c547	take maxsize into account for balanced preferred content This is very inefficient, it will need to be optimised not to calculate the sizes of repos every time. Also, fixed a bug in balancedPicker that caused it to pick a too high index when some repos were excluded due to being full.	2024-08-13 11:00:20 -04:00
Spencer	05a62e4e5f	Added a comment: Workaround: --force-small	2024-08-13 07:05:57 +00:00
Spencer	3d252da06c	Added a comment: Exact Moment Things Go Wrong	2024-08-13 06:22:11 +00:00
Spencer	ab5f920d77	.md linting	2024-08-13 04:46:53 +00:00
Spencer	8a91a8c208		2024-08-13 04:46:10 +00:00
Spencer	c4296fbd45	Added a comment: Still a Problem (on Mac?)	2024-08-13 04:21:33 +00:00
ewen	491cf67ce2	Added a comment: Most servers upgraded to TLS v1.2 EMS / TLS v1.3	2024-08-13 00:01:05 +00:00
Joey Hess	b201792391	update	2024-08-12 18:57:03 -04:00
Joey Hess	1e799e7842	update	2024-08-12 11:56:52 -04:00
Joey Hess	71043fe9f7	update	2024-08-12 10:01:48 -04:00
Joey Hess	bcd2b9a5c4	idea	2024-08-12 09:43:14 -04:00
Joey Hess	1265d7e5df	implement maxsize log and command * maxsize: New command to tell git-annex how large the expected maximum size of a repository is. * vicfg: Include maxsize configuration.	2024-08-11 15:41:26 -04:00
Joey Hess	3019b21c40	more formal documentation of balancing	2024-08-11 13:29:06 -04:00
Joey Hess	bd5affa362	use hmac in balanced preferred content This deals with the possible security problem that someone could make an unusually low UUID and generate keys that are all constructed to hash to a number that, mod the number of repositories in the group, == 0. So balanced preferred content would always put those keys in the repository with the low UUID as long as the group contains the number of repositories that the attacker anticipated. Presumably the attacker then holds the data for ransom? Dunno. Anyway, the partial solution is to use HMAC (sha256) with all the UUIDs combined together as the "secret", and the key as the "message". Now any change in the set of UUIDs in a group will invalidate the attacker's constructed keys from hashing to anything in particular. Given that there are plenty of other things someone can do if they can write to the repository -- including modifying preferred content so only their repository wants files, and numcopies so other repositories drop them -- this seems like safeguard enough. Note that, in balancedPicker, combineduuids is memoized.	2024-08-10 16:32:54 -04:00
Joey Hess	bde58e6c71	todo	2024-08-09 16:57:10 -04:00
Joey Hess	412f6057e4	todo	2024-08-09 16:47:28 -04:00
xentac	fb186ab0a8	Added a comment	2024-08-09 19:31:12 +00:00
xentac	55a5cb7904		2024-08-09 19:22:19 +00:00
Joey Hess	f1cb5cb908	wrote git-annex maxsize man page	2024-08-09 14:57:11 -04:00
Joey Hess	5a6afff3d6	left off number option	2024-08-09 14:22:05 -04:00
Joey Hess	3ce2e95a5f	balanced preferred content and --rebalance This all works fine. But it doesn't check repository sizes yet, and without repository size checking, once a repository gets full, there will be no other repository that will want its files. Use of sha2 seems unnecessary, probably adler32 or md5 or crc would have been enough. Possibly just summing up the bytes of the key mod the number of repositories would have sufficed. But sha2 is there, and probably hardware accelerated. I doubt very much there is any security benefit to using it though. If someone wants to construct a key that will be balanced onto a given repository, sha2 is certainly not going to stop them.	2024-08-09 14:16:09 -04:00
Joey Hess	152c87140b	update	2024-08-08 16:06:02 -04:00
Joey Hess	0959bfe5d3	update for exporttree=yes	2024-08-08 15:51:36 -04:00
Joey Hess	727b6a0b6d	update	2024-08-08 15:34:36 -04:00
Joey Hess	2616056cde	Merge branch 'exportreeplus'	2024-08-08 15:31:57 -04:00
Joey Hess	3b758aaad6	add news item for git-annex 10.20240808	2024-08-08 15:27:11 -04:00
Joey Hess	3ea835c7e8	proxied exporttree=yes versionedexport=yes remotes are not untrusted This removes versionedExport, which was only used by the S3 special remote. Instead, versionedexport=yes is a common way for remotes to indicate that they are versioned.	2024-08-08 15:24:19 -04:00
Joey Hess	5c36177e58	proxied exporttree=yes remotes are untrustworthy This is not perfect because it does not handle versioned special remotes, which should not be untrustworthy, but now are when proxied. The implementation turned out to be easy, because the exporttree field is a default field, so is available in RemoteConfig even for git remotes.	2024-08-08 14:43:53 -04:00
Joey Hess	b23c7f769e	update	2024-08-08 14:25:18 -04:00
Joey Hess	9663888c77	update	2024-08-08 14:05:05 -04:00
Joey Hess	a2eb3b450a	post-receive: use the exporttree=yes remote as a source This handles cases where a single key is used by multiple files in the exported tree. When using `git-annex push`, the key's content gets stored in the annexobjects location, and then when the branch is pushed, it gets renamed from the annexobjects location to the first exported file. For subsequent exported files, a copy of the content needs to be made. This causes it to download the key from the remote in order to upload another copy to it. This is not needed when using `git push` followed by `git-annex copy --to` the proxied remote, because the received key is stored at all export locations then. Also, fixed handling of the synced branch push, it was exporting master when synced/master was pushed. Note that currently, the first push to the remote does not see that it is able to get a key from it in order to upload it back. It displays "(not available)". The second push is able to. Since git-annex push pushes first the synced branch and then the branch, this does end up with a full export being made, but it is not quite right.	2024-08-08 13:49:53 -04:00
Joey Hess	7294d23d78	export: Added --from option This is similar to git-annex copy --from --to, in that it downloads a local copy, locks it for removal, uploads it, and drops it. Removal of the temporary local copy is done without verifying numcopies for the same reason as that command. I do wonder, looking at this, if there's a race where the local copy gets used as a copy to allow some other drop in the narrow window after it is downloaded and before it gets locked for removal. That would need some other repository to have an out of date location log that says the repository contains a copy of the key, in order for it to try to use it as a copy. If there is such a race, git-annex copy/move would also be vulnerable to it. It would be better to lock it for removal before starting to download it! That is possible in v10 repositories, which do use a separate content lock file. Note that, when the exported tree contains several files that use the same key, it will be downloaded repeatedly, once per time needed to upload it. It would be possible to avoid that extra work, but it would complicate this since the local copy would need to be preserved, locked for removal, until the end. Also, that would mean that interrupting the export would leave possibly a lot of temporarily downloaded keys in the local repository, while currently it can only leave one.	2024-08-08 12:08:55 -04:00
Joey Hess	01edd186e9	update proxied exporttree=yes remote on receive of sync branch Since git-annex sync sends the sync branch first, and only displays the output of the push to the sync branch, this makes git-annex post-receive's output when updating the exported tree be visible when syncing. This also makes syncing with a non-bare repository still update the exported tree, even when the checked out branch is not able to be updated. The sync branch gets sent regardless.	2024-08-07 13:11:06 -04:00
Joey Hess	55adbb6694	avoid trying to export tree to proxied exporttree=yes remotes This avoids a lot of ugly messages when syncing with such a remote. The export tree happens on the proxy side.	2024-08-07 13:00:19 -04:00
Joey Hess	6d96734128	updateproxy, updatecluster check annexobjects=yes updateproxy, updatecluster: Prevent using an exporttree=yes special remote that does not have annexobjects=yes, since it will not work.	2024-08-07 12:27:24 -04:00
Joey Hess	8864a9e353	update	2024-08-07 11:49:53 -04:00
Joey Hess	1e0f13ad7f	comment	2024-08-07 11:39:29 -04:00
Joey Hess	b8f8c38e88	Merge branch 'master' into exportreeplus	2024-08-07 11:28:21 -04:00
Joey Hess	509b23fa00	catch ClientError from withClientM When getting from a P2P HTTP remote, prompt for credentials when required, instead of failing. This feels like it might be a bug in servant-client. withClientM's type suggests it would not throw a ClientError. But it does in this case.	2024-08-07 11:24:34 -04:00
Joey Hess	43e1f590c9	comment	2024-08-07 10:47:47 -04:00
Joey Hess	1038567881	proxy stores received keys to known export locations This handles the workflow where the branch is first pushed to the proxy, and then files in the exported tree are later copied to the proxied remote. Turns out that the way the export log is structured, nothing needs to be done to finalize the export once the last key is sent to it. Which is great because that would have been a lot of complication. On receiving the push, Command.Export runs and calls recordExportBeginning, does as much as it can to update the export with the files currently on it, and then calls recordExportUnderway. At that point, the export.log records the export as "complete", but it's not really. And that's fine. The same happens when using `git-annex export` when some files are not available to send. Other repositories that have access to the special remote can already retrieve files from it. As the missing files get copied to the exported remote, all that needs to be done is record each in the export db. At this point, proxying to exporttree=yes annexobjects=yes special remotes is fully working. Except for in the case where multiple files in the tree use the same key, and the files are sent to the proxied remote before pushing the tree. It seems that even special remotes without annexobjects=yes will work if used with the workflow where the git-annex branch is pushed before copying files. But not with the `git-annex push` workflow.	2024-08-07 09:47:34 -04:00
matrss	3ccbcc5662		2024-08-07 12:12:29 +00:00
git-annex@82b5fddc759dffdf749b19add6f0be2a0c78b62c	d3cc84db3b		2024-08-07 12:05:53 +00:00
git-annex@82b5fddc759dffdf749b19add6f0be2a0c78b62c	e8f60e7daa		2024-08-07 12:04:42 +00:00
Joey Hess	ba1cb517c0	update	2024-08-06 14:46:56 -04:00
Joey Hess	c53f61e93f	Merge branch 'master' into exportreeplus	2024-08-06 14:46:33 -04:00
Joey Hess	f01d872059	fixed	2024-08-06 14:42:46 -04:00
Joey Hess	3289b1ad02	proxying to exporttree=yes annexobjects=yes basically working It works when using git-annex sync/push/assist, or when manually sending all content to the proxied remote before pushing to the proxy remote. But when the push comes before the content is sent, sending content does not update the exported tree.	2024-08-06 14:21:23 -04:00
Joey Hess	be5c86c248	refine	2024-08-06 12:15:18 -04:00
Joey Hess	4750ffbd3b	finalized design for proxying to exporttree=yes annexobjects=yes special remotes	2024-08-06 11:45:45 -04:00
Joey Hess	84d27cf34f	update	2024-08-06 11:13:51 -04:00
matrss	6d1592f857		2024-08-06 12:44:18 +00:00
Spencer	66ff2bc833	Added a comment: D: Correct	2024-08-05 22:17:55 +00:00

34698 commits