git-annex

Author	SHA1	Message	Date
Joey Hess	76ece2a699	make --rebalance of balanced use fullysizebalanced when useful When the specified number of copies is > 1, and some repositories are too full, it can be better to move content from them to other less full repositories, in order to make space for new content. annex.fullybalancedthreshhold is documented, but not implemented yet This is not tested very well yet, and is known to sometimes take several runs to stabilize.	2024-08-21 17:59:08 -04:00
Joey Hess	9e87061de2	Support "sizebalanced=" and "fullysizebalanced=" too Might want to make --rebalance turn balanced=group:N where N > 1 to fullysizebalanced=group:N. Have not yet determined if that will improve situations enough to be worth the extra work.	2024-08-21 15:01:54 -04:00
Joey Hess	4e1dcc0372	bug	2024-08-21 12:18:31 -04:00
Joey Hess	2ec4602e36	fix column width	2024-08-21 12:18:16 -04:00
Joey Hess	de7ac1bb70	fix	2024-08-20 13:52:46 -04:00
Joey Hess	476d223bce	implement fullbalanced=group:N Rebalancing this when it gets into a suboptimal situation will need further work.	2024-08-20 13:51:02 -04:00
Matthew	4a9e637d36	Added a comment: Help with .nfsXXXX files	2024-08-19 21:20:59 +00:00
Joey Hess	d4b2f8201d	add %full field to table	2024-08-19 11:41:48 -04:00
matrss	9cfdae4c3b	Added a comment	2024-08-19 10:25:13 +00:00
Joey Hess	68a99a8f48	size based rebalancing design	2024-08-18 16:25:12 -04:00
Joey Hess	99514f9d18	maxsize overview display and --json support	2024-08-18 12:08:13 -04:00
xentac	74b953cded	Added a comment	2024-08-18 03:17:12 +00:00
Joey Hess	016edcf437	adjust countdown number for RepoSize update message Benchmarking a git-annex branch with half a million files changed, it takes about 1 minute to update the RepoSizes. So this will display the message after a few seconds.	2024-08-17 15:59:07 -04:00
Joey Hess	f985c58d8e	consistently don't show sizes of empty repositories This used to be the case, and when matching options are used, that code path still omits them, so also omit them in the getRepoSize code path.	2024-08-17 15:09:16 -04:00
Joey Hess	b62b58b50b	git-annex info speed up using getRepoSizes	2024-08-17 14:54:31 -04:00
Joey Hess	d09a005f2b	update RepoSize database from git-annex branch incrementally The use of catObjectStream is optimally fast. Although it might be possible to combine this with git-annex branch merge to avoid some redundant work. Benchmarking, a git-annex branch that had 100000 files changed took less than 1.88 seconds to run through this.	2024-08-17 13:35:00 -04:00
Joey Hess	8239824d92	consistently omit clusters when calculating RepoSizes updateRepoSize is only called on the UUID of a repository, not any cluster it might be a node of. But overLocationLogs and overLocationLogsJournal were including cluster UUIDs. So it was inconsistent. Currently I don't see any reason to calculate RepoSize for a cluster. It's not even clear what it should mean, the total size of all nodes, or the amount of information stored in the cluster in total?	2024-08-17 11:24:14 -04:00
Spencer	40b49e2ddd	Added a comment: Remote Helper?	2024-08-17 05:33:01 +00:00
matrss	bcf876e3a0		2024-08-16 15:52:32 +00:00
matrss	f057010086	Added a comment	2024-08-16 15:45:45 +00:00
Joey Hess	61d95627f3	fix Annex.repoSize sharing between threads	2024-08-16 10:56:51 -04:00
Joey Hess	e361b9ea3c	todo	2024-08-15 16:15:48 -04:00
Joey Hess	63ccf6ffa7	todo	2024-08-15 13:50:50 -04:00
Joey Hess	4a0c7e2b2c	update	2024-08-15 13:41:47 -04:00
Joey Hess	a2da9c526b	RepoSize concurrency fix When loading the journalled repo sizes, make sure that the current process is prevented from making changes to the journal in another thread.	2024-08-15 13:37:41 -04:00
Joey Hess	06064f897c	update Annex.reposizes when changing location logs The live update is only needed when Annex.reposizes has already been populated.	2024-08-15 13:27:14 -04:00
Joey Hess	c376b1bd7e	show message when doing possibly expensive from scratch reposize calculation	2024-08-15 12:42:36 -04:00
Joey Hess	c200523bac	implement getRepoSizes At this point the RepoSize database is getting populated, and it all seems to be working correctly. Incremental updates still need to be done to make it performant.	2024-08-15 12:31:56 -04:00
Joey Hess	bba23e7cc9	do not need a db queue This database is read once and written at most once per run.	2024-08-15 12:31:27 -04:00
Joey Hess	eac4e9391b	finalize RepoSize database Including locking on creation, handling of permissions errors, and setting repo sizes. I'm confident that locking is not needed while using this database. Since writes happen in a single transaction. When there are two writers that are recording sizes based on different git-annex branch commits, one will overwrite what the other one recorded. Which is fine, it's only necessary that the database stays consistent with the content of a git-annex branch commit.	2024-08-15 12:29:34 -04:00
Atemu	e8997d8899	Added a comment	2024-08-15 15:40:20 +00:00
Joey Hess	63a3cedc45	slightly improve hairy types	2024-08-14 16:04:18 -04:00
Joey Hess	3e6eb2a58d	implement journalledRepoSizes Plan is to run this when populating Annex.reposizes on demand. So Annex.reposizes will be up-to-date with the journal, including crucially journal entries for private repositories. But also anything that has been written to the journal by another process, especially if the process was run with annex.alwayscommit=false. From there, Annex.reposizes can be kept up to date with changes made by the running process.	2024-08-14 13:53:24 -04:00
pedro-lopes-de-azevedo	c75ecc5350	Added a comment: parameter --from not accepted	2024-08-14 14:27:54 +00:00
Joey Hess	8ac2685b33	calcBranchRepoSizes without journal files This will be used to prime the RepoSizes database, which will always contain values that correspond to information in the git-annex branch, so without anything from journal files. Factored out overJournalFileContents which will later be used to update Annex.reposizes to include information from journal files. This will be particularly important to support private UUIDs which only ever get to journal files and not to the branch.	2024-08-14 03:19:30 -04:00
bvaa	11eb2ae6ec	Added a comment	2024-08-14 07:18:26 +00:00
Joey Hess	90a79a6c1e	plan	2024-08-13 15:13:30 -04:00
Joey Hess	343c87db45	improve haddocks	2024-08-13 15:05:49 -04:00
Joey Hess	f612ebb934	avoid changing git-annex info behavior `5afbea25e7` changed it to ignore journal files that did not correspond to a key in the git-annex branch. However, when there is a private journal, that can happen. Neither behavior is fully correct, so keep the old incorrect behavior rather than introducing a new differently incorrect behavior. I plan to eventually make git-annex info use Annex.reposizes instead of calculating it itself, and once Annex.reposizes handles this all correctly, this will be a moot problem.	2024-08-13 14:17:20 -04:00
Joey Hess	a979d8da41	update	2024-08-13 14:14:47 -04:00
Joey Hess	08f55948e9	take all repository locations into account for balancing This fully fixes --rebalance stability, and also deals with an issue where a file is present in each balanced repository and it didn't want to remove it from any.	2024-08-13 13:46:47 -04:00
Joey Hess	10d8b3cc63	fixed --rebalance stability on drop Was checking the wrong uuid, oops	2024-08-13 13:32:11 -04:00
Joey Hess	5afbea25e7	avoid counting size of keys that are in the journal twice In calcRepoSizes and also git-annex info, when a key was in the journal, it was passed to the callback twice, so the calculated size was wrong.	2024-08-13 13:23:39 -04:00
Joey Hess	467d80101a	improve handling of unmerged git-annex branches in readonly repo git-annex info was displaying a message that didn't make sense in context. In calcRepoSizes, it seems better to return the information from the git-annex branch, rather than giving up. Especially since balanced preferred content uses it, and we can't just give up evaluating a preferred content expression if git-annex is to be usable in such a readonly repo. Commit `6d7ecd9e5d` nobly wanted git-annex to behave the same with such unmerged branches as it does when it can merge them. But for the purposes of preferred content, it seems to me there's a sense that such an unmerged branch is the same as a remote we have not pulled from. The balanced preferred content will either way operate under outdated information, and so make not the best choices.	2024-08-13 13:13:12 -04:00
Joey Hess	5c35b3d579	fix typo	2024-08-13 11:47:37 -04:00
Joey Hess	745bc5c547	take maxsize into account for balanced preferred content This is very inefficient, it will need to be optimised not to calculate the sizes of repos every time. Also, fixed a bug in balancedPicker that caused it to pick a too high index when some repos were excluded due to being full.	2024-08-13 11:00:20 -04:00
Spencer	05a62e4e5f	Added a comment: Workaround: --force-small	2024-08-13 07:05:57 +00:00
Spencer	3d252da06c	Added a comment: Exact Moment Things Go Wrong	2024-08-13 06:22:11 +00:00
Spencer	ab5f920d77	.md linting	2024-08-13 04:46:53 +00:00
Spencer	8a91a8c208		2024-08-13 04:46:10 +00:00

45560 commits