git-annex

Author	SHA1	Message	Date
Joey Hess	9d29b99ac4	add news item for git-annex 10.20240831	2024-08-31 19:50:36 -04:00
Joey Hess	b3dc656153	releasing package git-annex version 10.20240831	2024-08-31 19:50:26 -04:00
Joey Hess	35ff8c8c00	use Utility.PID fixes build on i386ancient	2024-08-30 14:56:38 -04:00
Joey Hess	3b74386ed4	fix liveupdate locking This fixes the build on Windows. Changed it to use lock pools, which will behave better if two threads call getLiveRepoSizes at the same time. Also this should make it work when annex.pidlock is set. In that case, once the current process locks this file, or anything, any other process will have to wait on the pid lock. So checkStaleSizeChanges will correctly identify any other live changes in the database as stale, since there can only be one git-annex process running.	2024-08-30 14:49:18 -04:00
Joey Hess	c10d11959a	fix paste oops Wow, I pasted a big thing into entirely the wrong file, but it was in a comment so it compiled anyway.	2024-08-30 14:35:05 -04:00
Joey Hess	698d9252a5	mention sizebalanced as well as balanced	2024-08-30 12:06:45 -04:00
Joey Hess	133584a83a	avoid locking the journal in readonly repository The test suite flagged that git-annex info in a readonly repository was no longer working. .git/annex/journal.lck: openFd: permission denied This fixes it, however, in a case where .git/annex/reposize/ is writable, but .git/annex/journal/ is not, there will still be a permission denied error. The solution would just be to use consistent permissions I suppose.	2024-08-30 11:58:10 -04:00
Joey Hess	53b7375cc6	update	2024-08-30 11:14:45 -04:00
Joey Hess	54b6151412	document using balanced preferred content in a cluster	2024-08-30 11:08:32 -04:00
Joey Hess	d0938d730b	Merge branch 'master' into balanced	2024-08-30 11:01:39 -04:00
Joey Hess	242c525659	lookupkey: Allow using --ref in a bare repository.	2024-08-30 10:55:48 -04:00
yarikoptic	e2b7895cbc	Added a comment	2024-08-29 18:35:47 +00:00
Joey Hess	d876e06e35	err on the side of larger repository size When a live update is removing a key, it might fail. So only count those once they have succeeded. When a live update is adding a key, count it immediately to avoid over-filling a repo. This also makes the 1 minute delay between stale live changes checks more defensible, because a stale live change can only cause us to err more on the side of caution.	2024-08-28 14:13:12 -04:00
Joey Hess	f89a1b8216	remove stale live changes from reposize database Reorganized the reposize database directory, and split up a column. checkStaleSizeChanges needs to run before needLiveUpdate, otherwise the process won't be holding a lock on its pid file, and another process could go in and expire the live update it records. It just so happens that they do get called in the correct order, since checking balanced preferred content calls getLiveRepoSizes before needLiveUpdate. The 1 minute delay between checks is arbitrary, but will avoid excess work. The downside of it is that, if a process is dropping a file and gets interrupted, for 1 minute another process can expect a repository will soon be smaller than it is. And so a process might send data to a repository when a file is not really going to be dropped from it. But note that can already happen if a drop takes some time in eg locking and then fails. So it seems possible that live updates should only be allowed to increase, rather than decrease the size of a repository.	2024-08-28 13:57:25 -04:00
Joey Hess	278adbb726	combine 2 queries	2024-08-28 11:00:59 -04:00
Joey Hess	e006acef22	avoid reposize database locking overhead when not needed Only when the preferred content expression being matched uses balanced preferred content is this overhead needed. It might be possible to eliminate the locking entirely. Eg, check the live changes before and after the action and re-run if they are not stable. For now, this is good enough, it avoids existing preferred content getting slow. If balanced preferred content turns out to be too slow to check, that could be tried later.	2024-08-28 10:52:34 -04:00
matrss	833150fd25	Added a comment	2024-08-28 14:11:36 +00:00
mih	16f9042046	Added a comment: Needed to retrieve single file metadata from bare repo	2024-08-28 13:58:30 +00:00
matrss	3f62116d64	Added a comment	2024-08-28 08:47:33 +00:00
Joey Hess	09955deebe	fix a deadlock when not using --auto Live update never gets started, but then it still waited for it to finish. This only deadlocked with -J4 or so, not without -J. Unsure why.	2024-08-27 15:47:57 -04:00
Joey Hess	b01a63ef62	avoid nub There will not usually be many live changes, but usually does not mean ever, and O(N^2) is best avoided.	2024-08-27 15:00:10 -04:00
Joey Hess	0a119184e6	thoughts	2024-08-27 14:59:13 -04:00
Joey Hess	8555fb88ef	locking in checkLiveUpdate This makes sure that two threads don't check balanced preferred content at the same time, so each thread always sees a consistent picture of what is happening. This does add a fairly expensive file level lock to every check of preferred content, in commands that use prepareLiveUpdate. It would be good to only do that when live updates are actually needed, eg when the preferred content expression uses balanced preferred content.	2024-08-27 13:12:43 -04:00
Joey Hess	4d2f95853d	closing in on finishing live reposizes Fixed successfullyFinishedLiveSizeChange to not update the rolling total when a redundant change is in RecentChanges. Made setRepoSizes clear RecentChanges that are no longer needed. It might be possible to clear those earlier, this is only a convenient point to do it. The reason it's safe to clear RecentChanges here is that, in order for a live update to call successfullyFinishedLiveSizeChange, a change must be made to a location log. If a RecentChange gets cleared, and just after that a new live update is started, making the same change, the location log has already been changed (since the RecentChange exists), and so when the live update succeeds, it won't call successfullyFinishedLiveSizeChange. The reason it doesn't clear RecentChanges when there is a redundant live update is because I didn't want to think through whether or not all races are avoided in that case. The rolling total in SizeChanges is never cleared. Instead, calcJournalledRepoSizes gets the initial value of it, and then getLiveRepoSizes subtracts that initial value from the current value. Since the rolling total can only be updated by updateRepoSize, which is called with the journal locked, locking the journal in calcJournalledRepoSizes ensures that the database does not change while reading the journal.	2024-08-27 12:54:46 -04:00
Joey Hess	23d44aa4aa	use live reposizes in balanced preferred content	2024-08-27 10:17:43 -04:00
Joey Hess	d7813876a0	fixed the build Manually tested getLiveRepoSizes and it is working correctly.	2024-08-27 09:41:35 -04:00
Joey Hess	521e0a7062	fix a deadlock When finishedLiveUpdate was run on a different key than expected, it blocked forever waiting for an indication the database had been updated. Since the journal is locked when finishedLiveUpdate runs, this could also have caused other git-annex commands to hang.	2024-08-27 00:13:54 -04:00
Spencer	949be665c0	Added contributions section to track my bugs and inquiries	2024-08-26 20:02:03 +00:00
Joey Hess	21608716bd	started work on getLiveRepoSizes Doesn't quite compile	2024-08-26 14:50:09 -04:00
Joey Hess	db89e39df6	partially fix concurrency issue in updating the rollingtotal It's possible for two processes or threads to both be doing the same operation at the same time. Eg, both dropping the same key. If one finishes and updates the rollingtotal, then the other one needs to be prevented from later updating the rollingtotal as well. And they could finish at the same time, or with some time in between. Addressed this by making updateRepoSize be called with the journal locked, and only once it's been determined that there is an actual location change to record in the log. updateRepoSize waits for the database to be updated. When there is a redundant operation, updateRepoSize won't be called, and the redundant LiveUpdate will be removed from the database on garbage collection. But: There will be a window where the redundant LiveUpdate is still visible in the db, and processes can see it, combine it with the rollingtotal, and arrive at the wrong size. This is a small window, but it still ought to be addressed. Unsure if it would always be safe to remove the redundant LiveUpdate? Consider the case where two drops and a get are all running concurrently somehow, and the order they finish is [drop, get, drop]. The second drop seems redundant to the first, but it would not be safe to remove it. While this seems unlikely, it's hard to rule out that a get and drop at different stages can both be running at the same time.	2024-08-26 09:43:32 -04:00
Joey Hess	03c7f99957	todo	2024-08-25 10:48:42 -04:00
Joey Hess	18f8d61f55	rolling total of size changes in RepoSize database When a live size change completes successfully, the same transaction that removes it from the database updates the rolling total for its repository. The idea is that when RepoSizes is read, SizeChanges will be as well, and cached locally. Any time a change is made, the local cache will be updated. So by comparing the local cache with the current SizeChanges, it can learn about size changes that were made by other processes. Then read the LiveSizeChanges, and add that in to get a live picture of the current sizes. Also added a SizeChangeId. This allows 2 different threads, or processes, to both record a live size change for the same repo and key, and update their own information without stepping on one-another's toes.	2024-08-25 10:34:47 -04:00
Joey Hess	9188825a4d	use FileSize It's just an alias, so this doesn't change the db schema, but it makes explicit that it's not stored as an int64	2024-08-25 08:22:40 -04:00
Joey Hess	2b037d36a1	update	2024-08-24 15:06:00 -04:00
Joey Hess	6660984442	update	2024-08-24 13:15:39 -04:00
Joey Hess	d60a33fd13	improve live update starting In an expression like "balanced=foo and exclude=bar", avoid it starting a live update when the overall expression doesn't match.	2024-08-24 13:07:05 -04:00
Joey Hess	16f945459c	todo	2024-08-24 11:58:17 -04:00
Joey Hess	2f20b939b7	LiveUpdate db updates working I've tested the behavior of the thread that waits for the LiveUpdate to be finished, and it does get signaled and exit cleanly when the LiveUpdate is GCed instead. Made finishedLiveUpdate wait for the thread to finish updating the database. There is a case where GC doesn't happen in time and the database is left with a live update recorded in it. This should not be a problem as such stale data can also happen when interrupted and will need to be detected when loading the database. Balanced preferred content expressions now call startLiveUpdate.	2024-08-24 11:49:58 -04:00
Joey Hess	84d1bb746b	LiveUpdate for clusters	2024-08-24 10:20:12 -04:00
Joey Hess	18cd8bf43a	punt on LiveUpdate plumbing through assistant for now	2024-08-24 09:37:24 -04:00
Joey Hess	1d51f18dd0	remove FIXME Using NoLiveUpdate here is appropriate, because this is running the server side of the P2P protocol. There, no preferred content checking is done.	2024-08-24 09:34:22 -04:00
Joey Hess	3f8675f339	more LiveUpdate plumbing	2024-08-24 09:28:41 -04:00
Joey Hess	eb841ab004	plumb in LiveUpdate to copy/get/move/mirror copy and get do check preferred content, so need to prepareLiveUpdate. move and mirror do not, but copy is implemented using move, so move also needed to have a LiveUpdate plumbed through.	2024-08-24 09:20:58 -04:00
Joey Hess	418fbf3f2f	NoLiveExport for export and import While these do check preferred content, it would not make sense to use balanced preferred content with them.	2024-08-24 09:19:12 -04:00
yarikoptic	efdee386c0	initial report on desire to handle pathspecs	2024-08-24 01:35:31 +00:00
yarikoptic	c3877f648c	initial idea on another ability for get	2024-08-24 01:23:04 +00:00
Joey Hess	c3d40b9ec3	plumb in LiveUpdate (WIP) Each command that first checks preferred content (and/or required content) and then does something that can change the sizes of repositories needs to call prepareLiveUpdate, and plumb it through the preferred content check and the location log update. So far, only Command.Drop is done. Many other commands that don't need to do this have been updated to keep working. There may be some calls to NoLiveUpdate in places where that should be done. All will need to be double checked. Not currently in a compilable state.	2024-08-23 16:35:12 -04:00
Joey Hess	4885073377	add live size changes to RepoSize database Not yet used.	2024-08-23 12:51:00 -04:00
Joey Hess	dad1fb150f	update	2024-08-23 11:45:36 -04:00
Joey Hess	d0ab1550ec	possible design to address reposizes concurrency issues	2024-08-23 11:19:38 -04:00
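
Several of the commit messages above (d876e06e35, 4d2f95853d, 18f8d61f55) describe how a live repository size is derived: start from the size calculated from the journal, add the difference between the current rolling total and the rolling total seen when that size was calculated, then add in-progress live changes, counting additions immediately and removals only once they have succeeded. The following is a minimal Python sketch of that arithmetic only. It is not git-annex's Haskell code, and every name in it (LiveChange, live_repo_size, and the parameters) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LiveChange:
    """A hypothetical in-progress size change for one repository."""
    key_size: int  # size in bytes of the key being added or removed
    adding: bool   # True when the key is being added, False when removed


def live_repo_size(journalled_size: int,
                   initial_rolling_total: int,
                   current_rolling_total: int,
                   live_changes: Iterable[LiveChange]) -> int:
    """Estimate a repository's live size, erring on the larger side."""
    # The rolling total is never reset, so only its growth since the
    # journalled size was calculated reflects completed changes.
    completed = current_rolling_total - initial_rolling_total
    # In-progress additions count immediately. In-progress removals are
    # ignored here, since they only count once they have succeeded (at
    # which point they show up in the rolling total instead).
    pending = sum(c.key_size for c in live_changes if c.adding)
    return journalled_size + completed + pending
```

For example, with a journalled size of 1000, a rolling total that moved from 50 to 80, one pending addition of 200 bytes and one pending removal of 300 bytes, this sketch gives 1000 + 30 + 200 = 1230. The pending removal is left out, so a stale or failed drop can only make the estimate more cautious.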

45518 commits