git-annex

Author	SHA1	Message	Date
Joey Hess	8886ff1cff	done!	2021-08-04 12:40:25 -04:00
Joey Hess	9b9b5759b0	Merge branch 'vectorclock'	2021-08-04 12:39:54 -04:00
Joey Hess	1acdd18ea8	deal better with clock skew situations, using vector clocks * Deal with clock skew, both forwards and backwards, when logging information to the git-annex branch. * GIT_ANNEX_VECTOR_CLOCK can now be set to a fixed value (eg 1) rather than needing to be advanced each time a new change is made. * Misuse of GIT_ANNEX_VECTOR_CLOCK will no longer confuse git-annex. When changing a file in the git-annex branch, the vector clock to use is now determined by first looking at the current time (or GIT_ANNEX_VECTOR_CLOCK when set), and comparing it to the newest vector clock already in use in that file. If a newer time stamp was already in use, advance it forward by a second instead. When the clock is set to a time in the past, this avoids logging with an old timestamp, which would risk that log line later being ignored in favor of a "newer" line that is really not newer. When a log entry has been made with a clock that was set far ahead in the future, this avoids newer information being logged with an older timestamp and so being ignored in favor of that future-timestamped information. Once all clocks get fixed, this will result in the vector clocks being incremented, until finally enough time has passed that time gets back ahead of the vector clock value, and then it will return to usual operation. (This latter situation is not ideal, but it seems the best that can be done. The issue with it is, since all writers will be incrementing the last vector clock they saw, there's no way to tell when one writer made a write significantly later in time than another, so the earlier write might arbitrarily be picked when merging. This problem is why git-annex uses timestamps in the first place, rather than pure vector clocks.) Advancing forward by 1 second is somewhat arbitrary. setDead advances a timestamp by just 1 picosecond, and the vector clock could too. But then it would interfere with setDead, which wants to be overruled by any change. So it could use 2 picoseconds or something, but that seems weird. It could just as well advance it forward by a minute or whatever, but then it would be harder for real time to catch up with the vector clock when forward clock skew had happened. A complication is that many log files contain several different pieces of information, and it may be best to only use vector clocks for the same piece of information. For example, a key's location log file contains InfoPresent/InfoMissing for each UUID, and it only looks at the vector clocks for the UUID that is being changed, and not other UUIDs. Although exactly where the dividing line is can be hard to determine. Consider metadata logs, where a field "tag" can have multiple values set at different times. Should it advance forward past the last tag? Probably. What about when a different field is set, should it look at the clocks of other fields? Perhaps not, but currently it does, and this does not seem like it will cause any problems. Another one I'm not entirely sure about is the export log, which is keyed by (fromuuid, touuid). So if multiple repos are exporting to the same remote, different vector clocks can be used for that remote. It looks like that's probably ok, because it does not try to determine what order things occurred when there was an export conflict. Sponsored-by: Jochen Bartl on Patreon	2021-08-04 12:33:46 -04:00
Ilya_Shlyakhter	af0fcf81a1	Added a comment: downloading torrent files to annex	2021-08-04 15:47:12 +00:00
Joey Hess	1a48d51e5b	devblog	2021-08-03 17:14:06 -04:00
Joey Hess	5c23489859	Merge branch 'master' of ssh://git-annex.branchable.com	2021-08-03 17:06:27 -04:00
Joey Hess	c67b1e31a6	branch	2021-08-03 17:05:50 -04:00
spwhitton	b28fb4305b	Added a comment	2021-08-03 19:10:10 +00:00
Joey Hess	629e95fd8e	update	2021-08-03 14:03:25 -04:00
Joey Hess	bb56186daa	new todo.. I seem to have cracked a longstanding problem Sponsored-by: Jochen Bartl on Patreon	2021-08-03 13:51:23 -04:00
jwrauch	aba263450a	Added a comment	2021-08-03 16:36:21 +00:00
Joey Hess	899983058f	add: When adding a dotfile, avoid treating its name as an extension.	2021-08-03 12:22:58 -04:00
Joey Hess	2572950ca7	add news item for git-annex 8.20210803	2021-08-03 12:21:10 -04:00
Joey Hess	9cae7c5bbf	releasing package git-annex version 8.20210803	2021-08-03 12:20:45 -04:00
Joey Hess	9b85d95333	whitespace	2021-08-03 12:18:10 -04:00
Ilya_Shlyakhter	6e67ea5d7c	Added a comment	2021-08-03 15:14:53 +00:00
Ilya_Shlyakhter	4ce24173e8	Added a comment: don't give up ;)	2021-08-03 15:06:35 +00:00
Lukey	2bd3be5430	Added a comment	2021-08-03 07:54:13 +00:00
Joey Hess	7334893d42	Merge branch 'master' of ssh://git-annex.branchable.com	2021-08-02 14:11:36 -04:00
Joey Hess	6111958440	fix test suite `14683da9eb` caused a test suite failure. When the content of a key is not present, a LinkAnnexFailed is returned, but replaceFile then tried to move the file into place, and since it was not written, that crashed. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2021-08-02 13:59:23 -04:00
Joey Hess	86bd9ac186	fix missing new lines in processTranscript	2021-08-02 13:42:27 -04:00
jwrauch	e732bc914f		2021-08-02 17:03:23 +00:00
yarikoptic	20ffa01087	initial observation of .dot filename to consider having .dot extension	2021-08-02 16:56:19 +00:00
jgsuess@732b8c62c50d8595d7b1d58eea11e5019c2308b1	70fe0862b4	Added a comment: Also seeing this behaviour	2021-08-02 06:14:41 +00:00
mattplasmastrike@59cb7d099c8665f6c7668d78d9f17db87cabc2fe	2ad890f3df	Added a comment	2021-07-31 23:59:12 +00:00
Joey Hess	b3c4579c79	work around strange auto-init bug git-annex get when run as the first git-annex command in a new repo did not populate unlocked files. (Reversion in version 8.20210621) I am not entirely happy with this, because I don't understand how `428c91606b` caused the problem in the first place, and I don't fully understand how skipping calling scanAnnexedFiles during autoinit avoids the problem. Kept the explicit call to scanAnnexedFiles during git-annex init, so that when reconcileStaged is expensive, it can be made to run then, rather than at some later point when the information is needed. Sponsored-by: Brock Spratlen on Patreon	2021-07-30 18:36:03 -04:00
Joey Hess	9f94d2894e	remove unused code	2021-07-30 18:01:36 -04:00
Joey Hess	748addbe05	remove second pass in scanAnnexedFiles The pass was needed to populate files when annex.thin was set, but in commit `73e0cbbb19`, reconcileStaged started to do that. So, this second pass is not needed any longer.	2021-07-30 17:46:11 -04:00
Joey Hess	c912e7c4fd	bug	2021-07-30 17:13:15 -04:00
Joey Hess	461035c6ec	close I'm now reasonably sure I've identified both cases where this can happen. v8 upgrades and certain filesystems eg NFS. Both are handled as well as can be, though it may involve some extra checksumming work.	2021-07-30 15:22:22 -04:00
Joey Hess	66089e97de	Fix a rounding bug in display of data sizes Eg, showImprecise 1 1.99 returned "1.1" rather than "2". The 9 rounded upward to 10, and that was wrongly used as the decimal, rather than carrying the 1. Sponsored-by: Jack Hill on Patreon	2021-07-30 09:56:04 -04:00
uli@8484a70fbfd489faef5f72c230d340b01e2676ca	1c4d6dee90	bugreport: git-annex info . formats 2 TB as 1.1 TB	2021-07-30 07:24:53 +00:00
Joey Hess	d2aead67bd	fsck: Detect and correct stale or missing inode caches for object files An easy way to see this in action is to have an unlocked file, and touch the object file. While all code that compares inode caches for object files needs to be prepared for this kind of problem and fall back to verification, having fsck notice it and correct it is cheap (as long as fsck is being run anyway) and ensures that if it happens for some unusual reason, there's a way for the user to notice that it's happening. Note that, when annex.thin is in use, the earlier call to isUnmodified (and also potentially earlier calls to inAnnex in eg, verifyLocationLog) will fix up the same problem silently. That might prevent the warning being displayed, although probably it still will be, because the Database.Keys write of the InodeCache will be queued but will not have happened yet. I can't see a way to improve this, but it's not great. Sponsored-by: Dartmouth College's Datalad project	2021-07-29 14:06:42 -04:00
Joey Hess	817ccbbc47	split verifyKeyContent This avoids it calling enteringStage VerifyStage when it's used in places that only fall back to verification rarely, and which might be called while in TransferStage and be going to perform a transfer after the verification.	2021-07-29 13:58:40 -04:00
Joey Hess	d4fc506f27	comment	2021-07-29 13:33:11 -04:00
Joey Hess	3c5280b1cf	improve comment wording	2021-07-29 13:21:23 -04:00
Joey Hess	897fd5c104	add note	2021-07-29 13:14:03 -04:00
Joey Hess	72a13d2a5f	remove unused parameter	2021-07-29 13:12:11 -04:00
Joey Hess	3af0e0b4de	Merge branch 'master' of ssh://git-annex.branchable.com	2021-07-29 12:30:39 -04:00
Joey Hess	067a9c70c7	simplify code	2021-07-29 12:28:13 -04:00
Joey Hess	3e0b210039	remove unnecessary debugs Keeping the ones in Annex.InodeSentinal	2021-07-29 12:19:37 -04:00
mih	1e5c60132a	Added a comment: Cause cannot (only) be a v7->v8 upgrade	2021-07-29 05:40:36 +00:00
Joey Hess	a306560374	use SQL.addInodeCaches This avoids deadlock when opening the database handle calls reconcileStaged.	2021-07-27 17:34:56 -04:00
Joey Hess	73e0cbbb19	fix problem populating pointer files This is a result of an audit of every use of getInodeCaches, to find places that misbehave when the annex object is not in the inode cache, despite pointer files for the same key being in the inode cache. Unfortunately, that is the case for objects that were in v7 repos that upgraded to v8. Added a note about this gotcha to getInodeCaches. Database.Keys.reconcileStaged, when annex.thin is set, would fail to populate pointer files in this situation. Changed it to check if the annex object is unmodified the same way inAnnex does, falling back to a checksum if the inode cache is not recorded. Sponsored-by: Dartmouth College's Datalad project	2021-07-27 14:26:49 -04:00
Joey Hess	de482c7eeb	move verifyKeyContent to Annex.Verify The goal is that Database.Keys be able to use it; it can't use Annex.Content.Presence due to an import loop. Several other things also needed to be moved to Annex.Verify as a consequence.	2021-07-27 14:07:23 -04:00
Joey Hess	0ec5919bbe	comment	2021-07-27 13:45:33 -04:00
Joey Hess	ac59507809	Merge branch 'master' of ssh://git-annex.branchable.com	2021-07-27 13:16:22 -04:00
Joey Hess	14683da9eb	fix potential race in updating inode cache Some uses of linkFromAnnex are inside replaceWorkTreeFile, which was already safe, but others use it directly on the work tree file, which was race-prone. Eg, if the work tree file was first removed, then linkFromAnnex called to populate it, the user could have re-written it in the interim. This came to light during an audit of all calls of addInodeCaches, looking for such races. All the other uses of it seem ok. Sponsored-by: Brett Eisenberg on Patreon	2021-07-27 13:08:08 -04:00
Joey Hess	e4b2a067e0	fix potential race in updating inode cache In Annex.Content, the object file was statted after pointer files were populated. But if annex.thin is set, once the pointer files are populated, the object file can potentially be modified via the hard link. So, it was possible, though seemingly very unlikely, for the inode of the modified object file to be cached. Command.Fix and Command.Fsck had similar problems, statting the work tree files after they were in place. Changed them to stat the temp file that gets moved into place. This does rely on .git/annex being on the same filesystem. If it's not, the cached inode will not be the same as the one that the temp file gets moved to. Result will be that git-annex will later need to do an expensive verification of the content of the worktree files. Note that the cross-filesystem move of the temp file already is a larger amount of extra work, so this seems acceptable. Sponsored-by: Luke Shumaker on Patreon	2021-07-27 12:29:10 -04:00
mih	c3e74710f9	Added a comment: Fixed in 8.20210715-g3b5a3e168	2021-07-27 12:00:35 +00:00
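
The timestamp selection rule described in commit 1acdd18ea8 can be sketched roughly as follows. This is an illustrative Python sketch, not git-annex's actual Haskell code: the function name choose_vector_clock, its parameters, and the treatment of an equal timestamp as "already in use" are assumptions made for the example.

```python
from typing import Iterable, Optional

# The commit message says the clock is advanced "by a second".
STEP_SECONDS = 1.0


def choose_vector_clock(
    now: float,
    existing: Iterable[float],
    override: Optional[float] = None,
) -> float:
    """Pick the timestamp to log with (hypothetical helper).

    now      -- current wall-clock time, in seconds
    existing -- timestamps already present for the same piece of
                information in the log file
    override -- stands in for GIT_ANNEX_VECTOR_CLOCK when it is set
    """
    candidate = override if override is not None else now
    newest = max(existing, default=None)
    # Assumption: a timestamp equal to the candidate also counts as
    # already in use, so that a fixed override still moves forward.
    if newest is not None and newest >= candidate:
        return newest + STEP_SECONDS
    return candidate
```

With a sane clock, choose_vector_clock(100.0, [50.0, 90.0]) simply returns the current time. If an earlier entry was logged far in the future, as in choose_vector_clock(100.0, [500.0]), the result is one second past that entry, so the new information is not ignored in favor of the future-timestamped line.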

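The rounding fix in commit 66089e97de comes down to carrying into the whole-number part when the decimal digits round up. A minimal Python sketch of that idea, assuming non-negative inputs; show_imprecise here is a hypothetical stand-in for the Haskell showImprecise, and its zero-padded output format is an assumption:

```python
import math


def show_imprecise(places: int, value: float) -> str:
    """Format value with at most `places` decimal digits (sketch only)."""
    scale = 10 ** places
    # Round once, on the scaled integer, so a decimal part that rounds
    # up to `scale` carries into the whole part instead of being
    # printed as a decimal digit.
    total = math.floor(value * scale + 0.5)
    whole, frac = divmod(total, scale)
    if frac == 0:
        return str(whole)
    return f"{whole}.{frac:0{places}d}"
```

For the case from the commit message, show_imprecise(1, 1.99) scales to 19.9, rounds to 20, and splits into whole part 2 and decimal part 0, giving "2" rather than "1.1".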

40998 commits