Joey Hess 2022-07-14 13:51:59 -04:00
parent 06981c6c5a
commit 557542d621
GPG key ID: DB12DB0FF05F8F38
2 changed files with 41 additions and 0 deletions

@ -0,0 +1,27 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2022-07-14T17:07:59Z"
content="""
Writing to the journal is currently atomic. And git-annex does take
advantage of that atomicity, by not locking the journal when it's reading
from it in some cases, the most used of which is Annex.Branch.get.

But, it's not guaranteed that an append is atomic. A short enough append
may be, but how short may vary, and it's not well defined. Here is an
example of a short append that gets interrupted in the middle by a
kill signal: <https://bugzilla.kernel.org/show_bug.cgi?id=55651>

> Unfortunately as far as I can determine, in the POSIX and Linux
> standards, there is no way to work around this new behavior.
> There's no way to ensure that some amount of data no matter now small,
> even just two bytes, are written out to a file as an atomic transfer
> (either aborted and no bytes written or is completely written out.)

So appending would need more locking of the journal, which would add
some overhead to everything, and would especially hurt concurrency.

Also, the journal is currently crash-safe. Even if there's a sudden
power loss, the write either completed or didn't happen. Appending
would lose that nice property.
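
To illustrate the difference, here is a minimal Python sketch of the common
write-to-a-temp-file-then-rename pattern. It is not git-annex's actual code
(which is Haskell), and the names `write_atomically` and `read_without_lock`
are made up for this example. It assumes the temp file is created in the same
directory as the target, so that the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(path, data):
    """Replace the file at path with data (bytes) in a single rename step.

    A concurrent reader sees either the old content or the new content,
    never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the data to disk before the rename makes it visible.
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def read_without_lock(path):
    """Read the whole file without taking any lock."""
    with open(path, "rb") as f:
        return f.read()
```

An append has no equivalent of the rename step: the bytes go straight into
the file that readers are looking at, so a reader would need a lock to be
sure it is not seeing half of an interrupted write.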
"""]]

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="joey"
subject="""comment 6"""
date="2022-07-14T17:45:31Z"
content="""
@yarikoptic ok, please check and, if you can do that,
I'll implement the buffering of urls for a key.

It looks like appending is not feasible.

The only other approach I can think of would be to have a switch that makes
git-annex buffer branch writes in memory, rather than using the journal,
and commit at the end, or when the buffer gets too large.
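
A rough Python sketch of that buffering idea, only to show its shape. Nothing
here is git-annex code: the class `BufferedBranchWrites`, the `commit`
callback and the `max_bytes` limit are all hypothetical names for this
example.

```python
class BufferedBranchWrites:
    """Hold branch writes in memory and commit them in batches."""

    def __init__(self, commit, max_bytes=1024 * 1024):
        # commit is a hypothetical callback taking a dict of {path: content}.
        self._commit = commit
        self._max_bytes = max_bytes
        self._pending = {}
        self._size = 0

    def set(self, path, content):
        """Buffer a write; a later write to the same path replaces it."""
        old = self._pending.get(path)
        if old is not None:
            self._size -= len(old)
        self._pending[path] = content
        self._size += len(content)
        # Commit early when the buffer has grown too large.
        if self._size > self._max_bytes:
            self.flush()

    def get(self, path):
        """Return the buffered content for path, or None if not buffered."""
        return self._pending.get(path)

    def flush(self):
        """Commit everything buffered so far; call this at the end."""
        if self._pending:
            self._commit(dict(self._pending))
            self._pending.clear()
            self._size = 0
```

Anything still in the buffer is lost if the process dies before `flush`,
so this trades away the crash-safety that the journal has.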
"""]]