Joey Hess 2022-07-18 14:45:03 -04:00
parent 1c40b927aa
commit efee53f433
2 changed files with 44 additions and 0 deletions


@@ -0,0 +1,33 @@
[[!comment format=mdwn
username="joey"
subject="""comment 13"""
date="2022-07-18T18:01:09Z"
content="""
The `append` branch has basic appending implemented, but it's not yet
done atomically.

For benchmarking, I'm using this command.

    perl -e 'for (1..'$ITER') { print "WORM--foo http://example.com/$_\n" }' | /usr/bin/time git-annex registerurl --batch

ITER=2000

Old: 52s
Appending: 28s
Appending without reading old value: 2s

ITER=4000

Old: 190s
Appending: 111s
Appending without reading old value: 5s

So appending is an improvement of roughly 45%. But it remains nonlinear
even when appending, because it needs to read the existing log file each
time to determine if it can append, or if it needs to compact it. (Disk
cache didn't work as well as I had hoped.)

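
To put rough numbers on "nonlinear", here is a small sketch that estimates a growth exponent from the timings above. The function names are made up for illustration, and with only two coarse data points per mode the result is a hint, not a measurement.

```python
import math

# Timings in seconds, copied from the benchmark above.
timings = {
    "old": {2000: 52, 4000: 190},
    "appending": {2000: 28, 4000: 111},
    "appending without reading old value": {2000: 2, 4000: 5},
}

def scaling_exponent(t_small, t_large, n_small=2000, n_large=4000):
    # If time grows like n**k, then k = log(t_large/t_small) / log(n_large/n_small).
    return math.log(t_large / t_small) / math.log(n_large / n_small)

def improvement(old, new):
    # Fraction of the old run time that was saved.
    return (old - new) / old

for mode, t in timings.items():
    k = scaling_exponent(t[2000], t[4000])
    print(f"{mode}: time grows roughly like n^{k:.2f}")

for n in (2000, 4000):
    saved = improvement(timings["old"][n], timings["appending"][n])
    print(f"ITER={n}: appending saves {saved:.0%} of the old run time")
```

The exponents come out near 1.9 for the old code and near 2.0 for appending, which is what one would expect if each write rereads a log that keeps growing. Skipping the read gives about 1.3, although 2s and 5s are too short to read much into.
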
What this suggests to me is that it would be good to also add a mode that
blindly appends without compacting. Or, possibly, to blindly append,
but then compact the journalled file before committing it to the git-annex
branch.
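
A minimal sketch of the second idea (append blindly, compact once before committing), using a toy in-memory model. Everything here is hypothetical: the function names, the `(timestamp, url, present)` line shape, and the "newest line per url wins" compaction rule are assumptions for illustration, not git-annex's actual log format or code.

```python
# Hypothetical in-memory model of one journalled log file; none of these
# names come from git-annex itself.

def append_blindly(journal, timestamp, url, present):
    # O(1): add a line without reading or compacting what is already there.
    journal.append((timestamp, url, present))

def compact(journal):
    # Keep only the newest line for each url (assumed compaction rule).
    newest = {}
    for timestamp, url, present in journal:
        if url not in newest or timestamp >= newest[url][0]:
            newest[url] = (timestamp, url, present)
    return sorted(newest.values())

def commit(journal):
    # Compact once, just before the file would be committed to the branch.
    return compact(journal)

journal = []
for i in range(1, 5):
    append_blindly(journal, i, f"http://example.com/{i}", True)
# A later line for an url already in the log supersedes the earlier one.
append_blindly(journal, 5, "http://example.com/2", False)

committed = commit(journal)
print(len(journal), "lines journalled,", len(committed), "lines committed")
```

The point of the split is that the per-write cost no longer depends on the size of the log. The one compaction pass is linear in the journalled file and is paid once per commit instead of once per write.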
"""]]


@@ -0,0 +1,11 @@
[[!comment format=mdwn
username="joey"
subject="""re: comment 12"""
date="2022-07-18T18:40:20Z"
content="""
@yarikoptic, the new git-annex would resolve the inconsistency the next
time it ran. Only when annex.alwayscommit=false would there be any time
window where the old git-annex missed something written by a git-annex
process that ran before the one that got interrupted. This does not seem
like a large problem.
"""]]