optimise journal writes to not mkdir journal directory when it already exists
Sponsored-by: Dartmouth College's DANDI project
parent 5e407304a2
commit ad467791c1
4 changed files with 19 additions and 4 deletions
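
The idea in the commit message (skip the mkdir once the journal directory is known to exist) can be sketched as follows. This is a minimal illustration only, not the actual git-annex change, which is written in Haskell. The `Journal` class, its `write` method, the `mkdir_calls` counter and the `demo` helper are all hypothetical names introduced here.

```python
import os
import tempfile


class Journal:
    """Hypothetical journal writer that creates its directory at most once."""

    def __init__(self, directory):
        self.directory = directory
        self._dir_ready = False  # set to True after the first mkdir
        self.mkdir_calls = 0     # kept only to make the saving visible

    def _ensure_dir(self):
        # Only touch the filesystem the first time; later writes skip the mkdir.
        if not self._dir_ready:
            os.makedirs(self.directory, exist_ok=True)
            self.mkdir_calls += 1
            self._dir_ready = True

    def write(self, name, content):
        self._ensure_dir()
        tmp = os.path.join(self.directory, name + ".tmp")
        final = os.path.join(self.directory, name)
        # Write to a temporary file, then rename it into place.
        with open(tmp, "w") as f:
            f.write(content)
        os.replace(tmp, final)
        return final


def demo():
    """Write 1000 journal entries spread over 10 file names."""
    with tempfile.TemporaryDirectory() as top:
        journal = Journal(os.path.join(top, "journal"))
        for i in range(1000):
            journal.write("key%d.web" % (i % 10), "url %d\n" % i)
        return journal.mkdir_calls, len(os.listdir(journal.directory))
```

In this sketch `demo()` performs a thousand writes but only one mkdir. A real implementation would also have to cope with the directory being removed by another process between writes, which this sketch ignores.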
@@ -27,3 +27,5 @@ May be changes to those .web files in journal could be done "in place" by append
may be there is a way to "stagger" those --batch additions somehow so all thousands of URLs are added in a single "run" thus having a single "copy/move" and locking/stat'ing syscalls?

PS More information could be found at [dandisets/issues/225](https://github.com/dandi/dandisets/issues/225 )

[[!tag projects/dandi]]

@@ -9,9 +9,10 @@ randomly distributed?

It sounds like it's more randomly distributed, if you're walking a tree and
adding each file you encounter, and some of them have the same content so
the same url and key.
the same key.

If it was not randomly distributed, a nice optimisation would be for
But your strace shows repeated writes for the same key, so maybe they bunch
up? If it was not randomly distributed, a nice optimisation would be for
registerurl to buffer urls as long as the key is the same, and then do a
single write for that key of all the urls. But it can't really buffer like
that if it's randomly distributed; the buffer could use a large amount of

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2022-07-14T16:16:35Z"
content="""
I've optimised away the repeated mkdir of the journal.

Probably not a big win in this particular edge case, but a nice general
win..
"""]]