comment
This commit is contained in:
parent
40017089f2
commit
594110a6af
1 changed files with 28 additions and 0 deletions
|
@ -0,0 +1,28 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 5"""
|
||||||
|
date="2023-06-01T17:47:31Z"
|
||||||
|
content="""
|
||||||
|
As far as how reads from the cidsdb scale, I think that sqlite databases
|
||||||
|
like this do slow down somewhat when tables get massive, but I don't really
|
||||||
|
know the details. And it's of course possible that something could be
|
||||||
|
improved in the schema or queries.
|
||||||
|
|
||||||
|
I've been working on that todo I linked above, and the speed gain is
|
||||||
|
impressive when there are few or no changed files in the remote. With 20,000
|
||||||
|
unchanged files, re-running git-annex import[1] sped up from 125.95 to 3.84
|
||||||
|
seconds. With 40,000 unchanged files, it sped up from 477 to 8.13 seconds. I
|
||||||
|
haven't tried with 150000 files yet but the pattern is clear.
|
||||||
|
|
||||||
|
> I can rerun the sync with an unchanged import directory. It still takes
|
||||||
|
> 107 minutes, the majority of which is spent reading cidsdb. Only the
|
||||||
|
> first minute or two are spent scanning the source area.
|
||||||
|
|
||||||
|
Well, I think I've certianly solved that problem. But I don't know if there's
|
||||||
|
something else that is making the initial sync slower than it needs to
|
||||||
|
be.
|
||||||
|
|
||||||
|
[1] More accurately, re-running it a second time, both to get a warm cache
|
||||||
|
result, and because the first time, it is busy updating the cidsdb with
|
||||||
|
the files that were imported earlier, as described in comment #2.
|
||||||
|
"""]]
|
Loading…
Add table
Add a link
Reference in a new issue