promote comment to bug

Joey Hess 2021-10-05 11:55:33 -04:00
parent e217d88cd6
commit a8ceb2b64e
3 changed files with 40 additions and 0 deletions

@ -0,0 +1,5 @@
When I tried running `git annex sync borg` on a large (~6T) borg repo with many archives, git-annex spun until it used 52G of
memory, then got OOM-killed.
I don't know if this is a memory leak or just trying to load too much, but it seems like this is a thing you should be able to do on
a machine with 64G of RAM.

@ -0,0 +1,25 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2021-10-05T15:12:33Z"
content="""
I'd expect the amount of memory git-annex uses to increase with the number
of archives in the borg repo that contain a git-annex repository. So I am
curious how many such archives there are in your borg repo.
The memory use also scales with the number of annex object files in the git-annex
repository. So I'm curious how many such files there are in one of the
borg archives.
If there are, say, 1000 archives of a git-annex repository that
contains 1000 annex objects, that's a million items. I'd estimate a couple
hundred megabytes of memory for that. The length of the path to the
git-annex repository and the archive name are included in each item, so it
takes more when those are long.
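
A rough sketch of that arithmetic, for anyone who wants to plug in their own
numbers. This is not git-annex code: `estimate_bytes` is a hypothetical helper,
and the 200 bytes per item is only an assumption back-derived from the
estimate above, not a measurement.

```python
# Back-of-envelope estimate only; not part of git-annex.
# ASSUMPTION: roughly 200 bytes per item, picked so that a million items
# comes to a couple hundred megabytes. Long repository paths or archive
# names would push the per-item cost higher.
def estimate_bytes(archives, objects_per_archive, bytes_per_item=200):
    # One item is held for each (archive, annex object) pair.
    items = archives * objects_per_archive
    return items, items * bytes_per_item

items, total = estimate_bytes(1000, 1000)
print(items, "items, roughly", total // 10**6, "MB")
```
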
If it included only one item for each git-annex key, that would avoid
needing so much memory. But I don't think it can, because an archive can
be deleted, and if the one item it included was in the deleted archive,
it would not be able to retrieve the object from other archives that still
exist, without a rescan.
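
A minimal sketch of that trade-off, assuming a toy in-memory index. The
names and layout here are hypothetical and do not reflect how git-annex
actually stores this information.

```python
# Illustrative only; hypothetical data layout, not git-annex internals.
# Full index: every archive that holds each key is remembered.
full_index = {"KEY1": {"archive-a", "archive-b"}}
# Compact index: only one archive per key is remembered.
compact_index = {"KEY1": "archive-a"}

def forget_archive(full, compact, archive):
    # Drop a deleted archive from both indexes.
    for archives in full.values():
        archives.discard(archive)
    for key in [k for k, a in compact.items() if a == archive]:
        del compact[key]

forget_archive(full_index, compact_index, "archive-a")

# The full index still knows KEY1 can be retrieved from archive-b.
still_known = "archive-b" in full_index["KEY1"]
# The compact index has lost KEY1 until the borg repo is rescanned.
lost = "KEY1" not in compact_index
```

The compact form saves memory, but only the full form survives an archive
being deleted without a rescan.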
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2021-10-05T15:12:00Z"
content="""
I have opened a bug for that issue,
[[bugs/borg_special_remote_memory_usage_high_for_large_borg_repo]].
It would be good if you could follow up there with details.
"""]]