finish bloom filters

Add tuning, docs, etc.

Not sure if status is the right place to report size.. perhaps unused
should report the size and also warn if it sees more keys than the bloom
filter allows?
Joey Hess 2012-03-12 16:18:14 -04:00
parent faf3a94fa7
commit 25809ce2e0
7 changed files with 64 additions and 8 deletions

@ -598,6 +598,23 @@ Here are all the supported configuration settings.
of memory and are working with very large numbers of files, increasing
the queue size can speed it up.
* `annex.bloomcapacity`
The `git annex unused` command uses a bloom filter to determine
what data is no longer used. The default bloom filter is sized to handle
up to 500000 keys. If your repository is larger than that,
you can adjust this to avoid `git annex unused` not noticing some unused
data files. Increasing this will make `git annex unused` consume more memory;
run `git annex status` for memory usage numbers.
* `annex.bloomaccuracy`
Adjusts the accuracy of the bloom filter used by
`git annex unused`. The default accuracy is 1000 --
1 unused file out of 1000 will be missed by `git annex unused`. Increasing
the accuracy will make `git annex unused` consume more memory;
run `git annex status` for memory usage numbers.
* `annex.version`
Automatically maintained, and used to automate upgrades between versions.
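
To get a feel for how `annex.bloomcapacity` and `annex.bloomaccuracy` trade off against memory, here is a rough sketch using the textbook bloom filter sizing formula. It is only an estimate: the function name `bloom_size` is made up for illustration, and the bloomfilter library may round or size its filter differently, so `git annex status` remains the place to get real numbers.

```python
import math

def bloom_size(capacity, accuracy):
    """Estimate the bit count and hash count of a bloom filter.

    capacity: number of keys the filter should hold (as in annex.bloomcapacity)
    accuracy: N, meaning a false positive rate of 1 in N (as in annex.bloomaccuracy)

    Uses the standard formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    This is an assumption about sizing, not a description of git-annex internals.
    """
    p = 1.0 / accuracy
    bits = math.ceil(-capacity * math.log(p) / (math.log(2) ** 2))
    hashes = max(1, round(bits / capacity * math.log(2)))
    return bits, hashes

if __name__ == "__main__":
    # The defaults documented above: 500000 keys, accuracy 1000.
    bits, hashes = bloom_size(500000, 1000)
    print("bits:", bits, "hashes:", hashes, "MiB:", bits / 8 / 2**20)
```

Under this formula, doubling the capacity doubles the estimated size, while raising the accuracy only grows it logarithmically.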

@ -35,6 +35,7 @@ To build and use git-annex, you will need:
* [hS3](http://hackage.haskell.org/package/hS3)
* [json](http://hackage.haskell.org/package/json)
* [IfElse](http://hackage.haskell.org/package/IfElse)
* [bloomfilter](http://hackage.haskell.org/package/bloomfilter)
* Shell commands
* [git](http://git-scm.com/)
* [uuid](http://www.ossp.org/pkg/lib/uuid/)