For the record, `git annex add` has had a series of memory leaks. Mostly these are minor -- until you need to check in a few million files in a single operation. If this happens to you, git-annex will run out of memory and stop. (Generally well before your system runs out of memory, since it has some built-in ulimits.) You can recover by just re-running the `git annex add` -- it will automatically pick up where it left off. A history of the leaks: * Originally, `git annex add` remembered all the files it had added, and fed them to git at the end. Of course that made its memory use grow, so it was fixed to periodically flush its buffer. Fixed in version 0.20110417. * Something called a "lazy state monad" caused "thunks" to build up and memory to leak. Also affected other git annex commands than `add`. Adding files using a SHA* backend hit the worst. Fixed in versions afer 3.20120123. * Committing journal files turned out to have another memory leak. After adding a lot of files ran out of memory, this left the journal behind and could affect other git-annex commands. Fixed in versions afer 3.20120123. * Something is still causing a slow leak when adding files. I tested by adding many copies of the whole linux kernel tree into the annex using the WORM backend, and once it had added 1 million files, git-annex used ~100 mb of ram. That's 100 bytes leaked per file on average .. roughly the size of a filename? It's worth noting that `git add` uses more memory than that in such a large tree. **not fixed yet** * (Note that `git ls-files --others`, which is used to find files to add, also uses surpsisingly large amounts of memory when you have a lot of files. It buffers the entire list, so it can compare it with the files in the index, before outputting anything. This is Not Our Problem, but I'm sure the git developers would appreciate a patch that fixes it.)