git-annex/doc/design/assistant/blog/day_7__bugfixes.mdwn
Joey Hess
2013-11-27 18:41:44 -04:00


Kickstarter is over. Yay!

Today I worked on the bug where `git annex watch` turned regular files
that were already checked into git into symlinks. So I made it check
if a file is already in git before trying to add it to the annex.

The tricky part was doing this check quickly. Unless I want to write my
own git index parser (or use one from Hackage), this check requires running
`git ls-files`, once per file to be added. That won't fly if a huge
tree of files is being moved or unpacked into the watched directory.

Instead, I made it only do the check during `git annex watch`'s initial
scan of the tree. This should be OK, because once it's running, you
won't be adding new files to git anyway, since it'll automatically annex
new files. This is good enough for now, but there are at least two problems
with it:

* Someone might `git merge` in a branch that has some regular files,
  and it would add the merged in files to the annex.
* Once `git annex watch` is running, if you modify a file that was
  checked into git as a regular file, the new version will be added
  to the annex.

I'll probably come back to this issue, and may well find myself directly
querying git's index.
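
The scan-time check described above can be sketched roughly as follows. This
is an illustration in Python, not git-annex's actual Haskell code, and it
swaps in a different technique: rather than running `git ls-files` once per
file, it runs `git ls-files -z` a single time and keeps the result in a set.
The function names `parse_ls_files`, `tracked_files`, and `needs_annexing`
are hypothetical. Because the set is a snapshot taken at scan time, it goes
stale in exactly the two situations listed above.

```python
import subprocess


def parse_ls_files(raw):
    """Split NUL-separated `git ls-files -z` output into a set of paths."""
    return {p.decode("utf-8", "surrogateescape") for p in raw.split(b"\0") if p}


def tracked_files(repo):
    """Ask git once for every path in the index, relative to the repo root.

    Assumes `repo` is the top-level directory of a git working tree.
    """
    result = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo,
        check=True,
        capture_output=True,
    )
    return parse_ls_files(result.stdout)


def needs_annexing(candidates, tracked):
    """Keep only the scanned paths that git does not already track."""
    return [path for path in candidates if path not in tracked]
```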

---

I've started work to fix the memory leak I see when running `git annex watch`
in a large repository (40 thousand files). As always with a Haskell
memory leak, I crack open [Real World Haskell's chapter on profiling](http://book.realworldhaskell.org/read/profiling-and-optimization.html).

Eventually this yields a nice graph of the problem:

[[!img profile.png alt="memory profile"]]

So, it looks like a few minor memory leaks, and one huge leak. I stared
at this for a while, tried a few things, and got a much better result:

[[!img profile2.png alt="memory profile"]]

I may come back later and try to improve this further, but it's not bad memory
usage. Still, it's rather slow to start up in such a large repository,
and its initial scan is still doing too much work. I need to optimize
more.