This was a real PITA to fix, since location logs can be staged both in
the current repo and in the repos of local remotes, in which case the
cwd will not be in the repo. And git add needs different params in each
case when absolute paths are not used.
In passing, git annex fsck now stages location log fixes.
The test suite will not be run if it cannot be compiled.
It may be possible later to split the tests that use quickcheck off into
a separate program and keep most of the tests using just hunit.
The remaining leaks are in hS3. The leak with encryption was worked around
by the use of the temp file. (And was probably originally caused by
gpgCipherHandle sparking a thread which kept a reference to the start
of the byte string.)
* Update Debian build dependencies for ghc 7.
* Debian package is now built with S3 support. Thanks Joachim Breitner for
making this possible, also thanks Greg Heartsfield for working to improve
the hS3 library for git-annex.
Also hid a conflicting new symbol from Control.Monad.State.
This was a most surprising leak. It occurred in the process that is forked
off to feed data to gpg. That process was passed a lazy ByteString of
input, and ghc seemed to not GC the ByteString as it was lazily read
and consumed, so memory slowly leaked as the file was read and passed
through gpg to bup.
To fix it, I simply changed the feeder to take an IO action that returns
the lazy bytestring, and fed the result directly to hPut.
AFAICS, this should change nothing WRT buffering. But somehow it makes
ghc's GC do the right thing. Probably I triggered some weakness in ghc's
GC (version 6.12.1).
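
The change to the feeder described above might look roughly like the
following. This is only a minimal sketch: feedValue and feedAction are
hypothetical names, and the forking and the gpg pipe are left out, so it
shows the change in shape of the function rather than the actual code.

```haskell
import qualified Data.ByteString.Lazy as L
import System.IO (Handle)

-- Before (hypothetical name): the caller builds the lazy ByteString and
-- passes it in, so a reference to its start may stay live elsewhere.
feedValue :: Handle -> L.ByteString -> IO ()
feedValue h content = L.hPut h content

-- After (hypothetical name): the feeder is given an IO action, runs it
-- itself, and feeds the result directly to hPut.
feedAction :: Handle -> IO L.ByteString -> IO ()
feedAction h getContent = getContent >>= L.hPut h
```

Both versions write the same bytes; the only difference is where the
ByteString is produced.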
(Note that S3 still has this leak, and others too. Fixing it will involve
another dance with the type system.)
Update: One theory I have is that this has something to do with
the forking of the feeder process. Perhaps, when the ByteString
is produced before the fork, ghc decides it needs to hold a pointer
to the start of it, for some reason -- maybe it doesn't realize that
it is only used in the forked process.
Stalls were caused by code that did approximately:
	content' <- liftIO $ withEncryptedContent cipher content return
	store content'
The return evaluated without actually reading content from S3,
and so the cleanup code began waiting on gpg to exit before
gpg could send all its data.
Fixing it involved moving the `store` type action into the IO monad:
	liftIO $ withEncryptedContent cipher content store
Which was a bit of a pain to do, thank you type system, but
avoids the problem as now the whole content is consumed, and
stored, before cleanup.
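
The same shape can be reproduced with plain lazy file IO. This is only an
analogy, not the git-annex code: withLazyContent, storeTo and copyFixed are
hypothetical stand-ins, and here letting the content escape leads to
truncated content or an error rather than a stall. The root cause is the
same, though: cleanup runs before the lazy content has been read.

```haskell
import System.IO (IOMode(ReadMode), hGetContents, withFile)

-- Hypothetical stand-in for withEncryptedContent: hands lazily read
-- content to a callback, then runs cleanup (closing the handle).
withLazyContent :: FilePath -> (String -> IO a) -> IO a
withLazyContent path use = withFile path ReadMode $ \h ->
  hGetContents h >>= use

-- Hypothetical stand-in for store: writing the file consumes all content.
storeTo :: FilePath -> String -> IO ()
storeTo dest content = writeFile dest content

-- Problematic shape: the content escapes unread, and cleanup runs first.
--   content' <- withLazyContent src return
--   storeTo dest content'

-- Fixed shape: the consumer runs inside the callback, so the whole
-- content is read and stored before cleanup. src and dest must differ.
copyFixed :: FilePath -> FilePath -> IO ()
copyFixed src dest = withLazyContent src (storeTo dest)
```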
Since the queue is flushed in between subcommand actions being run,
there should be no issues with actions that expect to queue up some stuff
and have it run after they do other stuff. So I didn't have to audit for
such assumptions.
to avoid some issues with git on OSX with the mixed-case directories. No
migration is needed; the old mixed case hash directories are still read;
new information is written to the new directories.
So, it would be nicer to just use Cabal and take advantage
of its conditional compilation support. But, Cabal seems to
lack good support for a package with an internal library that is used by
multiple executables. It wants to build everything twice or more.
That's too slow for me.
Anyway, fairly soon, I expect to upgrade hS3 to a requirement, and I
can just revert this.
For example, this could happen if, using SHA1, a file with content
"foo" was added to that backend, and then a file with content "foo" was
migrated from the WORM backend.
Assume that, if a backend assigned the same key, the already annexed
content must be the same. So, the "old" content can be reused.
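
As a toy model of that rule (not the actual implementation), the annex can
be pictured as a map from keys to content. The Key and Store types, the
addOrReuse name and the key strings used with it are all hypothetical.

```haskell
import qualified Data.Map as M

-- Hypothetical toy types: a key is just a string, and the store maps
-- each key to its annexed content.
type Key = String
type Store = M.Map Key String

-- Add content under a key. If the key is already present, assume the
-- already annexed content is the same and keep it, ignoring the new copy.
addOrReuse :: Key -> String -> Store -> Store
addOrReuse key content store = M.insertWith keepOld key content store
  where
    keepOld _new old = old
```

So migrating a second file whose content hashes to an existing key leaves
the store unchanged.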