Merge branch 'master' into watch

Joey Hess 2012-06-15 15:20:11 -04:00
commit 53d2e81ffd
22 changed files with 249 additions and 24 deletions

@ -73,7 +73,7 @@ download url file = do
liftIO $ createDirectoryIfMissing True (parentDir tmp)
stopUnless (downloadUrl [url] tmp) $ do
backend <- chooseBackend file
let source = KeySource { keyFilename = file, contentLocation = file}
let source = KeySource { keyFilename = file, contentLocation = tmp }
k <- genKey source backend
case k of
Nothing -> stop

@ -100,8 +100,6 @@ clean:
rm -rf tmp $(bins) $(mans) test configure *.tix .hpc $(sources) \
doc/.ikiwiki html dist $(clibs)
# Workaround for `cabal sdist` requiring all included files to be listed
# in .cabal.
sdist: clean $(mans)
./make-sdist.sh

debian/changelog
@ -1,12 +1,19 @@
git-annex (3.20120612) UNRELEASED; urgency=low
git-annex (3.20120616) UNRELEASED; urgency=low
* watch: New subcommand, which uses inotify to watch for changes to
files and automatically annexes new files, etc, so you don't need
to manually run git commands when manipulating files.
-- Joey Hess <joeyh@debian.org> Tue, 12 Jun 2012 11:35:59 -0400
git-annex (3.20120614) unstable; urgency=medium
* addurl: Was broken by a typo introduced 2 releases ago, now fixed.
Closes: #677576
* Install man page when run by cabal, in a location where man will
find it, even when installing under $HOME. Thanks, Nathan Collins
-- Joey Hess <joeyh@debian.org> Tue, 12 Jun 2012 11:35:59 -0400
-- Joey Hess <joeyh@debian.org> Thu, 14 Jun 2012 20:21:29 -0400
git-annex (3.20120611) unstable; urgency=medium

@ -14,6 +14,18 @@ available in the App Store.
* git (not all git commands are needed,
but core plumbing and a few like `git-add` are.)
## GHC Android?
Android's native SDK does not use glibc. GHC's runtime links with glibc.
This could be an enormous problem. Other people want to see GHC able to
target Android, of course, so someone may solve it before I get stuck on
it.
References:
* <http://stackoverflow.com/questions/5151858/running-a-haskell-program-on-the-android-os>
* <http://www.reddit.com/r/haskell/comments/ful84/haskell_on_android/>
### Android specific features
The app should be aware of power status, and avoid expensive background
@ -57,8 +69,6 @@ version git-annex dependend upon existing on the phone. (Maybe the phone
would have to be always considered an untrusted repo, which probably
makes sense anyway.)
Problem:
#### crazy `LD_PRELOAD` wrapper
Need I say more? (Also, Android's linker may not even support it.)

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
nickname="Richard"
subject="Battery usage"
date="2012-06-15T09:57:33Z"
content="""
Complete fsck is good, but once a week is probably enough.
But please see if you can make fsck optional depending on if the machine is running on battery.
"""]]

@ -0,0 +1,30 @@
git merge watch_
My cursor has been mentally poised here all day, but I've been reluctant to
merge watch into master. It seems solid, but is it correct? I was able to
think up a lot of races it'd be subject to, and deal with them, but did I
find them all?
Perhaps I need to do some automated fuzz testing to reassure myself.
I looked into using [genbackupdata](http://liw.fi/genbackupdata/) to that
end. It's not quite what I need, but could be
[moved in that direction](http://bugs.debian.org/677542). Or I could write
my own fuzz tester, but it seems better to use someone else's, because
a) laziness and b) they're less likely to have the same blind spots I do.
My reluctance to merge isn't helped by the known bugs with files that are
either already open before `git annex watch` starts, or are opened by two
processes at once, and confuse it into annexing the still-open file when one
process closes it.
I've been thinking about just running `lsof` on every file as it's being
annexed to check for that, but in the end, `lsof` is too slow. Since its
check involves trawling through all of /proc, it takes it a good half a
second to check a file, and adding 25 seconds to the time it takes to
process 100 files is just not acceptable.
But an option that could work is to run `lsof` after a bunch of new files
have been annexed. It can check a lot of files nearly as fast as a single
one. In the rare case that an annexed file is indeed still open, it could
be moved back out of the annex. Then when its remaining writer finally
closes it, another inotify event would re-annex it.
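
The batched check could be sketched in shell like this. It is only an illustration, not how git-annex does it: `open_for_write` is a made-up name, and the parsing assumes `lsof -F an` output, where each file set starts with an `f` line, the access mode arrives on an `a` line (`w` for write, `u` for read and write), and the file name on an `n` line.

```shell
# Hypothetical helper: read `lsof -F an` output on stdin and print the
# name of every file that some process holds open for writing.
open_for_write() {
    awk '
        /^f/ { mode = "" }                 # a new file set starts
        /^a/ { mode = substr($0, 2) }      # access mode: r, w, or u
        /^n/ { if (mode ~ /[wu]/) print substr($0, 2) }
    ' | sort -u
}

# Intended use, one lsof call for the whole batch of new files:
#   lsof -F an -- "$@" | open_for_write
```

Keeping the parsing separate from the `lsof` call means the cost of trawling /proc is paid once per batch, and the parser can be tried on canned output.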

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://wiggy.net/"
nickname="Wichert"
subject="os compatibility"
date="2012-06-15T07:19:23Z"
content="""
A downside of relying on lsof is that you might be painting yourself into a Linux corner: other operating systems might not have lsof or an alternative you can rely on. Especially for Windows this might be a worry.
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="http://dieter-be.myopenid.com/"
nickname="dieter"
subject="filesystem number of open file handles on a file"
date="2012-06-15T08:21:37Z"
content="""
Wasn't there some filesystem functionality that could tell you the number of open file handles on a certain file? I thought this was tracked per-file too.
Or maybe I'm just confusing it with the number of hard links (which stat can tell you); anyway, something to look into.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkSq2FDpK2n66QRUxtqqdbyDuwgbQmUWus"
nickname="Jimmy"
subject="comment 3"
date="2012-06-15T08:58:17Z"
content="""
I would also be reluctant to use lsof, for the sake of non-Linux systems or systems that don't have lsof. I've only been playing around with the watch branch on my \"other\" laptop under Arch Linux. It looks usable; however, I would prefer support for OSX before the watch branch gets merged to master ;)
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
nickname="Richard"
subject="comment 4"
date="2012-06-15T10:21:17Z"
content="""
Corner case, but if the other program finishes writing while you are annexing and your check shows no open files, you are left with a bad checksum on a correct file. This \"broken\" file will propagate, and the next round of fsck will show that all copies are \"bad\".
Without verifying if this is viable, could you set the file RO and thus block future writes before starting to annex?
"""]]

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.6.135"
subject="comment 5"
date="2012-06-15T15:14:52Z"
content="""
@wichert All this inotify stuff is entirely Linux specific AFAIK anyway, so it's fine for workarounds to limitations in inotify functionality to also be Linux specific.
@dieter I think you're thinking of hard links, filesystems don't track number of open file handles afaik.
@Jimmy, I'm planning to get watch going on freebsd (and hopefully that will also cover OSX), after merging it :)
@Richard, the file is set RO while it's being annexed, so any lsof would come after that point.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.6.135"
subject="comment 6"
date="2012-06-15T15:23:21Z"
content="""
But Rich is right, and I was thinking the same thing earlier this morning, that delaying the lsof allows the writer to change the file and exit, and only fsck can detect the problem then. Setting file permissions doesn't help once a process already has it open for write. Which has put me off the delayed lsof idea unfortunately. lsof *could* be run safely during the initial annexing.
"""]]

@ -18,8 +18,20 @@ There is a `watch` branch in git that adds the command.
Possible fixes:
* Somehow track or detect if a file is open for write by any processes.
`lsof` could be used, although it would be a little slow, and not
avoid every possible race.
`lsof` could be used, although it would be a little slow.
Here's one way to avoid the slowdown: When a file is being added,
set it read-only, and hard-link it into a quarantine directory,
remembering both filenames.
Then use the batch change mode code to detect batch adds and bundle
them together.
Just before committing, lsof the quarantine directory. Any files in
it that are still open for write can just have their write bit turned
back on and be deleted from quarantine, to be handled when their writer
closes. Files that pass quarantine get added as usual. This avoids
repeated lsof calls slowing down adds, but does add a constant factor
overhead (0.25 seconds lsof call) before any add gets committed.
* Or, when possible, making a copy-on-write copy before adding the file
would avoid this.
* Or, as a last resort, make an expensive copy of the file and add that.
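
The quarantine step in the first fix could be sketched in shell as follows. This is a rough illustration under stated assumptions, not the actual implementation: the helper names are made up, the quarantine directory is assumed to live on the same filesystem as the file (hard links cannot cross filesystems), and naming each link after the file's inode number is just a convenient way to get a unique name.

```shell
# Hypothetical helper: make FILE read-only and hard-link it into QDIR.
# Prints "link<TAB>file" so the caller can remember both filenames.
quarantine_add() {
    file=$1 qdir=$2
    chmod a-w "$file" || return 1
    inode=$(ls -di -- "$file" | awk '{ print $1 }')
    ln -- "$file" "$qdir/$inode" || return 1
    printf '%s\t%s\n' "$qdir/$inode" "$file"
}

# Hypothetical helper: a file that is still open for write leaves
# quarantine. Its write bit is turned back on and the link is deleted,
# so it can be handled later, when its writer closes it.
quarantine_release() {
    link=$1 file=$2
    chmod u+w "$file" && rm -f -- "$link"
}
```

Files that are not released would then be added as usual after the single lsof run over the quarantine directory.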

@ -0,0 +1,4 @@
Is there an easy way to export annexed files out of the repository? (e.g. to make a copy elsewhere, send a file by email...)
Thanks,
Denis.

@ -10,6 +10,7 @@
* [[NixOS]]
* [[Gentoo]]
* Windows: [[sorry, not possible yet|todo/windows_support]]
* [[ScientificLinux5]] - This should cover RHEL5 clones such as CentOS5 and so on
## Using cabal

@ -0,0 +1,70 @@
I was waiting for my backups to finish, hence this article. I use
_git-annex_ to manage my files, and I decided I needed to have
git-annex on an SL5-based machine. SL5 is just an open-source
clone/recompile of RHEL5.
I haven't tried to install the newer versions of Haskell Platform and
GHC in a while on SL5 to install git-annex. But the last time I checked
when GHC7 was out, it was a pain to install GHC on SL5.
However I have discovered that someone has gone through the trouble of
packaging up GHC and Haskell Platform for RHEL based distros.
* <http://justhub.org/download> - Packaged GHC and Haskell Platform
RPMs for RHEL based systems.
I'm primarily interested in installing _git-annex_ on SL5 based
systems. The installation process goes as such...
First install GHC and Haskell Platform (you need root for the
following steps)
$ wget http://sherkin.justhub.org/el5/RPMS/x86_64/justhub-release-2.0-4.0.el5.x86_64.rpm
$ rpm -ivh justhub-release-2.0-4.0.el5.x86_64.rpm
$ yum install haskell
The RPMs don't place the files in /usr/bin, so you must add the
following to your .bashrc (from here on you don't need root if you
don't want things to be system wide)
$ export PATH=/usr/hs/bin:$PATH
On SL5, pcre is at version 6.6, which is far too old for one of the
dependencies that git-annex requires. You must therefore install
an updated version of _pcre_, either from source or by another method. I
chose to install it from source, by hand, into /usr/local
$ wget http://sourceforge.net/projects/pcre/files/pcre/8.30/pcre-8.30.tar.gz/download
$ tar zxvf pcre-8.30.tar.gz
$ cd pcre-8.30
$ ./configure
$ make && make install
Once the packages are installed and in your execution path, using
cabal to configure and build git-annex makes life easier; it
should install all the needed dependencies.
$ cabal update
$ cabal install pcre-light --extra-include-dirs=/usr/local/include
$ git clone git://git.kitenet.net/git-annex
$ cd git-annex
$ make git-annex.1
$ cabal configure
$ cabal build
$ cabal install
Or if you want to install it globally for everyone (otherwise it will
get installed into $HOME/.cabal/bin)
$ cabal install --global
The above will take a while to compile and install the needed
dependencies. I suggest that anyone who does this run the tests
that come with git-annex to make sure everything is functioning as
expected.
I haven't had a chance or need to install git-annex on a SL6 based
system yet, but I would assume something similar to the above steps
would be required to do so.
The above is almost a cut and paste of <http://jcftang.github.com/2012/06/15/installing-git-annex-on-sl5/>. It could probably be refined, but it is what worked for me on SL5. Please feel free to re-edit it, chopping out useless bits of text or adding to it!

@ -1,12 +0,0 @@
git-annex 3.20120430 released with [[!toggle text="these changes"]]
[[!toggleable text="""
* Fix use of annex.diskreserve config setting.
* Directory special remotes now check annex.diskreserve.
* Support git's core.sharedRepository configuration.
* Add annex.http-headers and annex.http-headers-command config
settings, to allow custom headers to be sent with all HTTP requests.
(Requested by the Internet Archive)
* uninit: Clear annex.uuid from .git/config. Closes: #[670639](http://bugs.debian.org/670639)
* Added shared cipher mode to encryptable special remotes. This option
avoids gpg key distribution, at the expense of flexibility, and with
the requirement that all clones of the git repository be equally trusted."""]]

@ -0,0 +1,6 @@
git-annex 3.20120614 released with [[!toggle text="these changes"]]
[[!toggleable text="""
* addurl: Was broken by a typo introduced 2 releases ago, now fixed.
Closes: #[677576](http://bugs.debian.org/677576)
* Install man page when run by cabal, in a location where man will
find it, even when installing under $HOME. Thanks, Nathan Collins"""]]

@ -0,0 +1,16 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawmURXBzaYE1gmVc-X9eLAyDat_6rHPl670"
nickname="Bram"
subject="Error when installing from Hackage"
date="2012-06-15T17:39:45Z"
content="""
I get this error when trying to install (actually upgrade) from Hackage:
bram@falafel% cabal install git-annex
Resolving dependencies...
Downloading git-annex-3.20120614...
cabal: Error: some packages failed to install:
git-annex-3.20120614 failed while unpacking the package. The exception was:
user error (File in tar archive is not in the expected directory
\"git-annex-3.20120614\")
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.6.135"
subject="comment 2"
date="2012-06-15T18:21:03Z"
content="""
My, cabal is picky about the tarballs it will accept. Doesn't understand longlinks in tarballs. I've uploaded a new release.
"""]]

@ -1,5 +1,5 @@
Name: git-annex
Version: 3.20120612
Version: 3.20120614
Cabal-Version: >= 1.8
License: GPL
Maintainer: Joey Hess <joey@kitenet.net>

@ -1,4 +1,7 @@
#!/bin/bash
#!/bin/sh
#
# Workaround for `cabal sdist` requiring all included files to be listed
# in .cabal.
# Create target directory
sdist_dir=git-annex-$(grep '^Version:' git-annex.cabal | sed -re 's/Version: *//')
@ -6,8 +9,13 @@ mkdir --parents dist/$sdist_dir
find . \( -name .git -or -name dist -or -name cabal-dev \) -prune \
-or -not -name \\*.orig -not -type d -print \
| perl -ne 'print unless length >= 100' \
| perl -ne "print unless length >= 100 - length q{$sdist_dir}" \
| xargs cp --parents --target-directory dist/$sdist_dir
cd dist
tar -caf $sdist_dir.tar.gz $sdist_dir
# Check that tarball can be unpacked by cabal.
# It's picky about tar longlinks etc.
rm -rf $sdist_dir
cabal unpack $sdist_dir.tar.gz
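
The length filter in the pipeline above can be tried in isolation. A minimal sketch, assuming the motivation is the 100-byte name field of the classic tar header (longer names need extensions such as GNU longlinks, which the comment above says cabal is picky about). `short_enough` is a made-up name, and the awk expression is only meant to mirror the perl one-liner, not replace it.

```shell
# Hypothetical helper: pass through only the paths on stdin that stay
# under the limit once PREFIX (the sdist directory name) is counted.
short_enough() {
    prefix=$1
    # The +1 mirrors perl -n, whose length includes the trailing newline.
    awk -v max=$((100 - ${#prefix})) 'length($0) + 1 < max'
}

# Example:
#   find . -type f -print | short_enough git-annex-3.20120614
```

With a 20-character directory name, paths of up to 78 characters pass and longer ones are dropped from the tarball.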