Merge branch 'master' into cabal-man-pages

Nathan Collins 2012-06-11 00:18:48 -07:00
commit 6eb4f35c03
6 changed files with 120 additions and 8 deletions

View file

@ -1,5 +1,5 @@
PREFIX=/usr
IGNORE=-ignore-package monads-fd
IGNORE=-ignore-package monads-fd -ignore-package monads-tf
BASEFLAGS=-Wall $(IGNORE) -outputdir tmp -IUtility -DWITH_S3
GHCFLAGS=-O2 $(BASEFLAGS)

View file

@ -0,0 +1,57 @@
After a few days otherwise engaged, back to work today.
My focus was on adding the committing thread mentioned in [[day_4__speed]].
I got rather further than expected!
First, I implemented a really dumb thread, that woke up once per second,
checked if any changes had been made, and committed them. Of course, this
rather sucked. In the middle of a large operation like untarring a tarball,
or `rm -r` of a large directory tree, it made lots of commits and made
things slow and ugly. This was not unexpected.
So next, I added some smarts to it. First, I wanted to stop it waking up
every second when there was nothing to do, and instead have it block until a
change occurs. Secondly, I wanted it to know when past changes happened,
so it could detect batch mode scenarios, and avoid committing too
frequently.
I played around with combinations of various Haskell thread communication
tools to get that information to the committer thread: `MVar`, `Chan`,
`QSem`, `QSemN`. Eventually, I realized all I needed was a simple channel
through which the timestamps of changes could be sent. However, `Chan`
wasn't quite suitable, and I had to add a dependency on
[Software Transactional Memory](http://en.wikipedia.org/wiki/Software_Transactional_Memory),
and use a `TChan`. Now I'm cooking with gas!
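As a rough illustration of the idea (not the assistant's actual code), a timestamp channel built on `TChan` might look like the sketch below. The names `ChangeChan`, `recordChange` and `getChanges` are hypothetical. One thing STM makes easy, and possibly part of why plain `Chan` was not quite suitable, is blocking until something arrives and then draining everything pending in a single atomic step.

```haskell
import Control.Concurrent.STM
import Data.Time.Clock (UTCTime, getCurrentTime)

-- Hypothetical name: a channel carrying the times at which changes were made.
type ChangeChan = TChan UTCTime

-- Called by the watcher each time it stages a change.
recordChange :: ChangeChan -> IO ()
recordChange chan = getCurrentTime >>= atomically . writeTChan chan

-- Called by the committer thread. Blocks while the channel is empty,
-- then returns every queued timestamp, all in one transaction.
getChanges :: ChangeChan -> IO [UTCTime]
getChanges chan = atomically $ do
    t <- readTChan chan          -- retries (blocks) until a change arrives
    ts <- drain
    return (t : ts)
  where
    drain = do
        next <- tryReadTChan chan
        case next of
            Nothing -> return []
            Just t -> fmap (t :) drain
```

With something like this, the committer thread can loop on `getChanges` and never wake up while nothing is happening.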
With that data channel available to the committer thread, it quickly got
some very nice smart behavior. Playing around with it, I find it commits
*instantly* when I'm making some random change that I'd want the
git-annex assistant to sync out instantly; and that its batch job detection
works pretty well too.
There's surely room for improvement, and I made this part of the code
be an entirely pure function, so it's really easy to change the strategy.
This part of the committer thread is so nice and clean, that here's the
current code, for your viewing pleasure:
[[!format haskell """
{- Decide if now is a good time to make a commit.
 - Note that the list of change times has an undefined order.
 -
 - Current strategy: If there have been 10 changes within the past second,
 - a batch activity is taking place, so wait for later.
 -}
shouldCommit :: UTCTime -> [UTCTime] -> Bool
shouldCommit now changetimes
    | len == 0 = False
    | len > 4096 = True -- avoid bloating queue too much
    | length (filter thisSecond changetimes) < 10 = True
    | otherwise = False -- batch activity
  where
    len = length changetimes
    thisSecond t = now `diffUTCTime` t <= 1
"""]]
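Because the function is pure, it can be exercised with made-up timestamps and no threads at all. The sketch below repeats `shouldCommit` only so that it compiles on its own; `exampleNow`, `changesAged` and `examples` are hypothetical helpers invented for illustration.

```haskell
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime(..), NominalDiffTime, addUTCTime, diffUTCTime)

-- Copy of the function from the post, so the example is self-contained.
shouldCommit :: UTCTime -> [UTCTime] -> Bool
shouldCommit now changetimes
    | len == 0 = False
    | len > 4096 = True
    | length (filter thisSecond changetimes) < 10 = True
    | otherwise = False
  where
    len = length changetimes
    thisSecond t = now `diffUTCTime` t <= 1

-- Hypothetical fixed "now", so the results do not depend on the clock.
exampleNow :: UTCTime
exampleNow = UTCTime (fromGregorian 2012 6 11) 0

-- Hypothetical helper: n changes, each `age` seconds older than `now`.
changesAged :: UTCTime -> NominalDiffTime -> Int -> [UTCTime]
changesAged now age n = replicate n (addUTCTime (negate age) now)

examples :: [(String, Bool)]
examples =
    [ ("no changes", shouldCommit exampleNow [])
    , ("3 changes in the last second", shouldCommit exampleNow (changesAged exampleNow 0.5 3))
    , ("50 changes in the last second", shouldCommit exampleNow (changesAged exampleNow 0.5 50))
    , ("50 changes from 30 seconds ago", shouldCommit exampleNow (changesAged exampleNow 30 50))
    , ("5000 changes in the last second", shouldCommit exampleNow (changesAged exampleNow 0.5 5000))
    ]
```

Here `map snd examples` evaluates to `[False,True,False,True,True]`: nothing to commit, an isolated edit committed right away, a batch held back, a finished batch committed, and an oversized queue flushed regardless.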
Still some polishing to do to eliminate minor inefficiencies and deal
with more races, but this part of the git-annex assistant is now very usable,
and will be going out to my beta testers soon!

View file

@ -0,0 +1,16 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkq0-zRhubO6kR9f85-5kALszIzxIokTUw"
nickname="James"
subject="Cloud Service Limitations"
date="2012-06-11T02:15:04Z"
content="""
Hey Joey!
I'm not very tech savvy, but here is my question.
I think for all cloud service providers, there is an upload limitation on how big one file may be.
For example, I can't upload a file bigger than 100 MB on box.net.
Does this affect git-annex at all? Will git-annex automatically split the file depending on the cloud provider or will I have to create small RAR archives of one large file to upload them?
Thanks!
James
"""]]

View file

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.153.8.126"
subject="re: cloud"
date="2012-06-11T04:48:08Z"
content="""
Yes, git-annex has to split files for certain providers. I already added support for this as part of my first pass at supporting box.com, see [[tips/using_box.com_as_a_special_remote]].
"""]]

View file

@ -23,10 +23,11 @@ really useful, it needs to:
is exceeded. This can be tuned by root, so help the user fix it.
**done**
- periodically auto-commit staged changes (avoid autocommitting when
lots of changes are coming in)
- tunable delays before adding new files, etc
- coalesce related add/rm events for speed and less disk IO
lots of changes are coming in) **done**
- coalesce related add/rm events for speed and less disk IO **done**
- don't annex `.gitignore` and `.gitattributes` files **done**
- run as a daemon **done**
- tunable delays before adding new files, etc
- configurable option to only annex files meeting certain size or
filename criteria
- option to check files not meeting annex criteria into git directly
@ -107,7 +108,3 @@ Many races need to be dealt with by this code. Here are some of them.
Not a problem; The removal event removes the old file from the index, and
the add event adds the new one.
* At startup, `git add --update` is run, to notice deleted files.
Then inotify starts up. Files deleted in between won't have their
removals staged.

View file

@ -0,0 +1,34 @@
This is a continuation of the conversation from [[the comments|design/assistant/#comment-77e54e7ebfbd944c370173014b535c91]] section in the design of the git-annex assistant. In summary, I've set up an auto builder which should help [[Joey]] have an easier time developing git-annex on non-Linux/Debian platforms. This builder is currently running on OS X 10.7 with the 64-bit version of the Haskell Platform.
The builder output can be found at <http://www.sgenomics.org/~jtang/gitbuilder-git-annex-x00-x86_64-apple-darwin10.8.0/>. The CGI on this site does not work, because my OS X workstation pushes the output from another location.
The builder currently tries to build all branches except:
* debian-stable
* pristine-tar
* setup
It does not build any of the tags either. Joey had suggested ignoring the bpo-named tags, but for now it's easier for me not to build any tags. To continue this discussion, if anyone wants to set up a gitbuilder instance, here is the build.sh script that I am using.
<pre>
#!/bin/bash -x
# Macports
export PATH=/opt/local/bin:$PATH
# Haskell userland
export PATH=$PATH:$HOME/.cabal/bin
# Macports gnu
export PATH=/opt/local/libexec/gnubin:$PATH
make || exit 3
make -q test
if [ "$?" = 1 ]; then
    # run "make test", but give it a time limit in case a test gets stuck
    ../maxtime 1800 make test || exit 4
fi
</pre>
It also uses the branches-local script for sorting and prioritising the branches to build. This script can be found in the [autobuild-ceph](https://github.com/ceph/autobuild-ceph/blob/master/branches-local) repository. If other people are interested in setting up their own instances of gitbuilder for git-annex, please let me know and I will set up an aggregator page to collect the status of the builds. The builder runs and updates the webpage every 30 minutes.