Commit graph

189 commits

Author SHA1 Message Date
Joey Hess
b8f85f7a82 build watch on non-linux, just don't do anything 2012-06-06 22:49:32 -04:00
Joey Hess
c5b11561f0 handle running out of watch descriptors 2012-06-06 16:50:28 -04:00
Joey Hess
db8effb8f3 ignore .gitignore and .gitattributes 2012-06-06 15:50:12 -04:00
Joey Hess
993e6459a3 factor out nukeFile 2012-06-06 13:13:13 -04:00
Joey Hess
cbf16f1967 refactor 2012-06-04 19:43:29 -04:00
Joey Hess
5b4e5ce7e5 deletion
When a new file is annexed, a deletion event occurs when it's moved away
to be replaced by a symlink. Most of the time, there is no problimatic
race, because the same thread runs the add event as the deletion event.
So, once the symlink is in place, the deletion code won't run at all,
due to existing checks that a deleted file is really gone.

But there is a race at startup, as then the inotify thread is running
at the same time as the main thread, which does the initial tree walking
and annexing. It would be possible for the deletion inotify to run
in a perfect race with the addition, and remove the newly added symlink
from the git cache.

To solve this race, added event serialization via a MVar. We putMVar
before running each event, which blocks if an event is already running.
And when an event finishes (or crashes!), we takeMVar to free the lock.

Also, make rm -rf not spew warnings by passing --ignore-unmatch when
deleting directories.
2012-06-04 18:09:18 -04:00
Joey Hess
529a3721e1 refactor 2012-06-04 17:17:05 -04:00
Joey Hess
47f8f43715 workaround other part of moved directory problem
This fixes the scenario where:

* directory foo is moved away (and still watched)
* a new directory foo is made
* file (or directory) foo/bar is created
* the old directory's file (or directory) "bar" is deleted

We don't want a deletion event for foo/bar in this case.
2012-06-04 14:54:14 -04:00
Joey Hess
fa9d479fd1 add explict test that a closed file even is on a regular file
There are two reasons for this test. First, there could be a fifo or other
non-regular file that was closed.

Second, this test avoids ugliness when a subdirectory is moved out of the
top of the watch directory to elsewhere, and a file added to it.
Since the subdirectory has moved, the file won't be present under the
old location, and nothing will be done.

I cannot find a way to stop watching such directories, at least not without
a lot of pain. The inotify interface in Haskell doesn't make it easy to
stop watching a given subdirectory when it's moved out; it would require
keeping a map of all watch handles that is shared between threads.
This workaround avoids the problem in most cases; the only remaining case
being deletion of a file from a moved subdirectory.
2012-06-04 14:31:06 -04:00
Joey Hess
23dbff4b43 add events for symlink creation and directory removal
Improved the inotify code, so it will also notice directory removal
and symlink creation.

In the watch code, optimised away a stat of a file that's being added,
that's done by Command.Add.start. This is the reason symlink creation is
handled separately from file creation, since during initial tree walk
at startup, a stat was already done, and can be reused.
2012-06-04 13:22:56 -04:00
Joey Hess
3b09281b44 add dirContentsRecursive 2012-05-31 19:25:33 -04:00
Joey Hess
bb4f31a0ee Clean up handling of git directory and git worktree.
Baked into the code was an assumption that a repository's git directory
could be determined by adding ".git" to its work tree (or nothing for bare
repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are
used to separate the two.

This was attacked at the type level, by storing the gitdir and worktree
separately, so Nothing for the worktree means a bare repo.

A complication arose because we don't learn where a repository is bare
until its configuration is read. So another Location type handles
repositories that have not had their config read yet. I am not entirely
happy with this being a Location type, rather than representing them
entirely separate from the Git type. The new code is not worse than the
old, but better types could enforce more safety.

Added support for core.worktree. Overriding it with -c isn't supported
because it's not really clear what to do if a git repo's config is read, is
not bare, and is then overridden to bare. What is the right git directory
in this case? I will worry about this if/when someone has a use case for
overriding core.worktree with -c. (See Git.Config.updateLocation)

Also removed and renamed some functions like gitDir and workTree that
misused git's terminology.

One minor regression is known: git annex add in a bare repository does not
print a nice error message, but runs git ls-files in a way that fails
earlier with a less nice error message. This is because before --work-tree
was always passed to git commands, even in a bare repo, while now it's not.
2012-05-18 17:03:12 -04:00
Joey Hess
e36808e167 Pass -a to cp even when it supports --reflink=auto, to preserve permissions.
Amoung other things, this makes unlocking a WORM backed file and then
re-adding it without making any changes not add a new object, as the
timestamp is preserved.
2012-05-15 14:18:51 -04:00
Joey Hess
913775410b update 2012-05-02 11:43:30 -04:00
Joey Hess
5337c4e0c4 rsync protocol? 2012-05-02 11:38:27 -04:00
Joey Hess
0c9c14b52f percentage library 2012-04-29 17:48:07 -04:00
Joey Hess
1c16f616df Added shared cipher mode to encryptable special remotes.
This option avoids gpg key distribution, at the expense of flexability, and
with the requirement that all clones of the git repository be equally
trusted.
2012-04-29 14:02:43 -04:00
Joey Hess
f766774209 unbreak code inside ifdef 2012-04-22 11:22:20 -04:00
Joey Hess
42e4145a17 bugfixes 2012-04-22 01:20:17 -04:00
Joey Hess
84ac8c58db Add annex.httpheaders and annex.httpheader-command config settings
Allow custom headers to be sent with all HTTP requests.

(Requested by the Internet Archive)
2012-04-22 01:13:09 -04:00
Joey Hess
ed79596b75 noop 2012-04-21 23:32:33 -04:00
Joey Hess
bee420bd2d in which I discover void
void :: Functor f => f a -> f () -- ah, of course that's useful :)
2012-04-21 23:06:19 -04:00
Joey Hess
b98b69e8c6 honor core.sharedRepository when making all the other files in the annex
Lock files, directories, etc.
2012-04-21 19:36:03 -04:00
Joey Hess
7e45712d19 better file mode setting code 2012-04-21 16:01:56 -04:00
Joey Hess
b4a5e39ee6 Support git's core.sharedRepository configuration
This is incomplete, it does not honor it yet for hash directories
and other annex bookkeeping files. Some of that is not needed for a bare
repo; some of it may be.
2012-04-21 15:36:52 -04:00
Joey Hess
cbf9a9420e avoid chown call if the current file mode is same as new
Not only an optimisation. fsck always tried to preventWrite to make sure
file modes are good, and in a shared repo, that will fail on directories
not owned by the current user. Although there may be other problems with
such a setup.
2012-04-21 11:59:50 -04:00
Joey Hess
37061c019d tweak 2012-04-18 13:23:46 -04:00
Joey Hess
d75771b0ab clear errno after successful call
When preparing the debian stable backport, I am seeing a call to statvfs()
succeed, but also set errno to 2 (ENOENT). Not sure why this happens;
I am in a schroot when it does happen, or perhaps stable's libc is a little
broken and sets errno incorrectly. Anyway, it should be perfectly fine to
clear errno after the successful call, rather than before it.
2012-04-18 13:22:50 -04:00
Joey Hess
142bde13cd import juggling 2012-04-14 12:33:32 -04:00
Joey Hess
3642c72320 Renamed diskfree.c to avoid OSX case insensativity bug. 2012-04-13 11:26:39 -04:00
Joey Hess
6464a576cd allow add or del events to be ignored 2012-04-12 17:28:40 -04:00
Joey Hess
be4edbaaf1 allow excluding directories from being watched 2012-04-12 16:59:33 -04:00
Joey Hess
4ccc86669a don't fall over broken links 2012-04-12 16:46:57 -04:00
Joey Hess
d08a67a78e put in a warning so I remember to look at this 2012-04-12 15:48:59 -04:00
Joey Hess
1f34bf9acb add waitForTermination 2012-04-11 20:28:01 -04:00
Joey Hess
b133a76f96 recursive inotify thing
Recursive inotify has beaten me before, with its bad design and races,
but not this time! (I think.) This is able to follow the strongest
filesystem traffic I can throw at it, and robustly notices every file
add and delete. Mostly that's down to Haskell having a quite nice threaded
inotify library (that does its own buffering). A key insight was realizing
that the inotify directory add race could be dealt with by scanning for
files inside newly added directories.

TODO: Add support for freebsd/osx kqueue; see
http://hackage.haskell.org/package/kqueue

Can a git-annex-monitor be far off?
2012-04-11 20:09:38 -04:00
Joey Hess
62c69e7e25 Disable diskfree on kfreebsd, as I have a build failure on kfreebsd-i386 that is quite likely caused by it. 2012-04-07 15:50:34 -04:00
Joey Hess
42cc85a82f use struct statfs64 on OSX 2012-03-23 12:05:05 -04:00
Joey Hess
0f2052446f tweak 2012-03-22 23:02:20 -04:00
Joey Hess
6f514155f1 update 2012-03-22 23:02:20 -04:00
Joey Hess
d7d2fefbf2 break out kfreebsd
Now tested on linux and freebsd (amd64).
2012-03-22 17:32:47 -04:00
Joey Hess
e38a839a80 Rewrote free disk space checking code
Moving the portability handling into a small C library cleans up things
a lot, avoiding the pain of unpacking structs from inside haskell code.
2012-03-22 17:32:47 -04:00
Joey Hess
a362c46b70 fun with symbols
Nothing at all on hackage is using <&&> or <||>.

(Also, <&&> should short-circuit on failure.)
2012-03-17 00:38:40 -04:00
Joey Hess
771052a85e optimize monadic ||
(||) used applicative style runs both conditions rather than short
circuiting. Add an orM that properly short-circuits.
2012-03-16 12:28:17 -04:00
Joey Hess
b06336fa3a simplify 2012-03-16 02:12:56 -04:00
Joey Hess
c0c9991c9f nukes another 15 lines thanks to ifM 2012-03-15 20:39:25 -04:00
Joey Hess
60ab3d84e1 added ifM and nuked 11 lines of code
no behavior changes
2012-03-14 17:43:34 -04:00
Joey Hess
95a1f6b2ac check hook executability 2012-03-14 12:17:38 -04:00
Joey Hess
6fd0c0bfec move 2012-03-11 18:12:36 -04:00
Joey Hess
f9d44cccd9 perhaps more clear type 2012-03-10 11:38:38 -04:00
Joey Hess
10d9315b59 cleanup 2012-03-09 20:43:50 -04:00
Joey Hess
bca3fd65b9 fix key directory hash calculation code
Fix Key directory hash calculation code to behave as it did before version
3.20120227 when a key contains non-ascii.

The hash directories for a given Key are based on its md5sum.
Prior to ghc 7.4, Keys contained raw, undecoded bytes, so the md5sum was
taken of each byte in turn. With the ghc 7.4 filename encoding change,
keys contains decoded unicode characters (possibly with surrigates for
undecodable bytes). This changes the result of the md5sum, since the md5sum
used is pure haskell and supports unicode. And that won't do, as git-annex
will start looking in a different hash directory for the content of a key.

The surrigates are particularly bad, since that's essentially a ghc
implementation detail, so could change again at any time. Also, changing
the locale changes how the bytes are decoded, which can also change
the md5sum.

Symptoms would include things like:

* git annex fsck would complain that no copies existed of a file,
  despite its symlink pointing to the content that was locally present
* git annex fix would change the symlink to use the wrong hash
  directory.

Only WORM backend is likely to have been affected, since only it tends
to include much filename data (SHA1E could in theory also be affected).

I have not tried to support the hash directories used by git-annex versions
3.20120227 to 3.20120308, so things added with those versions with WORM
will require manual fixups. Sorry for the inconvenience!
2012-03-09 20:03:51 -04:00
Joey Hess
d6e77595ba factor out Utility.FileSystemEncoding 2012-03-09 19:08:10 -04:00
Joey Hess
789254747b refactor 2012-03-09 18:52:03 -04:00
Joey Hess
51338486dc Fix a bug in symlink calculation code, that triggered in rare cases where an annexed file is in a subdirectory that nearly matched to the .git/annex/object/xx/yy subdirectories.
This is a straight up pure-code stinker. The relative path calculation
looked for common subdirectories in the two paths, but failed to stop
after the paths diverged. When a later pair of subdirectories were the
same, the resulting relative path was wrong.

Added regression test for this.
2012-03-05 12:42:52 -04:00
Joey Hess
6c0155efb7 refactor 2012-02-20 15:22:21 -04:00
Joey Hess
a1e52f0ce5 hlint 2012-02-16 00:44:51 -04:00
Joey Hess
ecfcb41abe work around Network.Browser bug that converts a HEAD to a GET when following a redirect
The code explicitly switches from HEAD to GET for most redirects.
Possibly because someone misread a spec (which does require switching from
POST to GET for 303 redirects). Or possibly because the spec really is that
bad. Upstream bug: https://github.com/haskell/HTTP/issues/24

Since we absolutely don't want to download entire (large) files from
the web when checking that they exist with HEAD, I wrote my own redirect
follower, based closely on the one used by Network.Browser, but without
this misfeature.

Note that Network.Browser checks that the redirect url is a http url
and fails if not. I don't, because I want to not need to change this
code when it gets https support (related: I'm surprised to see it
doesn't support https yet..). The check does not seem security significant;
it doesn't support file:// urls for example. If a http url is redirected
to https, the Network.Browser will actually make a http connection again.
This could loop, but only up to 5 times.
2012-02-10 21:54:25 -04:00
Joey Hess
9030f68452 When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key.
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.
2012-02-10 19:23:41 -04:00
Joey Hess
56470ce3e5 really fix foreign C functions filename encodings
GHC should probably export withFilePath.
2012-02-04 14:30:28 -04:00
Joey Hess
f1c7dc1212 fix touch and statfs to work on any files in any locale
Use withCAString rather than withCString.

XXX Actually, this only works in non-unicode locales when presented with
unicode characters. Help?
2012-02-04 12:44:51 -04:00
Joey Hess
44b115e0b1 Merge branch 'master' into ghc7.4
Conflicts:
	Utility/Misc.hs
2012-02-03 16:48:40 -04:00
Joey Hess
146c36ca54 IO exception rework
ghc 7.4 comaplains about use of System.IO.Error to catch exceptions.
Ok, use Control.Exception, with variants specialized to only catch IO
exceptions.
2012-02-03 16:47:24 -04:00
Joey Hess
d8fb97806c support all filename encodings with ghc 7.4
Under ghc 7.4, this seems to be able to handle all filename encodings
again. Including filename encodings that do not match the LANG setting.
I think this will not work with earlier versions of ghc, it uses some ghc
internals.

Turns out that ghc 7.4 has a special filesystem encoding that it uses when
reading/writing filenames (as FilePaths). This encoding is documented
to allow  "arbitrary undecodable bytes to be round-tripped through it".

So, to get FilePaths from eg, git ls-files, set the Handle that is reading
from git to use this encoding. Then things basically just work.

However, I have not found a way to make Text read using this encoding.
Text really does assume unicode. So I had to switch back to using String
when reading/writing data to git. Which is a pity, because it's some
percent slower, but at least it works.

Note that stdout and stderr also have to be set to this encoding, or
printing out filenames that contain undecodable bytes causes a crash.
IMHO this is a misfeature in ghc, that the user can pass you a filename,
which you can readFile, etc, but that default, putStr of filename may
cause a crash!

Git.CheckAttr gave me special trouble, because the filenames I got back
from git, after feeding them in, had further encoding breakage.
Rather than try to deal with that, I just zip up the input filenames
with the attributes. Which must be returned in the same order queried
for this to work.

Also of note is an apparent GHC bug I worked around in Git.CheckAttr. It
used to forkProcess and feed git from the child process.  Unfortunatly,
after this forkProcess, accessing the `files` variable from the parent
returns []. Not the value that was passed into the function. This screams
of a bad bug, that's clobbering a variable, but for now I just avoid
forkProcess there to work around it. That forkProcess was itself only added
because of a ghc bug, #624389. I've confirmed that the test case for that
bug doesn't reproduce it with ghc 7.4. So that's ok, except for the new ghc
bug I have not isolated and reported. Why does this simple bit of code
magnet the ghc bugs? :)

Also, the symlink touching code is currently broken, when used on utf-8
filenames in a non-utf-8 locale, or probably on any filename containing
undecodable bytes, and I temporarily commented it out.
2012-02-03 16:23:20 -04:00
Joey Hess
a964012fc3 switch to the strict state monad
I had not realized what a memory leak the lazy state monad could be,
although I have not seen much evidence of actual leaking in git-annex.
However, if running git-annex on a great many files, this could matter.

The additional Utility.State.changeState adds even more strictness,
avoiding a problem I saw in github-backup where repeatedly modifying
state built up a huge pile of thunks.
2012-01-29 22:55:06 -04:00
Joey Hess
ce5637498f remove Utility.Conditional and use IfElse
This drops the >>! and >>? with the nice low fixity. IfElse does have
undocumented >>=>>! and >>=>>? operators, but I deem that too fishy.
Anyway, using whenM and unlessM is easier; I sometimes mixed the operators
up.
2012-01-24 16:22:07 -04:00
Joey Hess
ba6088b249 rename readMaybe to readish
a stricter (but also partial) readMaybe is getting added to base
2012-01-23 17:00:10 -04:00
Joey Hess
5e172b43c4 a few things available elsewhere... 2012-01-23 16:57:45 -04:00
Joey Hess
183bdacca2 treak 2012-01-21 02:49:32 -04:00
Joey Hess
25f998679c typo 2012-01-20 15:06:17 -04:00
Joey Hess
0eed604446 Add a sanity check for bad StatFS results.
git-annex FTBFS on s390, mips, powerpc, sparc. That StatFS code is failing
on all of them. At least on s390, the failure appears as:

Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 0,
fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0,
fsStatBytesUsed = 0})

While I don't understand why this is happening, or how to fix it,
bandaid over it by checking for obviously bad values and returning Nothing.
That disables disk free space checking, but at least git-annex will work.

Upstream bug: http://code.google.com/p/xmobar/issues/detail?id=70
2012-01-14 17:17:20 -04:00
Joey Hess
9ffd97442b don't use GPG_AGENT_INFO to force batch mode in test suite
Fails with gpg 2. Instead, use a different environment variable.

The clean fix would instead be to add an annex.gpg-options configuration.
But, that would be rather a lot of work and it's unlikely it would be
useful for much else.
2012-01-09 18:19:29 -04:00
Joey Hess
60c1aeeb6f Fix overbroad gpg --no-tty fix from last release.
Only set --no-tty when GPG_AGENT_INFO is set and batch mode is used.

In the test suite, set GPG_AGENT_INFO to /dev/null to avoid the test suite
relying on /dev/tty.
2012-01-07 12:38:08 -04:00
Joey Hess
769edd6b08 Run gpg with --no-tty. Closes: #654721 2012-01-05 13:44:09 -04:00
Joey Hess
1e929c022d typo 2012-01-03 18:40:47 -04:00
Joey Hess
ee554542c1 after is a better name for observe_ 2012-01-03 00:29:27 -04:00
Joey Hess
34abd7bca8 no implicit dotfiles in add
Dotfiles, and files inside dotdirs are not added by "git annex add" unless
the dotfile or directory is explicitly listed. So "git annex add ." will
add all untracked files in the current directory except for those in
dotdirs.

One reason for this is that it will make git-annex more usable with vcsh,
where you don't want "vcsh big annex add" to check in all the dotfiles
that are already versioned in other repositories.

(If you're using vcsh for repos that contain non-dotfiles, this won't help,
and you'll need to .gitignore such things, but this will cover the common
case.)

A more general reason why this seems like a good idea is the same reason ls
ignores dotfiles, just the unix convention that they are cruft that is kept
out of the way most of the time.

All the other git-annex commands still do deal with any dotfiles that do
get into the annex. This seemed right because if I've gone to the trouble
to add a dotfile, I will want "git annex get ." to get it along with
everything else.
2012-01-03 00:11:00 -04:00
Joey Hess
fc80b8d96b factor observe_ 2012-01-03 00:11:00 -04:00
Joey Hess
aa0882691b Added remote.name.annex-web-options configuration setting, which can be used to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same). 2012-01-02 14:20:20 -04:00
Joey Hess
442202dd6d add a new useful thing 2012-01-02 11:18:33 -04:00
Joey Hess
f015ef5fde cleanup 2011-12-23 01:08:19 -04:00
Joey Hess
7227dd8f21 add escape_var hack
Makes it easy to find files with duplicate contents, anyway.. :)
2011-12-23 01:08:19 -04:00
Joey Hess
db964e358f reorg 2011-12-23 01:08:18 -04:00
Joey Hess
cba3ce08df handle C-style escapes in Format
I was happily able to repurpose some code from Git.Filename to handle this.

I remember writing that code... a whole afternoon at a coffee shop, after
which I felt I'd struggled with Haskell and git, and sorta lost, in needing
to write this nasty peice of code. But was also pleased at the use of a
pair of functions and quickcheck that allowed me to get it 100% right.
So, turns out I not only got it right, but the code wasn't as special-purpose
as I'd feared. Yay!
2011-12-23 01:05:16 -04:00
Joey Hess
a0872a8ec3 better data type 2011-12-22 19:56:31 -04:00
Joey Hess
06bafae9e0 Format strings can be specified using the new --find option, to control what is output by git annex find. 2011-12-22 18:31:44 -04:00
Joey Hess
cf496f09ab add a text formatter
This is built for speed; a format string is parsed once, generating a
Format, that can be applied repeatedly to different sets of variables
to generate output.
2011-12-22 17:59:14 -04:00
Joey Hess
82a145df91 test encrypted special remote
This involved adding a test harness to run gpg with a dummy key, and lots
of fun.
2011-12-20 23:24:06 -04:00
Joey Hess
c11cfea355 split out Utility.Gpg with the generic gpg interface, from Crypto 2011-12-20 23:24:06 -04:00
Joey Hess
8e2f74f7ab update 2011-12-20 23:24:05 -04:00
Joey Hess
815d1318d7 comment 2011-12-20 23:24:05 -04:00
Joey Hess
dc5ed8d3b6 amusing name
This is both a partial Prelude that conflicts with the real one, and a
way to guard against the Prelude's partial functions.
2011-12-20 11:01:50 -04:00
Joey Hess
718a278f97 simplify 2011-12-15 22:19:05 -04:00
Joey Hess
9901fc04a0 move 2011-12-15 18:23:07 -04:00
Joey Hess
95d2391f58 more partial function removal
Left a few Prelude.head's in where it was checked not null and too hard to
remove, etc.
2011-12-15 18:19:36 -04:00
Joey Hess
817b1936e6 add beginning, end
Safe versions of init and last
2011-12-15 16:59:18 -04:00
Joey Hess
e7a555bf21 fix types 2011-12-15 15:39:33 -04:00
Joey Hess
2332afb4bc cleanup 2011-12-12 02:04:48 -04:00
Joey Hess
4b3c4c0c2b avoid some read 2011-12-09 18:57:09 -04:00
Joey Hess
28699c95a7 some work on avoiding partial functions
There are still hundreds of places that use partial functions head, tail,
init, and last.
2011-12-09 18:10:41 -04:00