Commit graph

818 commits

Author SHA1 Message Date
Joey Hess
bf72760af2 dead: Remove dead repository from all groups.
This is less expensive than having inallgroup weed out dead repositories.
2012-10-10 15:39:13 -04:00
Joey Hess
9da7dd8874 webapp: configure new repos to use the standard preferred content settings 2012-10-10 15:35:10 -04:00
Joey Hess
b6ce003843 rename --ingroup to --inallgroup 2012-10-10 12:59:45 -04:00
Joey Hess
558a69d34d releasing version 3.20121009 2012-10-09 15:43:36 -04:00
Joey Hess
a5781fd9ba webapp automatic grouping
webapp: Adds newly created repositories to one of these groups:
clients, drives, servers

This is heuristic, but it's a pretty good heuristic, and can always be
configured.
2012-10-09 14:24:17 -04:00
Joey Hess
5ac15149cc assistant: Now honors preferred content settings when deciding what to transfer.
Both when queueing downloads, and uploads, consults the preferred content
settings.

I didn't make it check yet when requeuing failed transfers or queuing
deferred downloads; dealing with the preferred content settings (or indeed,
other settings) changing while the assistant is running still needs work.
2012-10-09 12:18:41 -04:00
Joey Hess
a3c9b16195 simplify changelog 2012-10-08 16:14:55 -04:00
Joey Hess
17543f6e80 drop --auto --from with preferred content
With --from, it needs to examine the preferred content of the repository
being dropped from, instead of the local repository.
2012-10-08 15:34:44 -04:00
Joey Hess
e375b931c0 add --ingroup limit 2012-10-08 15:18:58 -04:00
Joey Hess
7cd81bd978 Added --smallerthan and --largerthan limits 2012-10-08 13:39:18 -04:00
Joey Hess
71fd18a97f wired preferred content up to get, copy, and drop --auto 2012-10-08 13:16:53 -04:00
Joey Hess
18c9de5e14 Merge branch 'master' into safesemaphore
Conflicts:
	debian/changelog
2012-10-07 17:36:58 -04:00
Joey Hess
34e7faf71a uninit: Unset annex.version. Closes: #689852 2012-10-07 16:04:03 -04:00
Joey Hess
33a2af36f2 Depend on and use the Haskell SafeSemaphore library, which provides exception-safe versions of SampleVar and QSemN. Thanks, Ben Gamari for an excellent patch set. 2012-10-05 17:50:17 -04:00
Joey Hess
2b0423e13f Only build-depend on libghc-clientsession-dev on arches that will have the webapp. 2012-10-04 17:08:43 -04:00
Joey Hess
7a7f63182c vicfg: New command, allows editing (or simply viewing) most of the repository configuration settings stored in the git-annex branch.
Incomplete; I need to finish parsing and saving. This will also be used
for editing transfer control expressions.

Removed the group display from the status output, I didn't really
like that format, and vicfg can be used to see as well as edit repository
group membership.
2012-10-03 17:04:52 -04:00
Joey Hess
9aab70de66 always check with ls-files before adding new files
Makes it safe to use git annex unlock with the watcher/assistant.
And also to mix use of the watcher/assistant with regular files stored in git.

Long ago, I had avoided doing this check, except during the startup scan,
because it would be slow to run ls-files repeatedly.

But then I added the lsof check, and to make that fast, got it to detect
batch file adds. So let's move the ls-files check to also occur when it'll
have a batch, and can check them all with one call.

This does slow down adding a single file by just a bit, but really only
a little bit. (The lsof check is probably more expensive.) It also
speeds up the startup scan, especially when there are lots of new files
found by the scan.

Also, fixed the sleep for annex.delayadd to not run while the threadstate
lock is held, so it doesn't unnecessarily freeze everything else.

Also, --force no longer makes it skip the lsof check, which was not
documented, and seems never a good idea.
2012-10-02 17:41:23 -04:00
Joey Hess
eeaa8dada8 A way to match files in repositories in a group
--copies=group:number can now be used to match files that are present in a
specified number of repositories in a group.
2012-10-01 18:25:11 -04:00
Joey Hess
2a96b1aab3 group, ungroup: New commands to indicate groups of repositories. 2012-10-01 15:12:04 -04:00
Joey Hess
e0432bc140 releasing version 3.20121001 2012-10-01 14:12:31 -04:00
Joey Hess
0ea56761a9 typo 2012-10-01 13:50:45 -04:00
Joey Hess
5849c3f24b Avoid building the webapp on Debian architectures that do not yet have template haskell and thus yesod. (Should be available for arm soonish I hope). 2012-09-29 01:28:02 -04:00
Joey Hess
67c04a443e reorg 2012-09-28 16:08:01 -04:00
Joey Hess
1117583087 The Makefile now builds with the new yesod by default.
Systems like Debian that have the old yesod 1.0.1 should set
GIT_ANNEX_LOCAL_FEATURES=-DWITH_OLD_YESOD
2012-09-28 15:59:06 -04:00
Joey Hess
087781fb05 Always do a system wide installation when DESTDIR is set. Closes: #689052 2012-09-28 15:48:00 -04:00
Joey Hess
7f78bc92b6 webapp: Avoid crashing when ssh-keygen -F chokes on an invalid known_hosts file. 2012-09-27 11:27:16 -04:00
Joey Hess
17708dd173 add a configurator for S3 2012-09-26 14:44:07 -04:00
Joey Hess
e4bf74a965 store S3 creds in a 600 mode file inside the local git repo 2012-09-26 14:42:32 -04:00
Joey Hess
926ffaf3f3 Fix fallback to ~/Desktop when xdg-user-dir is not available. Closes: #688833
Really the fix here is to make Utility.Process only throw IOErrors,
which is what I naturally assumed it'd throw.
2012-09-25 22:48:17 -04:00
Joey Hess
84d431a679 rename option 2012-09-25 19:43:33 -04:00
Joey Hess
3e297e99a3 fsck: New --incremental-restart option which is nice for scheduling eg, monthly incremental fsck runs in cron jobs. 2012-09-25 19:37:34 -04:00
Joey Hess
f0e0d17440 New --time-limit option, makes long git-annex commands stop after a specified amount of time. 2012-09-25 16:48:24 -04:00
Joey Hess
ec65584c53 changelog 2012-09-25 15:10:35 -04:00
Joey Hess
bc83179a76 Test that uuid -m works, falling back to plain uuid if not. 2012-09-25 10:48:20 -04:00
Joey Hess
40df26757a copy: avoid updating location log when no copy is performed
git annex copy --to remote often does not need to copy a file,
but it was still updating the location log in this case.
2012-09-24 19:58:34 -04:00
Joey Hess
300a4ebade releasing version 3.20120924 2012-09-24 15:20:28 -04:00
Joey Hess
d77ff5dadd changelog and minor cleanup to fix mixed spaces/tabs 2012-09-23 15:42:05 -04:00
Joey Hess
ee8789e9d7 changelog updates 2012-09-21 21:37:31 -04:00
Joey Hess
601ee470af sync: Pushes the git-annex branch to remote/synced/git-annex, rather than directly to remote/git-annex.
This fixes a problem I was seeing in the assistant where two remotes would
attempt to sync with one another at the same time, and both failed pushing
the diverged git-annex branch. Then when both tried to resolve the failed
push, they each modified their git-annex branch, which again each blocked
the other from pushing into it. The result was that the git-annex
branches were perpetually diverged (despite having the same content!) and
once the assistant fell into this trap, it couldn't get out and always
had to do the slow push/fail/pull/merge/push/fail cycle.
2012-09-16 17:54:12 -04:00
Joey Hess
0b12db64d8 Avoid crashing on encoding errors in filenames when writing transfer info files and reading from checksum commands. 2012-09-16 01:53:06 -04:00
Joey Hess
48fd1e629c reinject: When the provided file doesn't match, leave it where it is, rather than moving to .git/annex/bad/ 2012-09-16 01:17:48 -04:00
Joey Hess
da63b7e96c Support repositories created with --separate-git-dir. Closes: #684405 2012-09-15 22:40:04 -04:00
Joey Hess
7f45baee5e migrate: Check content before generating the new key, to avoid generating a key for corrupt data. 2012-09-14 00:18:18 -04:00
Joey Hess
5573911d25 Disable ssh connection caching if the path to the control socket would be too long (and use relative path to minimise path to the control socket). 2012-09-13 19:26:39 -04:00
Joey Hess
3724344461 SHA256E is new default backend
The default backend used when adding files to the annex is changed from
SHA256 to SHA256E, to simplify interoperability with OSX, media players,
and various programs that needlessly look at symlink targets.

To get old behavior, add a .gitattributes containing: * annex.backend=SHA256
2012-09-12 13:22:16 -04:00
Joey Hess
d9d16622b9 test: Set a lot of git environment variables so testing works in strange environments that normally need git config to set names, etc. Closes: #682351 Thanks, gregor herrmann 2012-09-06 15:06:48 -04:00
Joey Hess
b12db9ef92 Merge branch 'master' into assistant
Conflicts:
	debian/changelog

Updated changelog for assistant and webapp
2012-08-27 13:31:54 -04:00
Joey Hess
0ef7028077 releasing version 3.20120825 2012-08-25 10:27:59 -04:00
Joey Hess
b985e0b7ec Bugfix: Fix fsck in SHA*E backends, when the key contains composite extensions, as added in 3.20120721. 2012-08-24 12:17:21 -04:00
Joey Hess
1f83dafc7e Bugfix: Fix fsck in SHA*E backends, when the key contains composite extensions, as added in 3.20120721. 2012-08-24 12:16:17 -04:00
Joey Hess
dcd208513d Merge branch 'master' into assistant
Conflicts:
	debian/changelog
2012-08-17 08:22:43 -07:00
Joey Hess
fe8fee235b Pass --use-agent to gpg when in no tty mode. Thanks, Eskild Hustvedt. 2012-08-17 08:22:11 -07:00
Joey Hess
cbca93cf7c Merge branch 'master' into assistant
Conflicts:
	debian/changelog
2012-08-16 16:36:32 -07:00
Joey Hess
2e1f3a86ae Merge branch 'master' into assistant
Conflicts:
	debian/changelog
2012-08-09 14:03:40 -04:00
Joey Hess
ad4e152fd6 S3: Add fileprefix setting. 2012-08-09 13:54:54 -04:00
Joey Hess
d99abc1255 releasing version 3.20120807 2012-08-07 13:49:58 -04:00
Joey Hess
7e2d07484f Merge branch 'master' into assistant 2012-08-07 13:31:43 -04:00
Joey Hess
2a9077f4e9 fix transfer log cleanup crash
Avoid crashing when "git annex get" fails to download from one location,
and falls back to downloading from a second location.

The problem is that git annex get calls download recursively from within
itself if the first download attempt fails. So the first time through, it
writes a transfer info file, which is then overwritten on the second,
recursive call. Then on cleanup, it tries to delete the file twice, which
of course doesn't work.

Fixed both by not crashing if the transfer file is removed, and by
changing Get to not run download recursively like that. It's the only
thing that did so, and it just seems like a bad idea.
2012-08-07 13:30:08 -04:00
Joey Hess
0833eb43a6 Merge remote-tracking branch 'origin/master' into assistant
Conflicts:
	Init.hs
2012-08-05 15:06:44 -04:00
Joey Hess
b885c0c6c8 unused, status: Avoid crashing when run in a bare repo. 2012-08-05 15:01:26 -04:00
Joey Hess
0ca85a9428 Revert "init: If no description is provided for a new repository, one will automatically be generated, like "joey@gnu:~/foo""
This reverts commit abde98cda2.

Temporarily dropping from master, since this actually uses stuff
that's only currently available in the assistant branch. Will come back when
I merge that, and can wait..
2012-08-03 23:51:49 -04:00
Joey Hess
abde98cda2 init: If no description is provided for a new repository, one will automatically be generated, like "joey@gnu:~/foo" 2012-08-03 10:45:18 -04:00
Joey Hess
13e9b275dd initremote: Avoid recording remote's description before checking that its config is valid. 2012-07-27 21:05:27 -04:00
Joey Hess
b902a2960c releasing version 3.20120721 2012-07-21 17:01:19 -04:00
Joey Hess
f5f8879471 map: Write map.dot to .git/annex, which avoids watch trying to annex it. 2012-07-17 12:27:06 -04:00
Joey Hess
5a753a7b8a SHAnE backends are now smarter about composite extensions, such as .tar.gz Closes: #680450 2012-07-05 16:24:02 -06:00
Joey Hess
40729e7fa2 Use SHA library for files less than 50 kb in size, at which point it's faster than forking the more optimised external program. 2012-07-04 13:04:01 -04:00
Joey Hess
1da79ea61f When shaNsum commands cannot be found, use the Haskell SHA library (already a dependency) to do the checksumming. This may be slower, but avoids portability problems.
Using Crypto's version of the hashes would be another option.
I need to benchmark it. The SHA2 library (which provides SHA1 also,
confusing name) may be the fastest option, but is not currently in Debian.
2012-07-04 09:11:36 -04:00
Joey Hess
760e028dca pass associatedfile and remoteuuid to git-annex-shell
This *almost* works.

Along the way, I noticed that the --uuid parameter was being accidentally
passed after the --, so that has never been actually used by
git-annex-shell to verify it's running in the expected repository. Oops. Fixed.
2012-07-02 10:57:51 -04:00
Joey Hess
7225c2bfc0 record transfer information on local git remotes
In order to record a semi-useful filename associated with the key,
this required plumbing the filename all the way through to the remotes'
storeKey and retrieveKeyFile.

Note that there is potential for deadlock here, narrowly avoided.
Suppose the repos are A and B. A sends file foo to B, and at the same
time, B gets file foo from A. So, A locks its upload transfer info file,
and then locks B's download transfer info file. At the same time,
B is taking the two locks in the opposite order. This is only not a
deadlock because the lock code does not wait, and aborts. So one of A or
B's transfers will be aborted and the other transfer will continue.
Whew!
2012-07-01 17:15:11 -04:00
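The deadlock avoidance described in 7225c2bfc0 (take each lock without waiting, and abort if it is busy) can be sketched as follows. This is a minimal illustration only: it uses Python's `fcntl.flock` rather than whatever locking primitive git-annex actually uses, and the names `try_lock` and `transfer` are hypothetical.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive lock without waiting.

    Returns the open file descriptor on success, or None if the lock
    is already held elsewhere.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def transfer(first_info, second_info):
    """Take two transfer-info locks in order; abort if either is busy."""
    a = try_lock(first_info)
    if a is None:
        return False
    b = try_lock(second_info)
    if b is None:
        os.close(a)  # closing the descriptor releases the lock
        return False
    try:
        return True  # the actual file transfer would happen here
    finally:
        os.close(b)
        os.close(a)
```

Because neither side ever waits while holding a lock, two peers taking the locks in opposite order cannot block each other forever; at worst one or both attempts abort and can be retried.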
Joey Hess
e5fd8b67b7 get, move, copy: Now refuse to do anything when the requested file transfer is already in progress by another process.
Note this is per-remote, so trying to get the same file from multiple
remotes can still let duplicate downloads run. (And uploading the same file
to multiple remotes is not duplicate at all of course.)

get, move, and copy are the only git-annex subcommands that transfer
files, but there's still git-annex-shell recvkey and sendkey to deal with too.

I considered modifying retrieveKeyFile or getViaTmp, but they are called
by other code that does not involve expensive file transfers (migrate)
or that does file transfers that should not be checked by this (fsck --from).
2012-07-01 17:15:11 -04:00
Joey Hess
61786c52ad releasing version 3.20120629 2012-06-29 14:03:03 -04:00
Joey Hess
048b64024a sync: Automatically resolves merge conflicts.
untested, but it compiles :)
2012-06-27 13:08:32 -04:00
Joey Hess
6f45827fe0 git-config fileEncoding
Accept arbitrarily encoded repository filepaths etc when reading git config
output. This fixes support for remotes with unusual characters in their
names.

For example, a remote with a url of /tmp/çüş was previously
skipped, because the filename wasn't encoded right so it didn't think it
was available. And when setting the annex-uuid of a remote named "çüş",
it used to add it under a mis-encoded form of the remote's name. Both these
cases now work ok in my testing.
2012-06-26 23:07:11 -04:00
Joey Hess
1093d82f6b Got rid of the last place that did utf8 decoding.
Probably fixes bugs/git-annex:_Cannot_decode_byte___39____92__xfc__39__/
although I don't know how to reproduce that bug.
2012-06-26 22:58:44 -04:00
Joey Hess
7e62e57f8c Avoid ugly failure mode when moving content from a local repository that is not available.
Prelude.undefined error message was introduced by
bb4f31a0ee.

It seems best to filter out local repositories that cannot be accessed
from the list of remotes, rather than keeping them in and making every
thing that uses the list have to deal with remotes that may have an unknown
location.

Besides fixing the error message, this also makes unavailable local
remotes' names not be shown in various messages, including in git annex
status output.

Also, move --to an unavailable local repository now avoids some ugly
errors like "changeWorkingDirectory: does not exist".
2012-06-26 17:22:44 -04:00
Joey Hess
41fcb3d852 Version build dependency on STM, and allow building without it, which disables the watch command. 2012-06-26 09:15:47 -04:00
Joey Hess
cede7bdcde cabal: Only try to use inotify on Linux. 2012-06-25 11:38:42 -04:00
Joey Hess
a0952dd0f9 releasing version 3.20120624 2012-06-24 12:51:18 -04:00
Joey Hess
c79e3b67e9 sync: Avoid recent git's interactive merge. 2012-06-23 10:22:56 -04:00
Joey Hess
88e26046d7 typo 2012-06-20 15:27:54 -04:00
Joey Hess
483b1b08c6 Merge branch 'master' into watch 2012-06-20 13:15:59 -04:00
Joey Hess
dfccee2616 unused: Fix crash when file names contain invalid utf8.
Was decoding the git-cat-file of the symlink target as utf8, but that can't
do, unix filenames are from the 70's and need this shiny disco
fileSystemEncoding.
2012-06-20 12:57:00 -04:00
Joey Hess
7a09d74319 lifted out the kqueue and inotify to a generic DirWatcher interface
Kqueue code for dispatching events is not tested and probably doesn't
build.
2012-06-18 23:49:07 -04:00
Joey Hess
66344a3613 Enable diskfree on kfreebsd, using statvfs.
Could not reproduce the build failure I had seen related to this,
but the numbers were wrong with statfs64. Probably pulling from the wrong
place in the structure. statvfs seems to work..
2012-06-17 18:10:57 -04:00
Joey Hess
53d2e81ffd Merge branch 'master' into watch 2012-06-15 15:20:11 -04:00
Joey Hess
8492f1c182 releasing version 3.20120614 2012-06-14 20:32:06 -04:00
Joey Hess
ca9d94a0ad addurl: Was broken by a typo introduced 2 releases ago, now fixed. Closes: #677576 2012-06-14 20:20:03 -04:00
Joey Hess
2e5ea30981 Merge branch 'master' into watch
Conflicts:
	debian/changelog
	git-annex.cabal
2012-06-12 13:37:17 -04:00
Joey Hess
0e944fd0e9 Install man page when run by cabal, in a location where man will find it, even when installing under $HOME. Thanks, Nathan Collins 2012-06-12 11:36:42 -04:00
Joey Hess
0847a300fc Revert "Build with ghc's threaded runtime, so threaded code does not busy-wait."
This reverts commit 129f6123fe.

Saw hang during batch add with -threaded, so deferred for now.
2012-06-11 12:46:35 -04:00
Joey Hess
129f6123fe Build with ghc's threaded runtime, so threaded code does not busy-wait.
Sort of a work around for http://bugs.debian.org/677096
2012-06-11 12:21:18 -04:00
Joey Hess
a5a3cd55ac Merge branch 'master' into watch
Conflicts:
	debian/changelog
2012-06-11 12:13:07 -04:00
Joey Hess
7f70767bfb uninit: Refuse to run in a subdirectory. Closes: #677076 2012-06-11 10:33:58 -04:00
Joey Hess
727158ff55 Merge branch 'master' into watch 2012-06-07 13:48:55 -04:00
Joey Hess
4d1c114e4d initremote: Automatically describe a remote when creating it.
This ensures that all special remotes show up in git annex status.
Before, a special remote that was not manually described, and was not
a current git remote, did not show up there, although initremote did list
it.
2012-06-07 11:16:48 -04:00
Joey Hess
c56812980c document watch 2012-06-06 23:28:33 -04:00
Joey Hess
c981ccc077 add: Prevent (most) modifications from being made to a file while it is being added to the annex.
Anything that tries to open the file for write, or delete the file,
or replace it with something else, will not affect the add.

Only if a process has the file open for write before add starts
can it still change it while (or after) it's added to the annex.
(fsck will catch this later of course)
2012-06-05 20:28:34 -04:00
Joey Hess
8511957c68 releasing version 3.20120605 2012-06-05 14:14:45 -04:00
Joey Hess
13118136c0 Preserve parent environment when running hooks of the hook special remote. 2012-06-04 21:52:36 -04:00
Joey Hess
2183fd2abd Require that the SHA256 backend can be used when building, since it's the default. 2012-05-31 23:15:40 -04:00
Joey Hess
6fd83851c1 Fix display of warning message when encountering a file that uses an unsupported backend. 2012-05-31 21:03:24 -04:00
Joey Hess
3a10095d40 import: New subcommand, pulls files from a directory outside the annex and adds them
Use case for this was developed somewhere on the Trans-Siberian Railroad.
2012-05-31 19:47:18 -04:00
Joey Hess
65977a5584 lock: Reset unlocked file to index, rather than to branch head.
Resetting an unlocked file to the branch head failed if it had just been
added, not committed, and unlocked, since the branch didn't have it.

The code was concerned about dropping any changes that might be staged in the
index, but I cannot see why.
2012-05-30 17:01:22 -04:00
Joey Hess
6e213d04f1 sync: Show a nicer message if a user tries to sync to a special remote. 2012-05-27 20:55:56 -04:00
Joey Hess
ab07762ddb releasing version 3.20120522 2012-05-22 11:27:22 -04:00
Joey Hess
eb6cb1b87f Add support for core.worktree, and fix support for GIT_WORK_TREE and GIT_DIR.
The environment needs to override git-config. Changed when git config is
read, and avoid rereading it once it's been read.

chdir for both worktree settings.
2012-05-18 18:20:53 -04:00
Joey Hess
bb4f31a0ee Clean up handling of git directory and git worktree.
Baked into the code was an assumption that a repository's git directory
could be determined by adding ".git" to its work tree (or nothing for bare
repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are
used to separate the two.

This was attacked at the type level, by storing the gitdir and worktree
separately, so Nothing for the worktree means a bare repo.

A complication arose because we don't learn whether a repository is bare
until its configuration is read. So another Location type handles
repositories that have not had their config read yet. I am not entirely
happy with this being a Location type, rather than representing them
entirely separate from the Git type. The new code is not worse than the
old, but better types could enforce more safety.

Added support for core.worktree. Overriding it with -c isn't supported
because it's not really clear what to do if a git repo's config is read, is
not bare, and is then overridden to bare. What is the right git directory
in this case? I will worry about this if/when someone has a use case for
overriding core.worktree with -c. (See Git.Config.updateLocation)

Also removed and renamed some functions like gitDir and workTree that
misused git's terminology.

One minor regression is known: git annex add in a bare repository does not
print a nice error message, but runs git ls-files in a way that fails
earlier with a less nice error message. This is because before --work-tree
was always passed to git commands, even in a bare repo, while now it's not.
2012-05-18 17:03:12 -04:00
Joey Hess
e36808e167 Pass -a to cp even when it supports --reflink=auto, to preserve permissions.
Among other things, this makes unlocking a WORM backed file and then
re-adding it without making any changes not add a new object, as the
timestamp is preserved.
2012-05-15 14:18:51 -04:00
Joey Hess
61a5df33d4 releasing version 3.20120511 2012-05-11 12:37:26 -04:00
Joey Hess
bbfa74e7ac format 2012-05-07 13:19:00 -04:00
Joey Hess
f7d8982672 Fix use of several config settings
annex.ssh-options, annex.rsync-options, annex.bup-split-options.

And adjust types to avoid the bugs that broke several config settings
recently. Now "annex." prefixing is enforced at the type level.
2012-05-05 20:16:56 -04:00
Joey Hess
392931eca9 addunused: New command, the opposite of dropunused, it relinks unused content into the git repository. 2012-05-02 14:59:05 -04:00
Joey Hess
8f45300479 dropunused: Allow specifying ranges to drop.
Sort of by popular demand, but the last straw for not using seq
was that it can run into command line length limits.
2012-05-02 13:15:19 -04:00
Joey Hess
6d61067599 rsync shellescape disable option
Rsync special remotes can be configured with shellescape=no to avoid shell
quoting that is normally done when using rsync over ssh. This is known to
be needed for certain rsync hosting providers (specifically
hidrive.strato.com) that use rsync over ssh but do not pass it through the
shell.
2012-05-02 13:08:33 -04:00
Joey Hess
76b80d6af0 releasing version 3.20120430 2012-04-30 13:59:28 -04:00
Joey Hess
1c16f616df Added shared cipher mode to encryptable special remotes.
This option avoids gpg key distribution, at the expense of flexibility, and
with the requirement that all clones of the git repository be equally
trusted.
2012-04-29 14:02:43 -04:00
Joey Hess
e0b7012ccc uninit: Clear annex.uuid from .git/config. Closes: #670639 2012-04-27 12:21:38 -04:00
Joey Hess
1db09af14c fix names 2012-04-22 11:42:38 -04:00
Joey Hess
84ac8c58db Add annex.httpheaders and annex.httpheader-command config settings
Allow custom headers to be sent with all HTTP requests.

(Requested by the Internet Archive)
2012-04-22 01:13:09 -04:00
Joey Hess
b4a5e39ee6 Support git's core.sharedRepository configuration
This is incomplete, it does not honor it yet for hash directories
and other annex bookkeeping files. Some of that is not needed for a bare
repo; some of it may be.
2012-04-21 15:36:52 -04:00
Joey Hess
5cc76098ca Directory special remotes now check annex.diskreserve. 2012-04-20 16:24:44 -04:00
Joey Hess
e807502666 had the wrong name for this 2012-04-20 16:14:29 -04:00
Joey Hess
840315c350 releasing version 3.20120418 2012-04-18 12:22:22 -04:00
Joey Hess
626697b459 cabal file now autodetects whether S3 support is available. 2012-04-14 14:22:33 -04:00
Joey Hess
1ca41044e8 cabal now installs git-annex-shell as a symlink to git-annex. 2012-04-14 14:01:14 -04:00
Joey Hess
3642c72320 Renamed diskfree.c to avoid OSX case insensitivity bug. 2012-04-13 11:26:39 -04:00
Joey Hess
52a158a7c6 autocorrection
git-annex (but not git-annex-shell) supports the git help.autocorrect
configuration setting, doing fuzzy matching using the restricted
Damerau-Levenshtein edit distance, just as git does. This adds a build
dependency on the haskell edit-distance library.
2012-04-12 15:37:21 -04:00
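For reference, the restricted Damerau-Levenshtein distance named in 52a158a7c6 (also called optimal string alignment) can be sketched as below. This is not the code of the Haskell edit-distance library; the `suggest` helper and its `max_distance` threshold are hypothetical and only show how such a distance might drive command autocorrection.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters. "Restricted"
            # means a transposed pair is never edited again afterwards.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def suggest(typed, commands, max_distance=2):
    """Return known commands within max_distance edits, closest first."""
    scored = sorted((osa_distance(typed, c), c) for c in commands)
    return [c for dist, c in scored if dist <= max_distance]
```

With these definitions, `suggest("unsued", ["unused", "uninit", "sync"])` yields only `unused`, since swapping the adjacent `s` and `u` counts as a single edit.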
Joey Hess
c924542e61 bup: Properly handle key names with spaces or other things that are not legal git refs.
Continue using the key name as bup ref name, to preserve backwards
compatibility, unless it is an illegal git ref. In that case, use a sha256
of the key name instead.
2012-04-11 12:45:49 -04:00
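The fallback in c924542e61 can be sketched roughly as follows. The ref-name check here is a simplified stand-in for git's real rules (documented under git check-ref-format), the UTF-8 encoding of the key before hashing is an assumption, and the names `is_legal_ref` and `bup_ref` are hypothetical.

```python
import hashlib
import re

# Simplified approximation of characters and sequences git rejects in
# ref names. An assumption for illustration, not the full rule set.
_ILLEGAL = re.compile(r"[\x00-\x20\x7f~^:?*\[\\]|\.\.|@\{|//")


def is_legal_ref(name):
    """Roughly check whether name could be used as a git ref name."""
    if not name or name.startswith(("/", ".")):
        return False
    if name.endswith(("/", ".", ".lock")):
        return False
    return _ILLEGAL.search(name) is None


def bup_ref(key):
    """Use the key itself when it is a legal ref, else its sha256."""
    if is_legal_ref(key):
        return key  # unchanged, so existing bup repositories keep working
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Keeping the key name whenever it is legal preserves the refs written by older versions; only keys that could never have been stored correctly get the hashed name.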
Joey Hess
182778d664 bugfix: Adding a dotfile also caused all non-dotfiles to be added.
When only a dotfile was specified, the list of non-dotfiles was empty,
triggering the fallback behavior of finding all files.
2012-04-08 12:25:54 -04:00
Joey Hess
29acf62ba3 releasing version 3.20120406 2012-04-07 15:58:13 -04:00
Joey Hess
62c69e7e25 Disable diskfree on kfreebsd, as I have a build failure on kfreebsd-i386 that is quite likely caused by it. 2012-04-07 15:50:34 -04:00
Joey Hess
16acc507f3 releasing version 3.20120405 2012-04-05 16:37:44 -04:00
Joey Hess
a398db7885 update 2012-03-24 11:58:22 -04:00
Joey Hess
e38a839a80 Rewrote free disk space checking code
Moving the portability handling into a small C library cleans up things
a lot, avoiding the pain of unpacking structs from inside haskell code.
2012-03-22 17:32:47 -04:00
Joey Hess
188e2edc41 status: Prints available local disk space, or shows if git-annex doesn't know. 2012-03-21 21:55:02 -04:00
Joey Hess
181d2ccd20 Improve detection of inability to check free disk space.
Don't check if configure indicated checks won't work. This should fix a
FTBFS on mipsel, where configure correctly detects the checks won't work,
while garbage is returned for disk space info at git-annex runtime. It also
means that, when built via cabal, disk space checks are not enabled,
unfortunately.
2012-03-21 21:21:20 -04:00
Joey Hess
d2769cf795 shave some 12 mb from the installed size
* git-annex now behaves as git-annex-shell if symlinked to and run by that
  name. The Makefile sets this up, saving some 8 mb of installed size.
* git-union-merge is a demo program, so it is no longer built by default.
2012-03-15 12:00:19 -04:00
Joey Hess
a4f72c9625 update 2012-03-14 12:44:17 -04:00
Joey Hess
342fc28437 Merge branch 'master' into bloom
Conflicts:
	Command/Commit.hs
	debian/changelog
2012-03-14 12:41:48 -04:00
Joey Hess
5b869eef91 git-annex-shell: Runs hooks/annex-content after content is received or dropped. 2012-03-14 12:18:10 -04:00
Joey Hess
caf97fcffd git-annex-shell: Runs hooks/annex-content after content is received or dropped. 2012-03-14 12:01:56 -04:00
Joey Hess
b27760aa68 Work around a bug in rsync (IMHO) introduced by openSUSE's SIP patch.
openSUSE patches rsync with a patch adding SIP protocol support.
https://gist.github.com/2026167

With this patch, running rsync with no hostname parameter is apparently
supposed to list SIP hosts on the network. Practically, it does nothing
and exits 0.

git-annex uses rsync in a very special way to allow git-annex-shell to be
run on the remote host, and so did not need to specify a hostname, or a
file to transfer as a rsync parameter. So it sent ":", a degenerate case of
"host:file".

But the patch cannot differentiate ":" from no host parameter
(a bug in the SIP patch surely).

Results were that getting files failed, as rsync seemed to succeed, but the
requested file failed to arrive. Also I think that sending files will
make git-annex think a file has been transferred to the remote when
really rsync does nothing.

The workaround for this buggy rsync patch is to use "dummy:" as the
hostname.
2012-03-12 22:53:43 -04:00
Joey Hess
94aff8b878 Merge branch 'master' into bloom
Conflicts:
	debian/changelog
2012-03-12 16:32:29 -04:00
Joey Hess
25809ce2e0 finish bloom filters
Add tuning, docs, etc.

Not sure if status is the right place to report size.. perhaps unused
should report the size and also warn if it sees more keys than the bloom
filter allows?
2012-03-12 16:18:35 -04:00
Joey Hess
89ee70c43a status: More accurate display of sizes of tmp and bad keys.
Can't trust the key size to be accurate for tmp and bad keys, so check
actual file size. In the wild I saw the old code be wrong by a factor
of about 100!

If all tmp/bad keys are empty, they're not shown in status at all.
Showing 0 bytes and suggesting to clean it up seemed weird..
2012-03-12 00:41:48 -04:00
Joey Hess
b325694645 getKeysPresent is now fully lazy
.. Allowing it to be used by things in constant space!

Random statistics: git annex status has gone from taking 239 mb
of memory and 26 seconds in a repo, to 8 mb and 13 seconds.

The trick here is the unsafeInterleaveIO, and the form of the function's
recursion, which I cribbed heavily from System.IO.HVFS.Utils.recurseDirStat.
The difference is, this one goes to a limited depth and avoids statting
everything.
2012-03-11 18:04:58 -04:00
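The idea in b325694645 (a lazy traversal to a fixed depth that avoids statting everything) can be illustrated with a generator. This is only an analogy in Python, not a translation of the unsafeInterleaveIO-based Haskell code; the function name `keys_present` and the assumption that keys sit two directory levels below the objects directory are illustrative.

```python
import os


def keys_present(objects_dir, depth=2):
    """Lazily yield the names found `depth` directories below objects_dir.

    Nothing is listed until the caller asks for the next item, so memory
    use stays flat however many keys exist.
    """
    def walk(path, remaining):
        with os.scandir(path) as entries:
            for entry in entries:
                if remaining == 0:
                    # Deep enough: report the name and do not descend.
                    yield entry.name
                elif entry.is_dir(follow_symlinks=False):
                    yield from walk(entry.path, remaining - 1)
    return walk(objects_dir, depth)
```

A caller can then fold over the stream, for example `sum(1 for _ in keys_present(path))`, without ever holding the full list of keys.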
Joey Hess
ff3644ad38 status: Fixed to run in nearly constant space.
Before, it leaked space due to caching lists of keys. Now all necessary
data about keys is calculated as they stream in.

The "nearly constant" is due to getKeysPresent, which builds up a lot
of [] thunks as it traverses .git/annex/objects/. Will deal with it later.
2012-03-11 17:15:58 -04:00
Joey Hess
b086e32c63 unused: Reduce memory usage significantly.
Much of the memory bloat turned out to be due to getKeysReferenced
containing a mapM, which is strict and buffered the whole list
rather than streaming it.

The other half of the bloat was due to building a temporary Set
in order to call S.difference. While that is more cpu efficient,
I switched to successive S.delete, since with it, I can run a whole
git annex unused in less than 8 mb of memory.

The whole Set of keys with content available is still stored in memory,
so running unused in a repo with a whole lot of file content will still
use more memory. In a repo containing 6000 files, it needed 40 mb.

Note that the status command still uses the bloatful getKeysReferenced.
2012-03-11 16:24:07 -04:00
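The memory trade-off described in b086e32c63 can be sketched in a few lines: keep one set of present keys and remove referenced keys from it as they stream past, instead of first collecting the referenced keys into a second set. The function names and the way a key is extracted from a symlink target are assumptions for illustration.

```python
def referenced_keys(link_targets):
    """Yield the key named by each symlink target, one at a time."""
    for target in link_targets:
        yield target.rsplit("/", 1)[-1]


def unused_keys(present, referenced):
    """Return the present keys that no streamed reference mentions."""
    remaining = set(present)    # the only set held in memory
    for key in referenced:      # consumed lazily, never materialised
        remaining.discard(key)  # analogous to successive S.delete
    return remaining
```

A bulk set difference would need both sets in memory at once; the streaming form trades a little CPU for a peak that depends only on the number of present keys.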
Joey Hess
997e29f294 sync: Sync to lower cost remotes first.
This has two benefits.

1. When a lot of refs are going to be received, get them via lower cost
   connection when possible.
2. Allows ctrl-c of sync after the cheaper remotes have been pulled from
   (or pushed to).
2012-03-10 15:37:38 -04:00
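The ordering from 997e29f294 amounts to a sort on remote cost before syncing. A minimal sketch, in which the `Remote` type, the remote names, and the cost values are all made up for illustration:

```python
from typing import NamedTuple


class Remote(NamedTuple):
    name: str
    cost: int  # lower means cheaper to talk to


def sync_order(remotes):
    """Return remotes cheapest first; equal costs keep their given order."""
    return sorted(remotes, key=lambda r: r.cost)
```

Interrupting a sync loop over `sync_order(remotes)` part way through then leaves the cheaper remotes already handled.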
Joey Hess
5ab82230f7 fsck: Fix up any broken links and misplaced content caused by the directory hash calculation bug fixed in the last release. 2012-03-10 14:46:21 -04:00
Joey Hess
433b5fe59e releasing version 3.20120309 2012-03-09 20:14:34 -04:00
Joey Hess
bca3fd65b9 fix key directory hash calculation code
Fix Key directory hash calculation code to behave as it did before version
3.20120227 when a key contains non-ascii.

The hash directories for a given Key are based on its md5sum.
Prior to ghc 7.4, Keys contained raw, undecoded bytes, so the md5sum was
taken of each byte in turn. With the ghc 7.4 filename encoding change,
keys contain decoded unicode characters (possibly with surrogates for
undecodable bytes). This changes the result of the md5sum, since the md5sum
used is pure haskell and supports unicode. And that won't do, as git-annex
will start looking in a different hash directory for the content of a key.

The surrogates are particularly bad, since that's essentially a ghc
implementation detail, so could change again at any time. Also, changing
the locale changes how the bytes are decoded, which can also change
the md5sum.

Symptoms would include things like:

* git annex fsck would complain that no copies existed of a file,
  despite its symlink pointing to the content that was locally present
* git annex fix would change the symlink to use the wrong hash
  directory.

Only WORM backend is likely to have been affected, since only it tends
to include much filename data (SHA1E could in theory also be affected).

I have not tried to support the hash directories used by git-annex versions
3.20120227 to 3.20120308, so things added with those versions with WORM
will require manual fixups. Sorry for the inconvenience!
2012-03-09 20:03:51 -04:00
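The effect described in bca3fd65b9 can be reproduced in miniature. This Python sketch is an analogy, not the Haskell code: the key name is invented, `hash_dirs` uses a simplified two-level layout that is only an assumption, and Python's `surrogateescape` handler stands in for ghc's treatment of undecodable bytes.

```python
import hashlib

def key_md5(data: bytes) -> str:
    """md5sum of the given bytes, as hex."""
    return hashlib.md5(data).hexdigest()

def hash_dirs(data: bytes) -> str:
    """Hypothetical two-level directory taken from the start of the md5sum."""
    digest = key_md5(data)
    return digest[:2] + "/" + digest[2:4]

# An invented key containing a byte that is not valid UTF-8.
raw_key = b"WORM--caf\xe9.txt"

# Decoding the same bytes gives different characters depending on the locale.
as_utf8_locale = raw_key.decode("utf-8", "surrogateescape")  # \xe9 becomes a surrogate
as_latin1_locale = raw_key.decode("latin-1")                 # \xe9 becomes 'é'

# Hashing the decoded characters (re-encoded as UTF-8 here) no longer
# hashes the original bytes.
surrogate_bytes = as_utf8_locale.encode("utf-8", "surrogatepass")
latin1_bytes = as_latin1_locale.encode("utf-8")

# Encoding back with the same error handler recovers the raw bytes,
# which is what the hash has to be computed over to stay stable.
restored = as_utf8_locale.encode("utf-8", "surrogateescape")
```

Only `key_md5(restored)` equals `key_md5(raw_key)`; the other two inputs are different byte strings, so their digests, and in general the directories derived from them, differ.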
Joey Hess
0d41899304 releasing version 3.20120230 2012-03-05 13:47:20 -04:00
Joey Hess
51338486dc Fix a bug in symlink calculation code, that triggered in rare cases where an annexed file is in a subdirectory that nearly matched the .git/annex/objects/xx/yy subdirectories.
This is a straight up pure-code stinker. The relative path calculation
looked for common subdirectories in the two paths, but failed to stop
after the paths diverged. When a later pair of subdirectories were the
same, the resulting relative path was wrong.

Added regression test for this.
2012-03-05 12:42:52 -04:00
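The class of bug fixed in 51338486dc is easy to reproduce. The Python sketch below is a reconstruction under assumptions, not the original Haskell: `buggy_common_len` counts matching components at any position, while `common_prefix_len` stops at the first mismatch. All names and paths are illustrative.

```python
from itertools import takewhile

def common_prefix_len(a, b):
    """Count leading components shared by both paths, stopping at the
    first position where they diverge."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b)))

def buggy_common_len(a, b):
    """Counts every position where the components match, even after the
    paths have diverged (the mistake being illustrated)."""
    return sum(1 for x, y in zip(a, b) if x == y)

def relative_path(from_dir, to_path, common_len=common_prefix_len):
    """Path from directory from_dir to to_path, both relative to the same top."""
    a = from_dir.split("/")
    b = to_path.split("/")
    n = common_len(a, b)
    return "/".join([".."] * (len(a) - n) + b[n:])
```

With a file in `x/annex/sub`, the second component happens to match the `annex` in `.git/annex/objects/aa/bb/KEY`, and the buggy count produces a link that never reaches `.git`.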
Joey Hess
52e88f3ebf add remote start and stop hooks
Locking is used, so that, if there are multiple git-annex processes
using a remote concurrently, the stop hook is only run by the last
process that uses it.
2012-03-04 19:12:58 -04:00
Joey Hess
9856c24a59 Add progress bar display to the directory special remote.
So far I've only written progress bars for sending files, not yet
receiving.

No longer uses external cp at all. ByteString IO is fast enough.
2012-03-04 03:17:25 -04:00
Joey Hess
3436aba6de Directory special remotes now support chunking files written to them
Avoiding writing files larger than a specified size is useful on certain
things. For example, box.com has a file size limit of 100 mb. Could also
be useful on really crappy removable media.
2012-03-03 18:05:55 -04:00
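The idea behind the chunking in 3436aba6de can be shown with a short Python sketch. It only covers splitting content into pieces no larger than a limit; how git-annex names and records the chunks is not covered here, and `chunks` is a hypothetical helper.

```python
from typing import Iterator

def chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive pieces of data, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
```

Concatenating the pieces in order gives back the original content.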
Joey Hess
1098bc37ab "here" can be used to refer to the current repository, which can read better than the old "." (which still works too). 2012-03-01 22:35:10 -04:00
Joey Hess
6571831b92 releasing version 3.20120229 2012-02-29 02:39:44 -04:00
Joey Hess
e5fee3f352 Fix test suite to not require a unicode locale.
Without a unicode locale, it fails to print a unicode filename to the
console, and so the test fails.
2012-02-29 02:32:05 -04:00
Joey Hess
8cae4115a8 releasing version 3.20120227 2012-02-27 13:07:04 -04:00
Joey Hess
2fd294d06f move --from, copy --from: 10 times faster scanning remote on local disk
Rather than go through the location log to see which files are present on
the remote, it simply looks at the disk contents directly.

I benchmarked this speeding up scanning 834 files, from an annex on my
phone's SSD, from 11.39 seconds to 1.31 seconds. (No files actually moved.)

Also benchmarked 8139 files, from an annex on spinning storage,
speeding up from 103.17 to 13.39 seconds.

Note that benchmarking with an encrypted annex on flash actually showed a
minor slowdown with this optimisation -- from 13.93 to 14.50 seconds. Seems
the overhead of doing the crypto needed to get the filenames to directly
check can be higher than the overhead of looking up data in the location
log. (Which says good things about how well the location log and git have
been optimised!) It *may* make sense to make encrypted local remotes not
have hasKeyCheap set; further benchmarking is called for.
2012-02-26 14:59:48 -04:00
Joey Hess
12b89a3eb8 configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly. 2012-02-25 19:15:29 -04:00
Joey Hess
c3fbe07d7a do a cleanup commit after moving data from or to a git remote
Added Annex.cleanup, which is a general purpose interface for adding
actions to run at the end.

Remotes with the old git-annex-shell will commit every time, and have no
commit command, so hide stderr when running the commit command.
2012-02-25 18:02:49 -04:00
Joey Hess
1f73db3469 improve alwayscommit=false mode
Now changes are staged into the branch's index, but not committed,
which avoids growing a large journal. And sync and merge always
explicitly commit, ensuring that even when they do nothing else,
they commit the staged changes.

Added a flag file to indicate that the branch's journal contains
uncommitted changes. (Could use git ls-files, but don't want to run
that every time.)

In the future, this ability to have uncommitted changes staged in the
journal might be used on remotes after a series of oneshot commands.
2012-02-25 16:18:55 -04:00
Joey Hess
b49c0c2633 add annex.alwayscommit option
To avoid commits of data to the git-annex branch after each command
is run, set annex.alwayscommit=false. Its data will then be committed
less frequently, when a merge or sync is done.
2012-02-25 15:31:42 -04:00
Joey Hess
bd66f962d3 Deal with NFS problem that caused a failure to remove a directory when removing content from the annex.
I was able to reproduce this on linux using the kernel's nfs server and
mounting localhost:/. Determined that removing the directory fails when
the just-deleted file in it was locked. Considered dropping the lock
before removing the directory, but this would complicate parts of the code
that should not need to worry about locking. So instead, ignore the failure
to remove the directory in this case.

While I was at it, made it attempt to remove both levels of hash
directories, in case they're empty.
2012-02-24 16:30:47 -04:00
Joey Hess
5bf07b3b5c Store web special remote url info in a more efficient location.
Storing it in remotes/web/xx/yy/foo.log meant lots of extra directory
objects in git. Now I use xx/yy/foo.log.web, which is just as unique, but
more efficient since foo.log is there anyway.

Of course, it still looks in the old location too.
2012-02-17 23:15:29 -04:00
Joey Hess
db6b4cdfcf rekey: New plumbing level command, can be used to change the keys used for files en masse. 2012-02-16 16:36:35 -04:00
Joey Hess
aeaaa0ff87 reorder 2012-02-16 15:07:59 -04:00
Joey Hess
39c3f56b33 addurl: Add --pathdepth option. 2012-02-16 12:25:19 -04:00
Joey Hess
63152428e9 changelog 2012-02-15 17:33:21 -04:00
Joey Hess
52c5b164d8 Added an annex.queuesize setting
useful when adding hundreds of thousands of files on a system with plenty
of memory.

git add gets quite slow in such a large repository, so if the system has
more than the ~32 mb of memory the queue can use by default, it's a useful
optimisation to increase the queue size, in order to decrease the number
of times git add is run.
2012-02-15 11:14:19 -04:00
Joey Hess
7ebd98d8d8 fix memory leak when staging the journal
The list of files had to be retained until the end so it could be deleted.
Also, a list of update-index lines was generated and only then fed into it.
Now everything streams in constant space.
2012-02-14 14:37:59 -04:00
Joey Hess
a40ec5e03e Fixed a memory leak due to excessive strictness when committing journal files.
When hashing the files, the entire list of shas was read strictly.
That was entirely unnecessary, since there's a cleanup action run
after they're consumed.
2012-02-14 11:20:34 -04:00
Joey Hess
cb631ce518 whereis: Prints the urls of files that the web special remote knows about. 2012-02-14 03:49:48 -04:00
Joey Hess
59b2adea4f changelog for a964012fc3
Turns out that commit really made some serious improvements to memory use.
With the lazy state monad, git-annex add in a huge tree grew seemingly
without bound until it overflowed the stack. With the strict monad,
it uses 42 mb max.

It's possible another change since the 3.20120123 release fixed that,
but a964012fc3 seems most likely.
2012-02-13 16:58:58 -04:00
Joey Hess
17fed709c8 addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key. 2012-02-10 19:23:46 -04:00
Joey Hess
9030f68452 When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key.
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.
2012-02-10 19:23:41 -04:00
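The rule in 9030f68452 reduces to a small predicate. A minimal Python sketch with a hypothetical function name; sizes are in bytes and `None` means the value is unavailable:

```python
from typing import Optional

def url_matches_key(key_size: Optional[int], content_length: Optional[int]) -> bool:
    """Compare sizes only when both are known; otherwise skip the check."""
    if key_size is None or content_length is None:
        return True  # check not done, so no mismatch is reported
    return key_size == content_length
```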
Joey Hess
d55f3c0716 Fix teardown of stale cached ssh connections. 2012-02-09 21:49:46 -04:00
Joey Hess
1c0bd81ba6 addurl: Normalize badly encoded urls. 2012-02-09 14:19:58 -04:00
Joey Hess
ef013506cb addurl: Added a --file option
Can be used to specify what file the url is added to. This can be used to
override the default filename that is used when adding an url, which is
based on the url. Or, when the file already exists, the url is recorded as
another location of the file.
2012-02-08 15:35:29 -04:00
Joey Hess
57a747d081 S3: Fix irrefutable pattern failure when accessing encrypted S3 credentials. 2012-02-08 11:41:15 -04:00
Joey Hess
995bf51e10 correction 2012-02-07 16:52:39 -04:00
Joey Hess
3f4f96228e changelog 2012-02-06 20:42:49 -04:00
Joey Hess
b81d662cbf Avoid repeated location log commits when a remote is receiving files.
Done by adding a oneshot mode, in which location log changes are written to
the journal, but not committed. Taking advantage of git-annex's existing
ability to recover in this situation.

This is used by git-annex-shell and other places where changes are made to
a remote's location log.
2012-01-28 15:41:52 -04:00
Joey Hess
ce5637498f remove Utility.Conditional and use IfElse
This drops the >>! and >>? with the nice low fixity. IfElse does have
undocumented >>=>>! and >>=>>? operators, but I deem that too fishy.
Anyway, using whenM and unlessM is easier; I sometimes mixed the operators
up.
2012-01-24 16:22:07 -04:00
Joey Hess
20d0288802 releasing version 3.20120123 2012-01-23 15:09:50 -04:00
Joey Hess
47250a153a ssh connection caching
Ssh connection caching is now enabled automatically by git-annex. Only one
ssh connection is made to each host per git-annex run, which can speed some
things up a lot, as well as avoiding repeated password prompts. Concurrent
git-annex processes also share ssh connections. Cached ssh connections are
shut down when git-annex exits.

Note: The rsync special remote does not yet participate in the ssh
connection caching.
2012-01-20 17:14:56 -04:00
Joey Hess
61dbad505d fsck --from remote --fast
Avoids expensive file transfers, at the expense of checking file size
and/or contents.

Required some reworking of the remote code.
2012-01-20 13:23:11 -04:00
Joey Hess
90319afa41 fsck --from
Fscking a remote is now supported. It's done by retrieving
the contents of the specified files from the remote, and checking them,
so can be an expensive operation.

(Several optimisations are possible, to speed it up, of course. This is
the slow and stupid remote fsck to start with.)

Still, if the remote is a special remote, or a git repository that you
cannot run fsck in locally, it's nice to have the ability to fsck it.

If you have any directory special remotes, now would be a good time to
fsck them, in case you were hit by the data loss bug fixed in the
previous release!
2012-01-19 15:24:05 -04:00
Joey Hess
2837e8fef1 releasing version 3.20120116 2012-01-16 16:52:26 -04:00
Joey Hess
f161b5eb59 Fix data loss bug in directory special remote
When moving a file to the remote failed, and partially transferred content
was left behind in the directory, re-running the same move would think it
succeeded and delete the local copy.

I reproduced data loss when moving files to a partition that was almost
full. Interrupting a transfer could have similar results.

Easily fixed by using a temp file which is then moved atomically into place
once the transfer completes.

I've audited other calls to copyFileExternal, and other special remote
file transfer code; everything else seems to use temp files correctly
(rsync, git), or otherwise use atomic transfers (bup, S3).
2012-01-16 16:28:15 -04:00
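The fix in f161b5eb59 follows the usual write-then-rename pattern. The Python sketch below shows that pattern under assumptions: `store_atomically` is a hypothetical helper, and the temp file is created in the destination directory so that the final rename stays on one filesystem.

```python
import os
import tempfile

def store_atomically(data: bytes, dest: str) -> None:
    """Write data to a temp file beside dest, then rename it into place.
    A failed or interrupted write never leaves a partial file at dest."""
    dest_dir = os.path.dirname(dest) or "."
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, dest)  # atomic when tmp and dest share a filesystem
    except BaseException:
        # Clean up the partial temp file; dest is left untouched.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A later check for the file at `dest` then only succeeds if a transfer ran to completion.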
Joey Hess
ce608303a3 releasing version 3.20120115 2012-01-15 14:02:32 -04:00
Joey Hess
37b5b1bf0d Fix QuickCheck dependency in cabal file. 2012-01-15 13:53:51 -04:00
Joey Hess
81856c3175 add a configure check for StatFS
This way, the build log will indicate whether StatFS can be relied on.
I've tested all the failing architectures now, and on all of them,
the StatFS code now returns Nothing, rather than Just nonsense.

Also, if annex.diskreserve is set on a platform where StatFS is not
working, git-annex will complain.

Also, the Makefile was missing the sources target used when building with
cabal.
2012-01-15 13:49:32 -04:00
Joey Hess
0eed604446 Add a sanity check for bad StatFS results.
git-annex FTBFS on s390, mips, powerpc, sparc. That StatFS code is failing
on all of them. At least on s390, the failure appears as:

Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 0,
fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0,
fsStatBytesUsed = 0})

While I don't understand why this is happening, or how to fix it,
bandaid over it by checking for obviously bad values and returning Nothing.
That disables disk free space checking, but at least git-annex will work.

Upstream bug: http://code.google.com/p/xmobar/issues/detail?id=70
2012-01-14 17:17:20 -04:00
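The bandaid in 0eed604446 can be pictured as a filter on the result. This Python sketch is an assumption about what "obviously bad" might mean: it treats a block count of zero as nonsense, matching the s390 output quoted in the commit. The record and function names are hypothetical, not the StatFS API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileSystemStats:
    block_size: int
    block_count: int
    bytes_free: int
    bytes_available: int

def sane_stats(stats: Optional[FileSystemStats]) -> Optional[FileSystemStats]:
    """Return None for missing or obviously bad results, which disables
    free space checking instead of acting on nonsense."""
    if stats is None or stats.block_count == 0:
        return None
    return stats
```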
Joey Hess
b88ecbdc1b Add libghc-testpack-dev to build depends on all arches. 2012-01-13 15:50:56 -04:00
Joey Hess
1ae780ee79 git-annex, git-union-merge: Support GIT_DIR and GIT_WORK_TREE.
Note that GIT_WORK_TREE cannot influence GIT_DIR; that is necessary for
git-fake-bare and vcsh type things to work.
2012-01-13 12:52:09 -04:00