Commit graph

298 commits

Author SHA1 Message Date
Joey Hess
1d2984441c add a few tweaks to make it easy to use the Internet Archive's variant of S3
In particular, munge key filenames to comply with the IA's filename limits,
disable encryption, support their nonstandard way of creating buckets, and
allow x-amz-* headers to be specified in initremote to set item metadata.

Still TODO: initremote does not handle multiword metadata headers right.
2011-05-16 11:20:35 -04:00
Joey Hess
078a6fbd76 Work around a bug in Network.URI's handling of bracketed ipv6 addresses. 2011-05-06 15:21:30 -04:00
Joey Hess
86d3205061 releasing version 0.20110503 2011-05-03 21:49:20 -04:00
Joey Hess
1f84c7a964 S3: When encryption is enabled, the Amazon S3 login credentials are stored, encrypted, in .git-annex/remotes.log, so environment variables need not be set after the remote is initialized. 2011-05-01 14:05:10 -04:00
Joey Hess
43f0a666f0 unused: Now also lists files fsck places in .git/annex/bad/ 2011-04-29 13:59:00 -04:00
Joey Hess
eef3f634e9 Avoid crashing when an existing key is readded to the annex. 2011-04-28 20:41:40 -04:00
Joey Hess
07576f2a2c documentation for hook special remotes
Releasing before I have quite finished the code. Got a little caught
up in Anathem references. Time for a walk and then a tiny bit more coding
and possibly testing.
2011-04-28 15:26:21 -04:00
Joey Hess
d7b330b33b Fix hasKeyCheap setting for bup and rsync special remotes. 2011-04-28 14:39:51 -04:00
Joey Hess
84e1ebfb0e erm, thought I committed this release? 2011-04-28 14:38:01 -04:00
Joey Hess
7a33803193 Avoid pipeline stall when running git annex drop or fsck on a lot of files.
When it's stalled, there are 3 processes:

git annex
  git ls-files
  git check-attr

git-annex stalls trying to write to git check-attr, which stalls trying to
write to stdout (read by git-annex).

git ls-files does not seem to be involved directly; I've seen the stall when
it was still streaming out the file list, and after it had exited and
zombified.

The read and write are supposed to be handled by two different threads,
which pipeBoth forks off, thus avoiding deadlock. But it does deadlock.
(Certain signals unblock the deadlock for a while, then it stalls again.)

So, this is another case of WTF is the ghc IO manager doing today?
I avoid the issue by converting the writer to a separate process.

Possibly this was caused by some change in ghc 7 -- I'm offline and cannot
verify now, but I'm sure I used to be able to run git annex drop w/o it
hanging! And the code does not seem to have changed, except for commit
c1dc407941, which I tried reverting without
success. In fact, I reverted all the way back to 0.20110316 and still
saw the stall.

Update: Minimal test case:

import System.Cmd.Utils

main = do
	as <- checkAttr "blah" $ map show [1..100000]
	sequence $ map (putStrLn . show) as

checkAttr attr files = do
	(_, s) <- pipeBoth "git" params $ unlines files
	return $ lines s
	where
		params = ["check-attr", attr, "--stdin"]

Bug filed on ghc in debian, #624389
2011-04-27 23:18:35 -04:00
Joey Hess
39966ba4ee filter out --delete rsync option
rsync does not have a --no-delete, so do it this way instead
2011-04-27 20:31:56 -04:00
Joey Hess
e68f128a9b rsync special remote
Fully tested and working, including resuming and encryption. (Though not
resuming when sending *with* encryption; gpg doesn't produce identical
output each time.)

Uses same layout as the directory special remote and the .git/annex/objects/
directory.
2011-04-27 20:23:09 -04:00
Joey Hess
27774bdd56 Revert "Use haskell Crypto library instead of haskell SHA library.a"
This reverts commit 892593c5ef.

Conflicts:

	Crypto.hs
	debian/control
2011-04-26 11:24:23 -04:00
Joey Hess
7d71f8770b releasing version 0.20110425 2011-04-25 16:02:57 -04:00
Joey Hess
76911a446a Avoid using absolute paths when staging location log, as that can confuse git when a remote's path contains a symlink. Closes: #621386
This was a real PITA to fix, since location logs can be staged in
both the current repo, as well as in local remote's repos, in
which case the cwd will not be in the repo. And git add needs different
params in both cases, when absolute paths are not used.

In passing, git annex fsck now stages location log fixes.
2011-04-25 14:54:24 -04:00
Joey Hess
8512a4a1a1 Remove testpack from build depends, as it is not available on all architectures.
The test suite will not be run if it cannot be compiled.

It may be possible later to split off the quickcheck using tests into
a separate program and keep most of the tests using just hunit.
2011-04-25 12:43:22 -04:00
Joey Hess
892593c5ef Use haskell Crypto library instead of haskell SHA library.a
Since hS3 needs Crypto anyway, this actually reduces dependencies.
2011-04-21 16:37:14 -04:00
Joey Hess
24feee25c9 releasing version 0.20110420 2011-04-21 15:11:51 -04:00
Joey Hess
2467c56771 update on S3 memory leaks
The remaining leaks are in hS3. The leak with encryption was worked around
by the use of the temp file. (And was probably originally caused by
gpgCipherHandle sparking a thread which kept a reference to the start
of the byte string.)
2011-04-21 11:06:29 -04:00
Joey Hess
6fcd3e1ef7 fix S3 upload buffering problem
Provide file size to new version of hS3.
2011-04-21 10:33:17 -04:00
Joey Hess
43639f69f6 ghc7
* Update Debian build dependencies for ghc 7.
* Debian package is now built with S3 support. Thanks Joachim Breitner for
  making this possible, also thanks Greg Heartsfield for working to improve
  the hS3 library for git-annex.

Also hid a conflicting new symbol from Control.Monad.State
2011-04-21 02:22:40 -04:00
Joey Hess
143fc7b692 finalize release 2011-04-19 21:40:21 -04:00
Joey Hess
5985acdfad bup: Avoid memory leak when transferring encrypted data.
This was a most surprising leak. It occurred in the process that is forked
off to feed data to gpg. That process was passed a lazy ByteString of
input, and ghc seemed to not GC the ByteString as it was lazily read
and consumed, so memory slowly leaked as the file was read and passed
through gpg to bup.

To fix it, I simply changed the feeder to take an IO action that returns
the lazy bytestring, and fed the result directly to hPut.

AFAICS, this should change nothing WRT buffering. But somehow it makes
ghc's GC do the right thing. Probably I triggered some weakness in ghc's
GC (version 6.12.1).

(Note that S3 still has this leak, and others too. Fixing it will involve
another dance with the type system.)

Update: One theory I have is that this has something to do with
the forking of the feeder process. Perhaps, when the ByteString
is produced before the fork, ghc decides it needs to hold a pointer
to the start of it, for some reason -- maybe it doesn't realize that
it is only used in the forked process.
2011-04-19 15:27:03 -04:00
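
Editor's sketch for 5985acdfad: the fix described above changes the feeder so that it receives an IO action producing the lazy ByteString and passes the result straight to hPut. The following is a minimal illustration of that shape only, not the git-annex code. The name feedFrom and the temporary file names are hypothetical, and the sketch does not try to reproduce the leak itself.

```haskell
import qualified Data.ByteString.Lazy as L
import System.IO (Handle, IOMode (WriteMode), withBinaryFile)

-- Hypothetical feeder: takes an IO action that produces the lazy
-- ByteString, instead of an already-constructed ByteString, and
-- hands the result directly to hPut.
feedFrom :: Handle -> IO L.ByteString -> IO ()
feedFrom h getInput = getInput >>= L.hPut h

main :: IO ()
main = do
  -- Create some input (100000 bytes of 'x'); file names are made up.
  L.writeFile "feeder-input.tmp" (L.replicate 100000 120)
  -- The lazy read happens inside the feeder, so no reference to the
  -- start of the ByteString is held by the caller.
  withBinaryFile "feeder-output.tmp" WriteMode $ \out ->
    feedFrom out (L.readFile "feeder-input.tmp")
  copied <- L.readFile "feeder-output.tmp"
  print (L.length copied)
```

Whether this shape avoids the leak on a given ghc version is, as the commit message says, only a theory; the sketch shows the calling convention and nothing more.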
Joey Hess
a441e08da1 Fix stalls in S3 when transferring encrypted data.
Stalls were caused by code that did approximately:

content' <- liftIO $ withEncryptedContent cipher content return
store content'

The return evaluated without actually reading content from S3,
and so the cleanup code began waiting on gpg to exit before
gpg could send all its data.

Fixing it involved moving the `store` type action into the IO monad:

liftIO $ withEncryptedContent cipher content store

Which was a bit of a pain to do, thank you type system, but
avoids the problem as now the whole content is consumed, and
stored, before cleanup.
2011-04-19 14:45:19 -04:00
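
Editor's sketch for a441e08da1: the stall above comes from returning lazy content out of a scope whose cleanup runs before the content has been consumed. A loosely analogous and well-known case is lazy file reading inside withFile. The names withContent and store below are hypothetical stand-ins for withEncryptedContent and the S3 store action, and the file name is made up; in the analogy, cleanup closes a handle rather than waiting on gpg.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical stand-in for withEncryptedContent: the lazy content is
-- only readable while the handle is open, and cleanup (closing the
-- handle) runs as soon as the action returns.
withContent :: FilePath -> (String -> IO a) -> IO a
withContent path action = withFile path ReadMode $ \h ->
  hGetContents h >>= action

-- Hypothetical stand-in for store: fully consumes the content and
-- reports how many characters it saw.
store :: String -> IO Int
store content = do
  let n = length content
  n `seq` return n

main :: IO ()
main = do
  writeFile "content-demo.tmp" "some annexed content"
  -- Risky shape (deliberately not run here):
  --   content' <- withContent "content-demo.tmp" return
  --   store content'
  -- The content escapes unread, and cleanup runs before store sees it.
  -- Safer shape: consume the content inside the scope.
  stored <- withContent "content-demo.tmp" store
  print stored
```

Passing the consuming action into the scope, as in the last two lines of main, mirrors the fix of moving store inside withEncryptedContent.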
Joey Hess
a91a51fc03 Add missing build dep on dataenc. 2011-04-17 14:41:24 -04:00
Joey Hess
7aa668f4b4 Don't run gpg in batch mode, so it can prompt for passphrase when there is no agent. 2011-04-17 14:30:22 -04:00
Joey Hess
36f048979f releasing version 0.20110417 2011-04-17 12:43:36 -04:00
Joey Hess
1247bfeaa7 gpg recommended 2011-04-16 19:13:05 -04:00
Joey Hess
44c65f40b7 bup is now supported as a special type of remote. 2011-04-08 16:44:43 -04:00
Joey Hess
e2404ca409 refactor away whichCmd and some other cleanup 2011-04-07 22:03:31 -04:00
Joey Hess
b889543507 let's use Maybe String for commands that may not be available 2011-04-07 21:47:56 -04:00
Joey Hess
bc51387e6d Periodically flush git command queue, to avoid bloating memory usage too much.
Since the queue is flushed in between subcommand actions being run,
there should be no issues with actions that expect to queue up some stuff
and have it run after they do other stuff. So I didn't have to audit for
such assumptions.
2011-04-07 13:59:31 -04:00
Joey Hess
ab0e03498f Add doc-base file. Closes: #621408 2011-04-06 21:57:22 -04:00
Joey Hess
c1bbe43422 Add build depend on perlmagick so docs are consistently built. Closes: #621410 2011-04-06 21:53:06 -04:00
Joey Hess
216ad1a4d3 Clear up short option confusion between --from and --force (-f is now --from, and there is no short option for --force). 2011-04-03 12:18:38 -04:00
Joey Hess
868300d4c1 unused/dropunused: support --from 2011-04-02 21:35:02 -04:00
Joey Hess
616e6f8a84 Use lowercase hash directories for locationlog files
to avoid some issues with git on OSX with the mixed-case directories. No
migration is needed; the old mixed case hash directories are still read;
new information is written to the new directories.
2011-04-02 13:49:03 -04:00
Joey Hess
1283ef73f8 releasing version 0.20110401 2011-04-01 21:31:37 -04:00
Joey Hess
ed7fc4fce9 Bugfix: copy --to --fast never really copied, fixed. 2011-04-01 12:34:06 -04:00
Joey Hess
a47ed922e1 add Remote.Directory 2011-03-30 13:24:36 -04:00
Joey Hess
9c96d86502 nasty hack to build when hS3 is not available
So, it would be nicer to just use Cabal and take advantage
of its conditional compilation support. But, Cabal seems to
lack good support for a package with an internal library that is used by
multiple executables. It wants to build everything twice or more.
That's too slow for me.

Anyway, fairly soon, I expect to upgrade hS3 to a requirement, and I
can just revert this.
2011-03-30 01:32:05 -04:00
Joey Hess
43bdebbc2d update 2011-03-29 18:24:26 -04:00
Joey Hess
996e5eee01 Merge branch 'master' into s3
Conflicts:
	debian/changelog
2011-03-28 16:34:58 -04:00
Joey Hess
0956f0dd15 fsck: Ensure that files and directories in .git/annex/objects have proper permissions. 2011-03-28 16:19:20 -04:00
Joey Hess
3162a724f1 S3 updates; gpg keys 2011-03-28 13:48:17 -04:00
Joey Hess
c5fc4f3d2a Merge branch 'master' into s3
Conflicts:
	debian/changelog
2011-03-28 13:20:58 -04:00
Joey Hess
1b6927995d releasing version 0.20110328 2011-03-28 11:12:32 -04:00
Joey Hess
016eea0280 Bugfix: Keys could be received into v1 annexes from v2 annexes, via v1 git-annex-shell. This results in some oddly named keys in the v1 annex. Recognise and fix those keys when upgrading, instead of crashing. 2011-03-28 09:27:28 -04:00
Joey Hess
1878745a46 more s3 docs 2011-03-28 02:13:26 -04:00
Joey Hess
a7bd63eb01 basic s3 remote start
But bucket name is not handled right; it needs to be globally unique.
2011-03-28 01:32:47 -04:00
Joey Hess
4868b64868 Provide a less expensive version of git annex copy --to, enabled via --fast. This assumes that location tracking information is correct, rather than contacting the remote for every file. 2011-03-27 18:34:30 -04:00
Joey Hess
f8693facab doc update 2011-03-27 17:30:44 -04:00
Joey Hess
8bcdf42b99 annex.diskreserve can be given in arbitrary units (e.g. "0.5 gigabytes") 2011-03-26 14:37:39 -04:00
Joey Hess
bc80ace96b releasing version 0.20110325 2011-03-25 00:51:12 -04:00
Joey Hess
03fdd0d56e dropunused: Significantly sped up; only read unused log file once. 2011-03-23 23:47:02 -04:00
Joey Hess
6246b807f7 migrate: Support migrating v1 SHA keys to v2 SHA keys with size information that can be used for free space checking. 2011-03-23 17:57:10 -04:00
Joey Hess
8beb72e206 migrate: Bugfix for case when migrating a file results in a key that is already present in .git/annex/objects.
For example, this could happen if using SHA1 and a file with content
"foo" were added to that backend, and then a file with content "foo" were
migrated from the WORM backend.

Assume that, if a backend assigned the same key, the already annexed
content must be the same. So, the "old" content can be reused.
2011-03-23 17:25:28 -04:00
Joey Hess
af45a62980 update 2011-03-23 13:13:51 -04:00
Joey Hess
7400c8318a correct 2011-03-23 12:46:51 -04:00
Joey Hess
7051763b5b tweak 2011-03-22 21:00:18 -04:00
Joey Hess
c1dc407941 Fix space leak in fsck and drop commands.
The space leak was somehow caused by this line:

	absfiles <- mapM absPath files

I confess, I don't quite understand why this caused bad buffering,
but apparently the whole pipeline from git-ls-files backed up at that
point.

Happily, rewriting the code to only get the cwd once and use a pure
function to calculate absfiles clears it up, and should be a little more
efficient in syscalls too.
2011-03-22 20:31:22 -04:00
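
Editor's sketch for c1dc407941: the rewrite described above fetches the cwd once and then computes absolute paths with a pure function, instead of running an IO action per file with mapM. The following is a minimal illustration under that reading. The helper name absPathFrom and the sample file names are hypothetical, and this is not the git-annex implementation.

```haskell
import System.Directory (getCurrentDirectory)
import System.FilePath (normalise, (</>))

-- Hypothetical pure helper: make a path absolute relative to a known
-- working directory. (</>) leaves already-absolute paths unchanged.
-- Note that normalise does not resolve ".." components.
absPathFrom :: FilePath -> FilePath -> FilePath
absPathFrom cwd file = normalise (cwd </> file)

main :: IO ()
main = do
  -- One syscall for the cwd, no matter how many files follow.
  cwd <- getCurrentDirectory
  let files = ["foo", "dir/bar", "/already/absolute"]
      -- Pure and lazy: each path is computed only when it is consumed.
      absfiles = map (absPathFrom cwd) files
  mapM_ putStrLn absfiles
```

Because map is lazy, absfiles can be consumed as a stream, whereas mapM in IO has to finish every action before the result list is available.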
Joey Hess
5d75919561 update 2011-03-22 18:55:29 -04:00
Joey Hess
368e20eb84 diskreserve setting
Add annex.diskreserve config setting, to control how much free space to
reserve for other purposes and avoid using (defaults to 1 mb).
2011-03-22 17:53:40 -04:00
Joey Hess
c21998722c fast mode
Add --fast flag, that can enable less expensive, but also less thorough versions of some commands.

* Add --fast flag, that can enable less expensive, but also less thorough
  versions of some commands.
* fsck: In fast mode, avoid checking checksums.
* unused: In fast mode, just show all existing temp files as unused,
  and avoid expensive scan for other unused content.
2011-03-22 17:41:06 -04:00
Joey Hess
aa2d8e33df free space checking
Free space checking is now done, for transfers of data for keys that have free space metadata.
(Notably, not for SHA* keys generated with git-annex 0.24 or earlier.)

The code is believed to work on Linux, FreeBSD, and OSX; check compile-time
messages to see if it is not enabled for your OS.
2011-03-22 17:27:04 -04:00
Joey Hess
09b16afe02 releasing version 0.20110320 2011-03-20 18:11:00 -04:00
Joey Hess
6a2a17658c No longer auto-upgrade to repository format 2, to avoid accidental upgrades, etc. Use git-annex upgrade when you're ready to run this version. 2011-03-19 18:33:39 -04:00
Joey Hess
828a84ba33 Add version command to show git-annex version as well as repository version information. 2011-03-19 14:33:24 -04:00
Joey Hess
0663f14cf7 Fix support for remotes with '.' in their names. 2011-03-18 16:29:42 -04:00
Joey Hess
7b5b127608 Fix dropping of files using the URL backend. 2011-03-17 11:49:21 -04:00
Joey Hess
3a020e599e Merge branch 'master' into reorg
Conflicts:
	debian/changelog
2011-03-16 18:47:04 -04:00
Joey Hess
1079ade208 releasing version 0.24 2011-03-16 18:41:02 -04:00
Joey Hess
63360f7767 update 2011-03-16 18:33:28 -04:00
Joey Hess
00eb8ae829 prepping experimental release 2011-03-16 16:25:20 -04:00
Joey Hess
d7ef5fd294 add explicit upgrade command 2011-03-16 15:48:26 -04:00
Joey Hess
0f8edc99ee Merge branch 'master' into reorg
Conflicts:
	debian/changelog
2011-03-16 13:48:04 -04:00
Joey Hess
35cbd107d5 detect systems w/o utimensat and ifdef out code that needs it 2011-03-16 13:46:08 -04:00
Joey Hess
5eb76d2b03 improve upgrade 2011-03-16 11:53:46 -04:00
Joey Hess
a080799900 upgrades seem to fully work 2011-03-16 11:00:18 -04:00
Joey Hess
500c4e44c5 v1 -> v2 upgrade partially working
still need to move location log files, and auto-commit
2011-03-16 02:35:48 -04:00
Joey Hess
f1e010f42e upgrade thoughts
long comments :)
2011-03-16 00:32:15 -04:00
Joey Hess
09a7689bc3 update and bug closures for v2 layout 2011-03-16 00:08:02 -04:00
Joey Hess
27472710c7 initial pass at doc update 2011-03-15 22:19:44 -04:00
Joey Hess
bc5c54c987 symlink touching fun
When adding files to the annex, the symlinks pointing at the annexed
content are made to have the same mtime as the original file. While git
does not preserve that information, this allows a tool like metastore to be
used with annexed files.
2011-03-14 23:00:23 -04:00
Joey Hess
175d055d4d Add Suggests on graphviz. Closes: #618039 2011-03-13 14:25:32 -04:00
Joey Hess
72d2684016 Rethink filename encoding handling for display. Since filename encoding may or may not match locale settings, any attempt to decode filenames will fail for some files. So instead, do all output in binary mode. 2011-03-12 15:30:17 -04:00
Joey Hess
26544de946 put in utf8 forcing workaround
Haskell's IO layer crashes on characters > 255 when in a non-unicode (latin1)
locale. Until Haskell gets better behavior, put in an admittedly ugly
workaround for that: git-annex forces utf8 output mode no matter what
locale is selected. So if you use a non-utf8 locale, your filenames with
characters > 127 will not be displayed as you'd expect. But at least it
won't crash.
2011-03-08 18:05:20 -04:00
Joey Hess
0de3005c64 whereis: New subcommand to show where a file's content has gotten to. 2011-03-05 17:23:55 -04:00
Joey Hess
6c1607ce66 Support ssh remotes with a port specified. 2011-03-05 15:47:00 -04:00
Joey Hess
e9fcd1eb5b releasing version 0.22 2011-03-04 15:23:04 -04:00
Joey Hess
c5c7eaf009 prep for release 2011-03-03 21:56:03 -04:00
Joey Hess
bc2df77642 Bugfix: When fsck detected and moved away corrupt file content, it did not update the location log. 2011-03-03 21:34:30 -04:00
Joey Hess
42259eee92 support git funky remote syntaxes
* Look for dir.git directories the same as git does.
* Support remote urls specified as relative paths.
* Support non-ssh remote paths that contain tilde expansions.
2011-03-03 21:02:29 -04:00
Joey Hess
1de12a2918 document describe command 2011-03-03 16:58:52 -04:00
Joey Hess
b5b78f26ec fix up commands that are trouble on bare repos
Most will just abort. init does a basic init and gives a command to
run elsewhere to finish it.
2011-03-03 16:40:55 -04:00
Joey Hess
a9d0538da5 updates for bare repo support 2011-03-03 15:59:16 -04:00
Joey Hess
6206b46e60 fsck: Check for and repair location log damage. 2011-03-02 14:30:36 -04:00
Joey Hess
1b9c4477fb New backends: SHA512 SHA384 SHA256 SHA224 2011-03-01 17:07:15 -04:00
Joey Hess
836e71297b Support filenames that start with a dash; when such a file is passed to a utility it will be escaped to avoid it being interpreted as an option. 2011-02-25 01:13:01 -04:00
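
Editor's sketch for 836e71297b: the commit message does not say how such filenames are escaped. One common approach, assumed here, is to prefix "./" so the utility sees a path rather than an option. The function name escapeDashFile is hypothetical.

```haskell
-- Hypothetical helper, assuming the escape is done by prefixing "./";
-- the commit message does not state the actual mechanism.
escapeDashFile :: FilePath -> FilePath
escapeDashFile f@('-' : _) = "./" ++ f
escapeDashFile f = f

main :: IO ()
main = mapM_ (putStrLn . escapeDashFile) ["-rf", "normal.txt", "dir/-x"]
```

Only a leading dash needs this treatment; a dash later in the path, as in "dir/-x", cannot be mistaken for an option.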
Joey Hess
3390183400 Make test suite not rely on a working cp -pr.
(The Unix wars are still ON!)
2011-02-13 14:19:14 -04:00