Commit graph

532 commits

Author SHA1 Message Date
Joey Hess
5bf07b3b5c Store web special remote url info in a more efficient location.
storing it in remotes/web/xx/yy/foo.log meant lots of extra directory
objects in git. Now I use xx/yy/foo.log.web, which is just as unique, but
more efficient since foo.log is there anyway.

Of course, it still looks in the old location too.
2012-02-17 23:15:29 -04:00
Joey Hess
db6b4cdfcf rekey: New plumbing level command, can be used to change the keys used for files en masse. 2012-02-16 16:36:35 -04:00
Joey Hess
aeaaa0ff87 reorder 2012-02-16 15:07:59 -04:00
Joey Hess
39c3f56b33 addurl: Add --pathdepth option. 2012-02-16 12:25:19 -04:00
Joey Hess
4d8afc1713 tweak wording 2012-02-15 19:43:15 -04:00
Joey Hess
63152428e9 changelog 2012-02-15 17:33:21 -04:00
Joey Hess
52c5b164d8 Added a annex.queuesize setting
useful when adding hundreds of thousands of files on a system with plenty
of memory.

git add gets quite slow in such a large repository, so if the system has
more than the ~32 mb of memory the queue can use by default, it's a useful
optimisation to increase the queue size, in order to decrease the number
of times git add is run.
2012-02-15 11:14:19 -04:00
Joey Hess
7ebd98d8d8 fix memory leak when staging the journal
The list of files had to be retained until the end so it could be deleted.
Also, a list of update-index lines was generated and only then fed into it.
Now everything streams in constant space.
2012-02-14 14:37:59 -04:00
Joey Hess
a40ec5e03e Fixed a memory leak due to excessive strictness when committing journal files.
When hashing the files, the entire list of shas was read strictly.
That was entirely unnecessary, since there's a cleanup action run
after they're consumed.
2012-02-14 11:20:34 -04:00
Joey Hess
cb631ce518 whereis: Prints the urls of files that the web special remote knows about. 2012-02-14 03:49:48 -04:00
Joey Hess
59b2adea4f changelog for a964012fc3
Turns out that commit really made some serious improvements to memory use.
With the lazy state monad, git-annex add in a huge tree grew seemingly
without bound until it overflowed the stack. With the strict monad,
it uses 42 mb max.

It's possible another change since the 3.20120123 release fixed that,
but a964012fc3 seems most likely.
2012-02-13 16:58:58 -04:00
Joey Hess
17fed709c8 addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key. 2012-02-10 19:23:46 -04:00
Joey Hess
9030f68452 When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key.
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.
2012-02-10 19:23:41 -04:00
Joey Hess
d55f3c0716 Fix teardown of stale cached ssh connections. 2012-02-09 21:49:46 -04:00
Joey Hess
1c0bd81ba6 addurl: Normalize badly encoded urls. 2012-02-09 14:19:58 -04:00
Joey Hess
ef013506cb addurl: Added a --file option
Can be used to specify what file the url is added to. This can be used to
override the default filename that is used when adding an url, which is
based on the url. Or, when the file already exists, the url is recorded as
another location of the file.
2012-02-08 15:35:29 -04:00
Joey Hess
57a747d081 S3: Fix irrefutable pattern failure when accessing encrypted S3 credentials. 2012-02-08 11:41:15 -04:00
Joey Hess
995bf51e10 correction 2012-02-07 16:52:39 -04:00
Joey Hess
3f4f96228e changelog 2012-02-06 20:42:49 -04:00
Joey Hess
91fc975964 note 7.4 needed 2012-02-04 14:51:52 -04:00
Joey Hess
ed64bd8a4b remove; unused 2012-01-30 13:20:36 -04:00
Joey Hess
b81d662cbf Avoid repeated location log commits when a remote is receiving files.
Done by adding a oneshot mode, in which location log changes are written to
the journal, but not committed. Taking advantage of git-annex's existing
ability to recover in this situation.

This is used by git-annex-shell and other places where changes are made to
a remote's location log.
2012-01-28 15:41:52 -04:00
Joey Hess
ce5637498f remove Utility.Conditional and use IfElse
This drops the >>! and >>? with the nice low fixity. IfElse does have
undocumented >>=>>! and >>=>>? operators, but I deem that too fishy.
Anyway, using whenM and unlessM is easier; I sometimes mixed the operators
up.
2012-01-24 16:22:07 -04:00
Joey Hess
20d0288802 releasing version 3.20120123 2012-01-23 15:09:50 -04:00
Joey Hess
47250a153a ssh connection caching
Ssh connection caching is now enabled automatically by git-annex. Only one
ssh connection is made to each host per git-annex run, which can speed some
things up a lot, as well as avoiding repeated password prompts. Concurrent
git-annex processes also share ssh connections. Cached ssh connections are
shut down when git-annex exits.

Note: The rsync special remote does not yet participate in the ssh
connection caching.
2012-01-20 17:14:56 -04:00
Joey Hess
61dbad505d fsck --from remote --fast
Avoids expensive file transfers, at the expense of checking file size
and/or contents.

Required some reworking of the remote code.
2012-01-20 13:23:11 -04:00
Joey Hess
711c154561 update NEWS
Add news item recommending fscking directory special remotes.

Remote news item about URL backend being removed; it was later added back
to be used by git annex addurl --fast.

Link NEWS into top level.
2012-01-19 15:27:39 -04:00
Joey Hess
90319afa41 fsck --from
Fscking a remote is now supported. It's done by retrieving
the contents of the specified files from the remote, and checking them,
so can be an expensive operation.

(Several optimisations are possible, to speed it up, of course.. This is
the slow and stupid remote fsck to start with.)

Still, if the remote is a special remote, or a git repository that you
cannot run fsck in locally, it's nice to have the ability to fsck it.

If you have any directory special remotes, now would be a good time to
fsck them, in case you were hit by the data loss bug fixed in the
previous release!
2012-01-19 15:24:05 -04:00
Joey Hess
2837e8fef1 releasing version 3.20120116 2012-01-16 16:52:26 -04:00
Joey Hess
f161b5eb59 Fix data loss bug in directory special remote
When moving a file to the remote failed, and partially transferred content
was left behind in the directory, re-running the same move would think it
succeeded and delete the local copy.

I reproduced data loss when moving files to a partition that was almost
full. Interrupting a transfer could have similar results.

Easily fixed by using a temp file which is then moved atomically into place
once the transfer completes.

I've audited other calls to copyFileExternal, and other special remote
file transfer code; everything else seems to use temp files correctly
(rsync, git), or otherwise use atomic transfers (bup, S3).
2012-01-16 16:28:15 -04:00
Joey Hess
e3ea5fe938 debhelper v9
kills that ugly python message during build
2012-01-15 14:53:38 -04:00
Joey Hess
ce608303a3 releasing version 3.20120115 2012-01-15 14:02:32 -04:00
Joey Hess
37b5b1bf0d Fix QuickCheck dependency in cabal file. 2012-01-15 13:53:51 -04:00
Joey Hess
81856c3175 add a configure check for StatFS
This way, the build log will indicate whether StatFS can be relied on.
I've tested all the failing architectures now, and on all of them,
the StatFS code now returns Nothing, rather than Just nonsense.

Also, if annex.diskreserve is set on a platform where StatFS is not
working, git-annex will complain.

Also, the Makefile was missing the sources target used when building with
cabal.
2012-01-15 13:49:32 -04:00
Joey Hess
0eed604446 Add a sanity check for bad StatFS results.
git-annex FTBFS on s390, mips, powerpc, sparc. That StatFS code is failing
on all of them. At least on s390, the failure appears as:

Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 0,
fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0,
fsStatBytesUsed = 0})

While I don't understand why this is happening, or how to fix it,
bandaid over it by checking for obviously bad values and returning Nothing.
That disables disk free space checking, but at least git-annex will work.

Upstream bug: http://code.google.com/p/xmobar/issues/detail?id=70
2012-01-14 17:17:20 -04:00
Joey Hess
b88ecbdc1b Add libghc-testpack-dev to build depends on all arches. 2012-01-13 15:50:56 -04:00
Joey Hess
1ae780ee79 git-annex, git-union-merge: Support GIT_DIR and GIT_WORK_TREE.
Note that GIT_WORK_TREE cannot influence GIT_DIR; that is necessary for
git-fake-bare and vcsh type things to work.
2012-01-13 12:52:09 -04:00
Joey Hess
0d5c402210 Add annex-trustlevel configuration settings, which can be used to override the trust level of a remote.
This overrides the trust.log, and is overridden by the command-line trust
parameters.

It would have been nicer to have Logs.Trust.trustMap just look up the
configuration for all remotes, but a dependency loop prevented that
(Remotes depends on Logs.Trust in several ways). So instead, look up
the configuration when building remotes, storing it in the same forcetrust
field used for the command-line trust parameters.
2012-01-09 23:31:44 -04:00
Joey Hess
7675b83efa map: Fix display of remote repos
A change to break local cycles made remote repos be dropped entirely.
2012-01-08 16:05:57 -04:00
Joey Hess
a35278430a log: Add --gource mode, which generates output usable by gource.
As part of this, I fixed up how log was getting the descriptions of
remotes.
2012-01-07 18:18:09 -04:00
Joey Hess
3da28cad07 releasing version 3.20120106 2012-01-07 13:50:35 -04:00
Joey Hess
60c1aeeb6f Fix overbroad gpg --no-tty fix from last release.
Only set --no-tty when GPG_AGENT_INFO is set and batch mode is used.

In the test suite, set GPG_AGENT_INFO to /dev/null to avoid the test suite
relying on /dev/tty.
2012-01-07 12:38:08 -04:00
Joey Hess
b59759e33c typo 2012-01-06 17:52:16 -04:00
Joey Hess
a3a9f87047 log: New command that displays the location log for file, showing each repository they were added to and removed from.
This needs to run git log on the location log files to get at all past
versions of the file, which tends to be a bit slow.

It would be possible to make a version optimised for showing the location
logs for every key. That would only need to run git log once, so would be
faster, but it would need to process an enormous amount of data, so
would not speed up the individual file case.

In the future it would be nice to support log --format. log --json also
doesn't work right yet.
2012-01-06 15:40:07 -04:00
Joey Hess
f534fcc7b1 remove S3stub stuff
Let's keep that in a no-s3 branch, which can be merged into eg,
debian-stable.
2012-01-05 23:14:10 -04:00
Joey Hess
c371c40a88 Don't list S3 as a remote type when built without S3 support. 2012-01-05 23:11:07 -04:00
Joey Hess
0b27e6baa0 Support unescaped repository urls, like git does.
Turns out that git will accept a .git/config containing an url with eg,
spaces in its name. Handle this by escaping the url if it's not valid.

This also fixes support for urls containing escaped characters like %20
for space. Before, the path from the url was not unescaped properly.
2012-01-05 14:32:20 -04:00
Joey Hess
338d472ca2 releasing version 3.20120105 2012-01-05 13:51:13 -04:00
Joey Hess
769edd6b08 Run gpg with --no-tty. Closes: #654721 2012-01-05 13:44:09 -04:00
Joey Hess
a1aea174d7 fsck: Do backend-specific check before checking numcopies is satisfied.
This way, when a checksum check fails and the content is moved aside,
the numcopies check also warns if there are not enough copies.
2012-01-03 18:40:47 -04:00