Commit graph

3477 commits

Author SHA1 Message Date
Joey Hess
a86d937b5b avoid too long filename when making up a filename for addurl too 2012-02-16 02:09:09 -04:00
Joey Hess
8f9b501515 handle really long urls
Using the whole url as a key can make the filename too long. Truncate
and use a md5sum for uniqueness if necessary.
2012-02-16 02:05:06 -04:00
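A minimal sketch of the truncate-and-hash idea described in this commit, written in Python for illustration only. The function name `url_to_name`, the 255-character limit, and the character-escaping rule are assumptions, not git-annex's actual code.

```python
import hashlib

MAX_LEN = 255  # assumed limit; many filesystems cap a file name at 255 bytes


def url_to_name(url: str, max_len: int = MAX_LEN) -> str:
    """Derive a file name from a url, truncating long urls safely."""
    # Replace anything that is not a plain ASCII letter, digit, or ._-
    safe = "".join(
        c if (c.isascii() and c.isalnum()) or c in "._-" else "_" for c in url
    )
    if len(safe) <= max_len:
        return safe
    # Too long: keep a readable prefix and append an md5 of the full url,
    # so two long urls sharing a prefix still get distinct names.
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()  # 32 hex characters
    keep = max_len - len(digest) - 1
    return safe[:keep] + "-" + digest
```

Short urls pass through with only their unsafe characters replaced; only over-long ones pay the cost of a less readable name.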
Joey Hess
a1e52f0ce5 hlint 2012-02-16 00:44:51 -04:00
Joey Hess
e7aaa55c53 create parent directories as needed for addurl --file 2012-02-16 00:05:49 -04:00
Joey Hess
7d1c09fe4a update 2012-02-15 19:46:29 -04:00
Joey Hess
4d8afc1713 tweak wording 2012-02-15 19:43:15 -04:00
Joey Hess
63152428e9 changelog 2012-02-15 17:33:21 -04:00
Joey Hess
756c236ec7 Merge branch 'master' of ssh://git-annex.branchable.com 2012-02-15 14:36:47 -04:00
Joey Hess
505d6b1a06 fix failure count memory leak
This is the last memory leak that prevents git-annex from running
in constant space, as far as I can see. I can now run git annex find
dummied up to repeatedly find the same file over and over, on millions
of files, and memory stays entirely constant.
2012-02-15 14:35:49 -04:00
Joey Hess
4645f83678 add tips 2012-02-15 14:34:40 -04:00
Joey Hess
f0f07db01d reorder params and put -- after attributes, for compatibility with old git
(cherry picked from commit c8ec0e233e)
2012-02-15 14:01:06 -04:00
http://joey.kitenet.net/
623a42b0e9 Added a comment 2012-02-15 15:22:56 +00:00
Joey Hess
88b3ee8968 Merge branch 'master' of ssh://git-annex.branchable.com 2012-02-15 11:16:28 -04:00
Joey Hess
52c5b164d8 Added an annex.queuesize setting
useful when adding hundreds of thousands of files on a system with plenty
of memory.

git add gets quite slow in such a large repository, so if the system has
more than the ~32 mb of memory the queue can use by default, it's a useful
optimisation to increase the queue size, in order to decrease the number
of times git add is run.
2012-02-15 11:14:19 -04:00
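The trade-off described in this commit can be sketched as a batching queue: a larger queue holds more pending paths in memory but runs the expensive command fewer times. This is an illustrative Python sketch; the class `AddQueue`, its `run` callback, and the sizes used are hypothetical and do not reflect git-annex's real defaults or implementation.

```python
class AddQueue:
    """Collect paths and hand them to `run` in batches of at most `max_size`."""

    def __init__(self, run, max_size):
        self.run = run            # callback standing in for one `git add` run
        self.max_size = max_size  # hypothetical analogue of annex.queuesize
        self.pending = []

    def add(self, path):
        self.pending.append(path)
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self):
        # Run the command once for everything queued so far.
        if self.pending:
            self.run(self.pending)
            self.pending = []


def count_runs(n_files, max_size):
    """Return how many times the command runs for n_files queued paths."""
    runs = []
    queue = AddQueue(runs.append, max_size)
    for i in range(n_files):
        queue.add(f"file{i}")
    queue.flush()  # flush whatever is left at the end
    return len(runs)
```

With these assumptions, `count_runs(10, 3)` needs four runs while `count_runs(10, 10)` needs one, which is the effect the setting is meant to allow.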
antymat
d380c18c1e Added a comment 2012-02-15 07:13:12 +00:00
http://joey.kitenet.net/
e04e05ef1b Added a comment 2012-02-14 22:57:29 +00:00
Joey Hess
c26db26259 add scalability page 2012-02-14 18:50:25 -04:00
antymat
586e937ad0 Added a comment 2012-02-14 22:48:38 +00:00
Joey Hess
7371209d13 layout 2012-02-14 17:27:13 -04:00
Joey Hess
9da8bb2846 typo 2012-02-14 17:22:56 -04:00
Joey Hess
29dede039c add video tag with RichiH's talk 2012-02-14 17:19:48 -04:00
Joey Hess
e76988f6c2 add 2012-02-14 16:28:16 -04:00
Joey Hess
03c559f8d6 tweak 2012-02-14 14:51:26 -04:00
Joey Hess
7ebd98d8d8 fix memory leak when staging the journal
The list of files had to be retained until the end so it could be deleted.
Also, a list of update-index lines was generated and only then fed into it.
Now everything streams in constant space.
2012-02-14 14:37:59 -04:00
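The streaming approach in this commit can be illustrated with a generator: each line is produced on demand, and each file is cleaned up as soon as the consumer has moved past it, so neither the full file list nor the full list of lines is held in memory. This is a simplified Python sketch; `journal_lines`, the `cleanup` callback, and the line format are placeholders, not the real update-index input.

```python
def journal_lines(paths, cleanup):
    """Yield one index line per journal file, cleaning up as we go.

    `paths` may be any iterator, so the full list is never materialised.
    """
    for path in paths:
        yield f"entry for {path}\n"  # placeholder line format
        # Reached only when the consumer asks for the next line,
        # i.e. after the previous line has been consumed.
        cleanup(path)


def stage(paths, feed, cleanup):
    """Feed every line to `feed` one at a time, in constant space."""
    for line in journal_lines(paths, cleanup):
        feed(line)
```

In a real setting `feed` would write to the stdin of a long-running process rather than collect lines.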
Joey Hess
cdd6cdbb67 Merge branch 'master' of ssh://git-annex.branchable.com 2012-02-14 13:03:51 -04:00
http://joey.kitenet.net/
fa7ffd1cc3 Added a comment 2012-02-14 16:58:33 +00:00
Joey Hess
90a8b38ac0 set oneshot mode on a per-command basis
Avoids ugly (and test suite failing) hack in Command.Version
2012-02-14 12:40:40 -04:00
antymat
33e03d58ae spelling 2012-02-14 16:39:17 +00:00
antymat
0e3f7b64b6 2012-02-14 16:34:27 +00:00
Joey Hess
a40ec5e03e Fixed a memory leak due to excessive strictness when committing journal files.
When hashing the files, the entire list of shas was read strictly.
That was entirely unnecessary, since there's a cleanup action run
after they're consumed.
2012-02-14 11:20:34 -04:00
Joey Hess
82ae30d820 don't close yet 2012-02-14 11:02:31 -04:00
Joey Hess
2f1f1e6b13 avoid version saving state
This is not the place to commit journal files.
2012-02-14 10:59:48 -04:00
Joey Hess
8f76d66f32 set fileEncoding on CheckAttr handles
Seemed to work without it, but this is correct.
2012-02-14 04:31:39 -04:00
Joey Hess
cb631ce518 whereis: Prints the urls of files that the web special remote knows about. 2012-02-14 03:49:48 -04:00
Joey Hess
8fbc529d68 oops 2012-02-14 03:10:01 -04:00
Joey Hess
afd33b0236 simplify 2012-02-14 01:11:02 -04:00
Joey Hess
2b28c70f5f add, and immediately close bug. useful documentation though 2012-02-14 01:01:38 -04:00
Joey Hess
a2f241d503 fix LsFiles.typeChanged paths
Passing absolute paths to Command.Add used to work, but after recent
changes doesn't. All LsFiles should use relative paths anyway, so fix it
there.
2012-02-14 00:22:42 -04:00
Joey Hess
cbaebf538a rework git check-attr interface
Now gitattributes are looked up, efficiently, in only the places that
really need them, using the same approach used for cat-file.

The old CheckAttr code seemed very fragile, in the way it streamed files
through git check-attr.
I actually found that cad8824852
was still deadlocking with ghc 7.4, at the end of adding a lot of files.
This should fix that problem, and avoid future ones.

The best part is that this removes withAttrFilesInGit and withNumCopies,
which were complicated Seek methods, as well as simplifying the types
for several other Seek methods that had a Backend tupled in.
2012-02-13 23:52:21 -04:00
Joey Hess
d35a8d85b5 another place hGetBoth was used without a writer thread 2012-02-13 20:23:45 -04:00
Joey Hess
cad8824852 thinko
I removed the now unnecessary forkProcess, but forgot to change back to
pipeBoth, so there was no writer thread.
2012-02-13 20:01:37 -04:00
Joey Hess
0ef6d86873 force state strictly
When converting to the strict state monad, I missed this place where
thunks to the state could be built up, possibly. This seems to make
it run in some percentage less memory.
2012-02-13 16:59:00 -04:00
Joey Hess
59b2adea4f changelog for a964012fc3
Turns out that commit really made some serious improvements to memory use.
With the lazy state monad, git-annex add in a huge tree grew seemingly
without bound until it overflowed the stack. With the strict monad,
it uses 42 mb max.

It's possible another change since the 3.20120123 release fixed that,
but a964012fc3 seems most likely.
2012-02-13 16:58:58 -04:00
Joey Hess
3ac2677e00 comment typo 2012-02-13 16:58:26 -04:00
Joey Hess
ecfcb41abe work around Network.Browser bug that converts a HEAD to a GET when following a redirect
The code explicitly switches from HEAD to GET for most redirects.
Possibly because someone misread a spec (which does require switching from
POST to GET for 303 redirects). Or possibly because the spec really is that
bad. Upstream bug: https://github.com/haskell/HTTP/issues/24

Since we absolutely don't want to download entire (large) files from
the web when checking that they exist with HEAD, I wrote my own redirect
follower, based closely on the one used by Network.Browser, but without
this misfeature.

Note that Network.Browser checks that the redirect url is a http url
and fails if not. I don't, because I want to not need to change this
code when it gets https support (related: I'm surprised to see it
doesn't support https yet..). The check does not seem security significant;
it doesn't support file:// urls for example. If a http url is redirected
to https, the Network.Browser will actually make a http connection again.
This could loop, but only up to 5 times.
2012-02-10 21:54:25 -04:00
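A redirect follower that never switches away from HEAD can be sketched as below. This is a Python illustration of the idea only, not a translation of the Haskell code: the `fetch(method, url)` callback, its `(status, headers)` return shape, and the set of redirect codes are assumptions. The limit of 5 redirects comes from the commit message.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307}  # assumed set of codes to follow


def head_following_redirects(fetch, url, max_redirects=5):
    """Issue HEAD requests, following redirects without switching to GET.

    `fetch(method, url)` is a caller-supplied function (hypothetical here)
    returning a (status, headers) pair, with headers as a plain dict.
    """
    for _ in range(max_redirects + 1):
        status, headers = fetch("HEAD", url)  # always HEAD, never GET
        location = headers.get("Location")
        if status in REDIRECT_CODES and location:
            # Resolve relative redirects against the current url.
            # No scheme check is made, so an https target is passed along.
            url = urljoin(url, location)
            continue
        return url, status, headers
    raise RuntimeError("too many redirects")
```

Taking the transport as a parameter keeps the loop independent of any particular HTTP library.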
Joey Hess
6335abcab2 doc update 2012-02-10 20:40:18 -04:00
Joey Hess
a3ebf16e62 also verify new urls when adding them to existing files 2012-02-10 19:40:54 -04:00
Joey Hess
17fed709c8 addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key. 2012-02-10 19:23:46 -04:00
Joey Hess
9030f68452 When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key.
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.
2012-02-10 19:23:41 -04:00
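The size check described in this commit can be sketched as follows. This is an illustrative Python version under stated assumptions: the function names are hypothetical, headers are taken to be a plain dict keyed by `Content-Length`, and a key without a known size is represented as `None`.

```python
def parse_content_length(headers):
    """Return the Content-Length as an int, or None if absent or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        length = int(value.strip())
    except ValueError:
        return None
    return length if length >= 0 else None


def url_matches_key(key_size, headers):
    """Check a url's reported size against the size recorded in the key.

    If either size is unknown the check is skipped and the url is accepted,
    matching the behaviour described in the commit message.
    """
    remote_size = parse_content_length(headers)
    if key_size is None or remote_size is None:
        return True
    return remote_size == key_size
```

A mismatch suggests the web content has changed since the key was recorded.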
Joey Hess
fa77d9486d Merge branch 'master' of ssh://git-annex.branchable.com 2012-02-09 21:53:51 -04:00