203 commits

Author SHA1 Message Date
Joey Hess
1f83dafc7e Bugfix: Fix fsck in SHA*E backends, when the key contains composite extensions, as added in 3.20120721. 2012-08-24 12:16:17 -04:00
Joey Hess
9fc94d780b better readProcess 2012-07-19 00:57:40 -04:00
Joey Hess
1db7d27a45 add back debug logging
Make Utility.Process wrap the parts of System.Process that I use,
and add debug logging to them.

Also wrote some higher-level code that allows running an action
with handles to a process's stdin or stdout (or both), and checking
its exit status, all in a single function call.

As a bonus, the debug logging now indicates whether the process
is being run to read from it, feed it data, chat with it (writing and
reading), or just call it for its side effect.
2012-07-19 00:46:52 -04:00
Joey Hess
d1da9cf221 switch from System.Cmd.Utils to System.Process
Test suite now passes with -threaded!

I traced back all the hangs with -threaded to System.Cmd.Utils. It seems
it's just crappy/unsafe/outdated, and should not be used. System.Process
seems to be the cool new thing, so converted all the code to use it
instead.

In the process, --debug stopped printing commands it runs. I may try to
bring that back later.

Note that even SafeSystem was switched to use System.Process. Since that
was a modified version of code from System.Cmd.Utils, it needed to be
converted too. I also got rid of nearly all calls to forkProcess,
and all calls to executeFile, which I'm also doubtful about working
well with -threaded.
2012-07-18 18:00:24 -04:00
Joey Hess
8ad844e45c fix leading period before two-element extensions 2012-07-06 17:22:56 -06:00
Joey Hess
5a753a7b8a SHAnE backends are now smarter about composite extensions, such as .tar.gz Closes: #680450 2012-07-05 16:24:02 -06:00
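
The composite-extension handling described in the two commits above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `key_extension` and the limits (at most two extension parts, each at most four alphanumeric characters) are assumptions made for the example.

```python
import os


def key_extension(filename, max_parts=2, max_len=4):
    """Return the extension to preserve in a key, e.g. ".tar.gz".

    Hypothetical rule: walk the dot-separated pieces from the right and keep
    up to max_parts of them, stopping at the first piece that is empty,
    non-alphanumeric, or longer than max_len characters.
    """
    name = os.path.basename(filename)
    pieces = name.split(".")[1:]  # drop the stem before the first period
    kept = []
    for piece in reversed(pieces):
        if len(kept) == max_parts or not piece.isalnum() or len(piece) > max_len:
            break
        kept.append(piece)
    # Each kept piece gets exactly one leading period.
    return "".join("." + p for p in reversed(kept))
```

With these assumed limits, `key_extension("backup.tar.gz")` gives `".tar.gz"`, while a name with no short extension gives an empty string.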
Joey Hess
40729e7fa2 Use SHA library for files less than 50 kb in size, at which point it's faster than forking the more optimised external program. 2012-07-04 13:04:01 -04:00
Joey Hess
1da79ea61f When shaNsum commands cannot be found, use the Haskell SHA library (already a dependency) to do the checksumming. This may be slower, but avoids portability problems.
Using Crypto's version of the hashes would be another option.
I need to benchmark it. The SHA2 library (which provides SHA1 also,
confusing name) may be the fastest option, but is not currently in Debian.
2012-07-04 09:11:36 -04:00
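
The strategy in the two commits above (use an in-process SHA library when the external command is missing, or when the file is small enough that forking costs more than it saves) might look something like this sketch. The function name, the exact threshold constant, and the parsing of the `sha256sum` output are assumptions for illustration only.

```python
import hashlib
import os
import shutil
import subprocess

SMALL_FILE_BYTES = 50 * 1024  # assumed reading of "50 kb" from the commit message


def sha256_file(path, external="sha256sum"):
    """Checksum a file, preferring an external tool only for larger files."""
    tool = shutil.which(external)
    if tool is None or os.path.getsize(path) < SMALL_FILE_BYTES:
        # In-process fallback: no fork, and no dependency on the tool existing.
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()
    # An absolute path never starts with a dash, so it cannot be read as an option.
    out = subprocess.run(
        [tool, os.path.abspath(path)],
        check=True, capture_output=True, text=True,
    ).stdout
    # The tool prints "<hash>  <name>"; a leading backslash marks an escaped name.
    return out.split()[0].lstrip("\\")
```

Both branches return the same hex digest, so callers do not need to know which one ran.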
Joey Hess
e0fdfb2e70 maintain set of files pendingAdd
Kqueue needs to remember which files failed to be added due to being open,
and retry them. This commit gets the data in place for such a retry thread.

Broke KeySource out into its own file, and added Eq and Ord instances
so it can be stored in a Set.
2012-06-20 16:31:46 -04:00
Joey Hess
d3cee987ca separate source of content from the filename associated with the key when generating a key
This already made migrate's code a lot simpler.
2012-06-05 19:51:03 -04:00
Joey Hess
2183fd2abd Require that the SHA256 backend can be used when building, since it's the default. 2012-05-31 23:15:40 -04:00
Joey Hess
8f9b501515 handle really long urls
Using the whole url as a key can make the filename too long. Truncate
and use a md5sum for uniqueness if necessary.
2012-02-16 02:05:06 -04:00
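
The truncate-and-hash idea from the commit above can be illustrated with a short sketch. The length limit, the separator, and the name `url_key_name` are hypothetical; the real key format is not shown in this log.

```python
import hashlib

MAX_KEY_NAME = 255  # assumed limit; a typical maximum filename length


def url_key_name(url, max_len=MAX_KEY_NAME):
    """Use the url as the key name, truncating long urls and appending an md5.

    The md5 of the full url keeps truncated names unique even when two urls
    share the same long prefix.
    """
    if len(url) <= max_len:
        return url
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()  # 32 hex characters
    keep = max_len - len(digest) - 1  # leave room for the separator
    return url[:keep] + "-" + digest
```

Short urls pass through unchanged, so only pathological cases pay the cost of a less readable key.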
Joey Hess
17fed709c8 addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key. 2012-02-10 19:23:46 -04:00
Joey Hess
90319afa41 fsck --from
Fscking a remote is now supported. It's done by retrieving
the contents of the specified files from the remote, and checking them,
so can be an expensive operation.

(Several optimisations are possible, to speed it up, of course.. This is
the slow and stupid remote fsck to start with.)

Still, if the remote is a special remote, or a git repository that you
cannot run fsck in locally, it's nice to have the ability to fsck it.

If you have any directory special remotes, now would be a good time to
fsck them, in case you were hit by the data loss bug fixed in the
previous release!
2012-01-19 15:24:05 -04:00
Joey Hess
d36525e974 convert fsckKey to a Maybe
This way it's clear when a backend does not implement its own fsck checks.
2012-01-19 13:51:30 -04:00
Joey Hess
4a02c2ea62 type alias cleanup 2011-12-31 04:11:58 -04:00
Joey Hess
95d2391f58 more partial function removal
Left a few Prelude.head's in where it was checked not null and too hard to
remove, etc.
2011-12-15 18:19:36 -04:00
Joey Hess
480495beb4 Prevent key names from containing newlines.
There are several places where it's assumed a key can be written on one
line. One is in the format of the .git/annex/unused files. The difficult
one is that filenames derived from keys are fed into git cat-file --batch,
which has a line based input. (And no -z option.)

So, for now it's best to block such keys being created.
2011-12-06 13:03:09 -04:00
Joey Hess
da9cd315be add support for using hashDirLower in addition to hashDirMixed
Supporting multiple directory hash types will allow converting to a
different one, without a flag day.

gitAnnexLocation now checks which of the possible locations have a file.
This means more statting of files. Several places currently use
gitAnnexLocation and immediately check if the returned file exists;
those need to be optimised.
2011-11-28 22:43:51 -04:00
Joey Hess
bf460a0a98 reorder repo parameters last
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.

In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.

This also provides more opportunities to use monadic and applicative
combinators.
2011-11-08 16:27:20 -04:00
Joey Hess
ef3457196a use SHA256 by default
To get old behavior, add a .gitattributes containing: * annex.backend=WORM

I feel that SHA256 is a better default for most people, as long as their
systems are fast enough that checksumming their files isn't a problem.
git-annex should default to preserving the integrity of data as well as git
does. Checksum backends also work better with editing files via
unlock/lock.

I considered just using SHA1, but since that hash is believed to be somewhat
near to being broken, and git-annex deals with large files which would be a
perfect exploit medium, I decided to go to a SHA-2 hash.

SHA512 is annoyingly long when displayed, and git-annex displays it in a
few places (and notably it is shown in ls -l), so I picked the shorter
hash. Considered SHA224 as it's even shorter, but feel it's a bit weird.

I expect git-annex will use SHA-3 at some point in the future, but
probably not soon!

Note that systems without a sha256sum (or sha256) program will fall back to
defaulting to SHA1.
2011-11-04 15:51:01 -04:00
Joey Hess
eec137f33a Record uuid when auto-initializing a remote so it shows in status. 2011-11-02 14:18:21 -04:00
Joey Hess
c643136e32 playing with >=>
Apparently in haskell if you teach a man to fish, he'll write
more pointfree code.
2011-10-31 23:39:55 -04:00
Joey Hess
b505ba83e8 minor syntax changes 2011-10-11 14:43:45 -04:00
Joey Hess
6a6ea06cee rename 2011-10-05 16:02:51 -04:00
Joey Hess
cfe21e85e7 rename 2011-10-04 00:59:08 -04:00
Joey Hess
8ef2095fa0 factor out common imports
no code changes
2011-10-03 23:29:48 -04:00
Joey Hess
9f6b7935dd go go gadget hlint 2011-09-20 23:24:48 -04:00
Joey Hess
203148363f split groups of related functions out of Utility 2011-08-22 16:14:12 -04:00
Joey Hess
737b5d14c9 moved files around 2011-08-20 16:11:42 -04:00
Joey Hess
dede05171b addurl: --fast can be used to avoid immediately downloading the url.
The tricky part about this is that to generate a key, the file must be
present already. Worked around by adding (back) a URL key type, which
is used for addurl --fast.
2011-08-06 14:57:22 -04:00
Joey Hess
3ffc0bb4f5 foo 2011-08-06 12:50:20 -04:00
Joey Hess
00153eed48 unify ellipsis handling
And add a simple dots-based progress display, currently only used in v2
upgrade.
2011-07-19 14:07:23 -04:00
Joey Hess
e784757376 hlint tweaks
Did all sources except Remotes/* and Command/*
2011-07-15 03:12:05 -04:00
Joey Hess
9f1577f746 remove unused backend machinery
The only remaining vestige of backends is different types of keys. These
are still called "backends", mostly to avoid needing to change user interface
and configuration. But everything to do with storing keys in different
backends is gone; instead different types of remotes are used.

In the refactoring, lots of code was moved out of odd corners like
Backend.File, to closer to where it's used, like Command.Drop and
Command.Fsck. Quite a lot of dead code was removed. Several data structures
became simpler, which may result in better runtime efficiency. There should
be no user-visible changes.
2011-07-05 19:57:46 -04:00
Joey Hess
fb58d1a560 wording 2011-07-01 17:17:51 -04:00
Joey Hess
2cdacfbae6 remove URL backend 2011-07-01 16:01:04 -04:00
Joey Hess
cdbcd6f495 add web special remote
Generalized LocationLog to PresenceLog, and use a presence log to record
urls for the web special remote.
2011-07-01 15:30:42 -04:00
Joey Hess
f6063a094e renamed GitRepo to Git
It was always imported qualified as Git anyway
2011-06-30 13:21:39 -04:00
Joey Hess
e3384eb476 tweak fsck wording so file is at the end of the line 2011-06-23 19:56:24 -04:00
Joey Hess
7ee636f6dd avoid unnecessary read of trust.log 2011-06-23 13:39:04 -04:00
Joey Hess
1870186632 fixed logFile 2011-06-22 16:17:16 -04:00
Joey Hess
d3f0106f2e move LocationLog into Annex monad from IO
It will need to run in Annex so it can use Branch
2011-06-22 14:27:50 -04:00
Joey Hess
9a272815dd Bugfix: Fix fsck to not think all SHAnE keys are bad. 2011-06-10 11:43:28 -04:00
Joey Hess
90dd245522 get --from is the same as copy --from
get not honoring --from has surprised me a few times, so least surprise
suggests it should just behave like copy --from. This leaves the difference
between get and copy being that copy always requires the remote to copy
from, while get will decide whether to get a file from a key/value store or
a remote.
2011-06-09 18:54:49 -04:00
Joey Hess
703c437bd9 rename modules for data types into Types/ directory 2011-06-01 21:56:04 -04:00
Joey Hess
971ab27e78 better types allowed breaking module dep loop 2011-06-01 19:11:27 -04:00
Joey Hess
a8fb97d2ce Add --trust, --untrust, and --semitrust options. 2011-06-01 17:57:31 -04:00
Joey Hess
3d567aa64f Add --numcopies option. 2011-06-01 16:49:17 -04:00
Joey Hess
2a8efc7af1 Added filename extension preserving variant backends SHA1E, SHA256E, etc. 2011-05-16 11:46:34 -04:00
Joey Hess
5d8e0d5a1c remove unused file 2011-04-29 12:20:59 -04:00
Joey Hess
b889543507 let's use Maybe String for commands that may not be available 2011-04-07 21:47:56 -04:00
Fraser Tweedale
f5b2d650bb recognise differently-named shaN programs 2011-04-08 10:08:11 +10:00
Joey Hess
48418cb92b reexport RemoteClass from Remote for cleanliness 2011-03-27 17:12:32 -04:00
Joey Hess
f30320aa75 add remotes slot to Annex
This required parameterizing the type for Remote, to avoid a cycle.
2011-03-27 16:17:56 -04:00
Joey Hess
b40f253d6e start of generalizing remotes
Goal is to support multiple different types of remotes, some of which
are not git repositories. To that end, added a Remote class, and moved
git remote specific code into Remote.GitRemote.

Remotes.hs is still present as some code has not been converted to use the
new Remote class yet.
2011-03-27 16:04:25 -04:00
Joey Hess
6246b807f7 migrate: Support migrating v1 SHA keys to v2 SHA keys with size information that can be used for free space checking. 2011-03-23 17:57:10 -04:00
Joey Hess
c43e3b5c78 check key size when available, no matter the backend
Now that SHA and other backends can have size info, fsck should check it
whenever available.
2011-03-23 02:10:59 -04:00
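
A size check that applies "whenever available", as the commit above describes, could be sketched like this. Representing a key's optional size as `None` and the function name `size_ok` are assumptions for the example.

```python
import os


def size_ok(path, expected_size=None):
    """Check a file against the size recorded in its key, if any.

    Keys without size information (expected_size is None) pass trivially,
    so the same check can run for every backend.
    """
    if expected_size is None:
        return True
    return os.path.getsize(path) == expected_size
```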
Joey Hess
c21998722c fast mode
Add --fast flag, that can enable less expensive, but also less thorough versions of some commands.

* Add --fast flag, that can enable less expensive, but also less thorough
  versions of some commands.
* fsck: In fast mode, avoid checking checksums.
* unused: In fast mode, just show all existing temp files as unused,
  and avoid expensive scan for other unused content.
2011-03-22 17:41:06 -04:00
Joey Hess
7b5b127608 Fix dropping of files using the URL backend. 2011-03-17 11:49:21 -04:00
Joey Hess
da504f647f fromkey, and url backend download work now 2011-03-15 22:28:18 -04:00
Joey Hess
4594bd51c1 rename file 2011-03-15 22:04:50 -04:00
Joey Hess
9d49fe2c17 first pass at using new keys
It compiles. It sorta works. Several subcommands are FIXME marked and
broken, because things that used to accept separate --backend and --key
params need to be changed to accept just a --key that encodes all the key
info, now that there is metadata in keys.
2011-03-15 21:34:13 -04:00
Joey Hess
72d2684016 Rethink filename encoding handling for display. Since filename encoding may or may not match locale settings, any attempt to decode filenames will fail for some files. So instead, do all output in binary mode. 2011-03-12 15:30:17 -04:00
Joey Hess
a3daac8a8b only enable SHA backends that configure finds support for 2011-03-02 13:47:45 -04:00
Joey Hess
1b9c4477fb New backends: SHA512 SHA384 SHA256 SHA224 2011-03-01 17:07:15 -04:00
Joey Hess
b7f4801801 generic SHA size support 2011-03-01 16:50:53 -04:00
Joey Hess
4cd96ad2db rename 2011-02-28 16:25:31 -04:00
Joey Hess
fcdc4797a9 use ShellParam type
So, I have a type checked safe handling of filenames starting with dashes,
throughout the code.
2011-02-28 16:18:55 -04:00
Joey Hess
836e71297b Support filenames that start with a dash; when such a file is passed to a utility it will be escaped to avoid it being interpreted as an option. 2011-02-25 01:13:01 -04:00
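
The idea behind the two commits above, tagging each command-line argument as either a plain parameter or a filename so that filenames starting with a dash are escaped in one place, can be sketched as follows. The class and function names are hypothetical, and prefixing `./` is one common way to do the escaping, assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    """An option or literal argument, passed through unchanged."""
    value: str


@dataclass(frozen=True)
class File:
    """A filename, which must never be mistaken for an option."""
    path: str


def to_command(params):
    """Flatten tagged parameters into an argv list, escaping dash-led files."""
    argv = []
    for p in params:
        if isinstance(p, File):
            # "./-foo" names the same file but cannot be parsed as an option.
            argv.append("./" + p.path if p.path.startswith("-") else p.path)
        else:
            argv.append(p.value)
    return argv
```

Because every filename has to be wrapped in `File` before it reaches `to_command`, forgetting the escape becomes a type-level mistake rather than a runtime surprise.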
Joey Hess
5a50a7cf13 update unicode FilePath handling
Based on http://hackage.haskell.org/trac/ghc/ticket/3307 ,
whether FilePath contains decoded unicode varies by OS.
So, add a configure check for it.

Also, renamed showFile to filePathToString
2011-02-11 15:37:37 -04:00
Joey Hess
fe55b4644e Fix display of unicode filenames.
Internally, the filenames are stored as un-decoded unicode.
I tried decoding them, but then haskell tries to access the wrong files.
Hmm.

So, I've unhappily chosen option "B", which is to decode filenames before
they are displayed.
2011-02-10 14:21:44 -04:00
Joey Hess
1b0edc1ab2 idiomatic elem 2011-01-30 12:13:34 -04:00
Joey Hess
167523f09d better directory handling
Rename Locations functions for better consistency, and make their values
more consistent too.

Used </> rather than manually building paths. There are still more places
that manually do so, but are tricky, due to the behavior of </> when
the second FilePath is absolute. So I only changed places where
it obviously was relative.
2011-01-27 17:00:32 -04:00
Joey Hess
5e54eb79b8 less verbose 2011-01-27 15:12:38 -04:00
Joey Hess
e1d213d6e3 make filename available to fsck messages 2011-01-26 20:37:46 -04:00
Joey Hess
3cb5cb6bf6 bring back display of keys
in fsck -q, that's the only way to know what file it means
2011-01-26 20:08:37 -04:00
Joey Hess
ee2e94f087 this should be a warning 2011-01-26 20:03:12 -04:00
Joey Hess
1a11085a50 drop: support untrusted repos 2011-01-26 19:35:35 -04:00
Joey Hess
6b48f740f1 rework note 2011-01-26 17:47:02 -04:00
Joey Hess
ba748a1198 fsck: handle untrusted repos 2011-01-26 17:44:40 -04:00
Joey Hess
b7903eb2d1 move partitioning out of keyPossibilities
And a bug fix in passing.
2011-01-26 16:44:14 -04:00
Joey Hess
616d1d4a20 rename TypeInternals to BackendTypes
Now that it only contains types used by the backends
2011-01-26 00:37:50 -04:00
Joey Hess
6a97b10fcb rework config storage
Moved away from a map of flags to storing config directly in the AnnexState
structure. Got rid of most accessor functions in Annex.

This allowed supporting multiple --exclude flags.
2011-01-26 00:17:38 -04:00
Joey Hess
082b022f9a successfully split Annex and AnnexState out of TypeInternals 2011-01-25 21:49:04 -04:00
Joey Hess
109a719b03 parameterize Backend type
This allows the Backend type to not depend on the Annex type, and
so the Annex type can later be moved out of TypeInternals.
2011-01-25 21:02:34 -04:00
Joey Hess
e7b557ef5d got rid of Core module
Most of it was to do with managing annexed Content, so put there
2011-01-16 16:05:05 -04:00
Joey Hess
d134da6dab tweak message 2011-01-05 20:28:50 -04:00
Joey Hess
700aed13cf git-annex-shell now exclusively used for all remote access 2010-12-31 19:09:17 -04:00
Joey Hess
e64ffc212e support trusted repositories that are not configured as remotes 2010-12-29 16:58:44 -04:00
Joey Hess
1448d8b42d improve list of remotes in error message 2010-12-29 16:41:27 -04:00
Joey Hess
885f7048d5 Fix bug in numcopies handling when a repository has multiple remotes that point to the same repository. 2010-12-29 16:31:25 -04:00
Joey Hess
d475aac375 refactor 2010-12-29 16:21:38 -04:00
Joey Hess
ee5d81429d tweak 2010-12-28 17:19:01 -04:00
Joey Hess
aa4f91b2d6 Add trust and untrust subcommands, to allow configuring remotes that are trusted to retain files without explicit checking. 2010-12-28 17:17:02 -04:00
Joey Hess
904af72b82 better message 2010-12-24 19:25:59 -04:00
Joey Hess
07648e2daa remove note that looked ugly with resume message 2010-12-02 17:52:23 -04:00
Joey Hess
fe4f1aae4b include key in message 2010-11-28 17:33:01 -04:00
Joey Hess
949e4abc56 fsck: Fix warning about not enough copies of a file, when locations are known, but are not available in currently configured remotes. 2010-11-28 17:26:15 -04:00
Joey Hess
653ad35a9f In .gitattributes, the git-annex-numcopies attribute can be used to control the number of copies to retain of different types of files. 2010-11-28 15:28:20 -04:00