Commit graph

205 commits

Author SHA1 Message Date
Joey Hess
09cd042775 Properly handle multiline git config values.
A crash on parsing was fixed a while ago. This adds support for fully
correctly parsing multiline git config values, using git config --null.

Since git-annex-shell configlist uses normal git config output, I left in
support for that too; the two forms of config output can be easily
identified by the parser. Since configlist only prints the annex.uuid
config, there's no risk of multiline values there, so no need to change it.
2011-12-15 12:48:27 -04:00
Joey Hess
ef28b3fef7 split out Git/Command.hs 2011-12-14 15:56:11 -04:00
Joey Hess
02f1bd2bf4 split more stuff out of Git.hs 2011-12-14 15:43:13 -04:00
Joey Hess
13fff71f20 split out three modules from Git
Constructors and configuration make sense in separate modules.
A separate Git.Types is needed to avoid cycles.
2011-12-13 15:06:49 -04:00
Joey Hess
98dfc0c9b0 split out Annex/BranchState.hs 2011-12-12 17:38:46 -04:00
Joey Hess
c7e65bbb12 optimiation
avoids reading the config of a local remote twice in a row
2011-12-12 02:24:37 -04:00
Joey Hess
f44f715f51 ensure local remote is initialized when copying to it
Needed due to this scenario: Bare repo origin is made, foo is cloned from it;
foo is initalized; a file is added to foo's annex; git annex move --to origin

Since the git-annex branch has not yet been pushed to origin, it doesn't
auto-initialize. When the content is sent to it, it's stored, but
the remote has NoUUID, and so nothing is logged in the location log.
Then the content is removed from the local repo, and git-annex has lost
track of it.

git annex fsck in origin will find the lost content, but let's not let this
happen. Content should only be sent to initalized remotes.

This cannot happen for non-local remotes, since git-annex-shell always
checks that the repo is initialized.
2011-12-10 19:54:20 -04:00
Joey Hess
9ba99a544b update 2011-12-10 18:51:01 -04:00
Joey Hess
d64132a43a hslint 2011-12-09 01:57:13 -04:00
Joey Hess
e3f1568e0f Fix caching of decrypted ciphers, which failed when drop had to check multiple different encrypted special remotes. 2011-12-08 16:01:46 -04:00
Joey Hess
64672c6262 refactor 2011-12-03 09:10:23 -04:00
Joey Hess
e19dc85547 factor out untilTrue 2011-12-02 16:12:31 -04:00
Joey Hess
fb68a7881f convert rsync special backend to using both hash directory types 2011-12-02 15:50:27 -04:00
Joey Hess
db5b479f3f use lowercase hash by default; non-bare repos are a special case
Directory special remotes will now always store keys in the lowercase name,
which avoids the complication of catching failures to create the mixed case
name.

Git remotes using http will now try the lowercase name first.
2011-12-02 14:56:48 -04:00
Joey Hess
0815cc2fc1 refactor 2011-12-02 14:47:59 -04:00
Joey Hess
bff6ca2634 refactor 2011-11-28 23:20:31 -04:00
Joey Hess
da9cd315be add support for using hashDirLower in addition to hashDirMixed
Supporting multiple directory hash types will allow converting to a
different one, without a flag day.

gitAnnexLocation now checks which of the possible locations have a file.
This means more statting of files. Several places currently use
gitAnnexLocation and immediately check if the returned file exists;
those need to be optimised.
2011-11-28 22:43:51 -04:00
Joey Hess
75a590bdd8 Put a workaround in the directory special remote for strange behavior with VFAT filesystems on Linux (mounted with shortname=mixed) 2011-11-22 18:21:28 -04:00
Joey Hess
1326bb8635 Avoid excessive escaping for rsync special remotes that are not accessed over ssh.
This is actually tricky, 45bbf210a1 added
the escaping because it's needed for rsync that does go over ssh.
So I had to detect whether the remote's rsync url will use ssh or not,
and vary the escaping.
2011-11-18 12:53:48 -04:00
Joey Hess
637b5feb45 lint 2011-11-11 01:52:58 -04:00
Joey Hess
49d2177d51 factored out some useful error catching methods 2011-11-10 20:57:28 -04:00
Joey Hess
d3e1a3619f safer inannex checking
git-annex-shell inannex now returns always 0, 1, or 100 (the last when
it's unclear if content is currently in the index due to it currently being
moved or dropped).

(Actual locking code still not yet written.)
2011-11-09 18:33:15 -04:00
Joey Hess
56b8194470 cleanup 2011-11-09 01:33:20 -04:00
Joey Hess
bf460a0a98 reorder repo parameters last
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.

In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.

This also provides more opportunities to use monadic and applicative
combinators.
2011-11-08 16:27:20 -04:00
Joey Hess
b11a63a860 clean up read/show abuse
Avoid ever using read to parse a non-haskell formatted input string.

show :: Key is arguably still show abuse, but displaying Keys as filenames
is just too useful to give up.
2011-11-08 00:17:54 -04:00
Joey Hess
63a292324d add a UUID type
Should have done this a long time ago.
2011-11-07 15:59:16 -04:00
Joey Hess
aae0417d94 Don't try to read config from repos with annex-ignore set. 2011-11-07 11:50:30 -04:00
Joey Hess
c879eb873e do commit location changes to remote in copy --to
test suite pointed out that if a file was copied from B to A, and then
A cloned, the clone ought to immediatly know it can get the file from A.
2011-10-27 18:03:36 -04:00
Joey Hess
f84d66fa15 reap in onLocal
Each onLocal call involves a new Annex state, so needs to clean up after it.
2011-10-27 14:55:07 -04:00
Joey Hess
c30366e95a improve config reading when operating on remote on same host
Before the config was read each time onLocal was called, and entirely
redundantly since it's read for same-host remotes on startup.

Also a minor bug fix: When rsyncing to a same-host remote, use the
rsync-options from the repository that the user ran git-annex in, not those of
the receiving repository.
2011-10-27 14:55:06 -04:00
Joey Hess
373cad993d Sped up some operations on remotes that are on the same host.
Specifically, disabled trying to update the git-annex branch on the remote,
since that data is never used by operations that act on such remotes.

Also, when copying content to such a remote, skip committing the presence
information changes to its git-annex branch. Leaving it in the journal there
is ok: Any command run on the remote that needs the info will flush the
journal.

This may partially solve this bug:
http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/
Although I still see unreaped git processes piling up when doing a copy --to.
2011-10-27 14:55:06 -04:00
Joey Hess
23f2a12816 broke up Utility 2011-10-16 00:50:12 -04:00
Joey Hess
91366c896d clean Annex stuff out of Utility/ 2011-10-16 00:04:26 -04:00
Joey Hess
1480d71adb fix 2011-10-15 18:45:32 -04:00
Joey Hess
ee9af605bc break out non-log stuff to separate module 2011-10-15 17:47:03 -04:00
Joey Hess
ec169f84b1 migrate: Copy url logs for keys when migrating. 2011-10-15 16:36:56 -04:00
Joey Hess
b4015064e1 break web log handling into a separate module 2011-10-15 16:25:51 -04:00
Joey Hess
1a29b5b52e reorganize log modules
no code changes
2011-10-15 16:21:08 -04:00
Joey Hess
9fa9214106 A remote can have a annexUrl configured, that is used by git-annex instead of its usual url. (Similar to pushUrl.) 2011-10-14 18:18:28 -04:00
Joey Hess
b505ba83e8 minor syntax changes 2011-10-11 14:43:45 -04:00
Joey Hess
6a6ea06cee rename 2011-10-05 16:02:51 -04:00
Joey Hess
cfe21e85e7 rename 2011-10-04 00:59:08 -04:00
Joey Hess
8ef2095fa0 factor out common imports
no code changes
2011-10-03 23:29:48 -04:00
Joey Hess
4bf1a5ef59 refactor 2011-09-23 18:13:24 -04:00
Joey Hess
9f6b7935dd go go gadget hlint 2011-09-20 23:24:48 -04:00
Joey Hess
dd463a3100 rework annex-ignore handling
Only one place need to filter the list of remotes for ignored remotes:
keyPossibilities. Make the full list available to everything else.

This allows getting rid of the special case handing for --from and --to
to make ignored remotes not be ignored with those options.
2011-09-18 20:11:39 -04:00
Joey Hess
999d5df90b factor out firstM and anyM
Control.Monad.Loops has these, but has no Debian package yet.
2011-08-28 15:46:49 -04:00
Joey Hess
f82da1d9dc show a message if asked to get something from the web that is not there 2011-08-27 07:08:15 -04:00
Joey Hess
203148363f split groups of related functions out of Utility 2011-08-22 16:14:12 -04:00
Joey Hess
737b5d14c9 moved files around 2011-08-20 16:11:42 -04:00
Joey Hess
ec746c511f note about why curl -# is used
I'd rather use wget really, but as git-annex uses libcurl elsewhere, it
seems best to stick with curl. And making this configurable seems
overboard.
2011-08-20 12:52:29 -04:00
Joey Hess
b7a4ff1c31 optimise initialized check
Avoid running external command if annex.version is set.
2011-08-17 18:38:26 -04:00
Joey Hess
32f27cc3e8 when reading configs of local repos, first initializeSafe
This auto-generates a uuid if the local repo does not already have one.
2011-08-17 14:44:31 -04:00
Joey Hess
f5449aae16 error out when dropping from http repo 2011-08-16 21:20:14 -04:00
Joey Hess
5ccb926b51 support for getting files from http git remotes 2011-08-16 21:04:23 -04:00
Joey Hess
a55faff08f reorg Remote/* 2011-08-16 20:49:54 -04:00
Joey Hess
4545a0e78c split out generic url stuff into a helper library from Remote.Web 2011-08-16 20:49:44 -04:00
Joey Hess
07f2e7ee72 support reading git config from http remotes
The config file is downloaded to a temp file, and git-config run on that
to parse it.
2011-08-16 20:48:11 -04:00
Joey Hess
dd8e649f49 fix file name for web remote log files
The key name was not being sufficiently escaped, although it didn't break
anything due to luck. Switch to properly escaped key names for the log
filename, with a fallback to the buggy old name.
2011-08-06 14:45:58 -04:00
Joey Hess
45bbf210a1 Fix shell escaping in rsync special remote. 2011-07-29 15:28:21 +02:00
Joey Hess
00153eed48 unify elipsis handling
And add a simple dots-based progress display, currently only used in v2
upgrade.
2011-07-19 14:07:23 -04:00
Joey Hess
6c396a256c finished hlint pass 2011-07-15 12:47:14 -04:00
Joey Hess
cab4ac247c rename 2011-07-05 20:36:43 -04:00
Joey Hess
c98b5cf36e rename 2011-07-05 20:24:10 -04:00
Joey Hess
9f1577f746 remove unused backend machinery
The only remaining vestiage of backends is different types of keys. These
are still called "backends", mostly to avoid needing to change user interface
and configuration. But everything to do with storing keys in different
backends was gone; instead different types of remotes are used.

In the refactoring, lots of code was moved out of odd corners like
Backend.File, to closer to where it's used, like Command.Drop and
Command.Fsck. Quite a lot of dead code was removed. Several data structures
became simpler, which may result in better runtime efficiency. There should
be no user-visible changes.
2011-07-05 19:57:46 -04:00
Joey Hess
5c69ac14eb Drop the dependency on the haskell curl bindings, use regular haskell HTTP. 2011-07-04 19:33:11 -04:00
Joey Hess
e6b9539a65 make curl follow redirs 2011-07-01 21:52:27 -04:00
Joey Hess
ace9de37e8 download urls via tmp file, and support resuming 2011-07-01 18:59:40 -04:00
Joey Hess
79016c197c add hashing to web log files 2011-07-01 17:23:01 -04:00
Joey Hess
6bddebdb79 add the addurl command 2011-07-01 17:15:46 -04:00
Joey Hess
cdbcd6f495 add web special remote
Generalized LocationLog to PresenceLog, and use a presence log to record
urls for the web special remote.
2011-07-01 15:30:42 -04:00
Joey Hess
f6063a094e renamed GitRepo to Git
It was always imported qualified as Git anyway
2011-06-30 13:21:39 -04:00
Joey Hess
c4e6730042 commit git-annex branch when copying to a remote (locally)
Otherwise, the location log changes are only staged in its index,
and this can confuse matters if pulling or cloning from the remote.

The test suite was failing because this wasn't done.
2011-06-22 21:21:09 -04:00
Joey Hess
d0482d4154 bigfix: stat parent dirs 2011-06-13 21:46:28 -04:00
Joey Hess
30d7cce7ec rsync is now used when copying files from repos on other filesystems
cp is still used when copying file from repos on the same filesystem, since
--reflink=auto can make it significantly faster on filesystems such as
btrfs.

Directory special remotes still use cp, not rsync. It's not clear what
tmp file should be used when rsyncing to such a remote.
2011-06-13 20:33:52 -04:00
Joey Hess
19428ea2f4 fix building with S3 stub 2011-06-10 12:11:34 -04:00
Joey Hess
703c437bd9 rename modules for data types into Types/ directory 2011-06-01 21:56:04 -04:00
Joey Hess
93a4f3d4e6 Add --debug option. Closes: #627499
This takes advantage of the debug logging done by missingh, and I added
my own debug messages for executeFile calls. There are still some other
low-level ways git-annex runs stuff that are not shown by debugging,
but this gets most of it easily.
2011-05-21 11:52:13 -04:00
Joey Hess
21d9c84e72 more standard names for whenM and unlessM operators
These are defined in ifelse, but it's not currently available and I don't
want to pull in a library for 6 lines of code anyhow.

Also, ifelse sets the fixity to 1, which does not allow >>? error $ ...
2011-05-17 11:45:24 -04:00
Joey Hess
c91929f693 add whenM and unlessM
Just more golfing.. I am pretty sure something in a library somewhere can
do this, but I have been unable to find it.
2011-05-17 03:13:11 -04:00
Joey Hess
760cde28b6 more pointless monadic golfing 2011-05-16 14:49:28 -04:00
Joey Hess
0a7bcd47ae IA: do not create bucket at initremote time
This way, the metadata sent when uploading a file is applied to the bucket
then.
2011-05-16 13:10:26 -04:00
Joey Hess
1d2984441c add a few tweaks to make it easy to use the Internet Archive's variant of S3
In particular, munge key filenames to comply with the IA's filename limits,
disable encryption, support their nonstandard way of creating buckets, and
allow x-amz-* headers to be specified in initremote to set item metadata.

Still TODO: initremote does not handle multiword metadata headers right.
2011-05-16 11:20:35 -04:00
Joey Hess
79c74bf27d refactor 2011-05-16 09:42:54 -04:00
Joey Hess
3e15a8a791 Maybe reduction pass 2 2011-05-15 12:25:58 -04:00
Joey Hess
cad0e1c8b7 simplified a bunch of Maybe handling 2011-05-15 03:38:08 -04:00
Joey Hess
3c319cd844 avoid always decrypting cipher
Last change moved cipher decryption to remote setup time.
Fixed this with a bit of a hack.
2011-05-01 15:13:54 -04:00
Joey Hess
2ddade8132 factor out base64 code 2011-05-01 14:27:40 -04:00
Joey Hess
1f84c7a964 S3: When encryption is enabled, the Amazon S3 login credentials are stored, encrypted, in .git-annex/remotes.log, so environment variables need not be set after the remote is initialized. 2011-05-01 14:05:10 -04:00
Joey Hess
cf501d3b9b set ANNEX_HASH_* always 2011-04-29 14:04:20 -04:00
Joey Hess
3ab3f41aea hook special remote implemented, and tested 2011-04-28 17:21:45 -04:00
Joey Hess
d7b330b33b Fix hasKeyCheap setting for bup and rsync special remotes. 2011-04-28 14:39:51 -04:00
Joey Hess
39966ba4ee filter out --delete rsync option
rsync does not have a --no-delete, so do it this way instead
2011-04-27 20:31:56 -04:00
Joey Hess
e68f128a9b rsync special remote
Fully tested and working, including resuming and encryption. (Though not
resuming when sending *with* encryption; gpg doesn't produce identical
output each time.)

Uses same layout as the directory special remote and the .git/annex/objects/
directory.
2011-04-27 20:23:09 -04:00
Joey Hess
45bdb2d413 ensure tmp dir exists 2011-04-21 10:53:29 -04:00
Joey Hess
6fcd3e1ef7 fix S3 upload buffering problem
Provide file size to new version of hS3.
2011-04-21 10:33:17 -04:00
Joey Hess
4837176897 update on memory leak
Finished applying to S3 the change that fixed the memory leak in bup, but
it didn't seem to help S3.. with encryption it still grows to 2x file size.
2011-04-19 16:31:35 -04:00
Joey Hess
5985acdfad bup: Avoid memory leak when transferring encrypted data.
This was a most surprising leak. It occurred in the process that is forked
off to feed data to gpg. That process was passed a lazy ByteString of
input, and ghc seemed to not GC the ByteString as it was lazily read
and consumed, so memory slowly leaked as the file was read and passed
through gpg to bup.

To fix it, I simply changed the feeder to take an IO action that returns
the lazy bytestring, and fed the result directly to hPut.

AFAICS, this should change nothing WRT buffering. But somehow it makes
ghc's GC do the right thing. Probably I triggered some weakness in ghc's
GC (version 6.12.1).

(Note that S3 still has this leak, and others too. Fixing it will involve
another dance with the type system.)

Update: One theory I have is that this has something to do with
the forking of the feeder process. Perhaps, when the ByteString
is produced before the fork, ghc decides it need to hold a pointer
to the start of it, for some reason -- maybe it doesn't realize that
it is only used in the forked process.
2011-04-19 15:27:03 -04:00
Joey Hess
b1274b6378 refactor 2011-04-19 14:50:09 -04:00
Joey Hess
a441e08da1 Fix stalls in S3 when transferring encrypted data.
Stalls were caused by code that did approximatly:

content' <- liftIO $ withEncryptedContent cipher content return
store content'

The return evaluated without actually reading content from S3,
and so the cleanup code began waiting on gpg to exit before
gpg could send all its data.

Fixing it involved moving the `store` type action into the IO monad:

liftIO $ withEncryptedContent cipher content store

Which was a bit of a pain to do, thank you type system, but
avoids the problem as now the whole content is consumed, and
stored, before cleanup.
2011-04-19 14:45:19 -04:00