Commit graph

365 commits

Author SHA1 Message Date
Joey Hess
46cc39f1a4 repair: Optimise unpacking of pack files, and avoid repeated error messages about corrupt pack files. 2014-02-24 19:36:58 -04:00
Joey Hess
4e0be2792b remove Read instance for Ref
Removed instance, got it all to build using fromRef. (With a few things
that really need to show something using a ref for debugging stubbed out.)

Then added back Read instance, and made Logs.View use it for serialization.
This changes the view log format.
2014-02-19 01:19:57 -04:00
Joey Hess
67fd06af76 add git annex view command
(And a vpop command, which is still a bit buggy.)

Still need to do vadd and vrm, though this also adds their documentation.

Currently not very happy with the view log data serialization. I had to
lose the TDFA regexps temporarily, so I can have Read/Show instances of
View. I expect the view log format will change in some incompatable way
later, probably adding last known refs for the parent branch to View
or something like that.

Anyway, it basically works, although it's a bit slow looking up the
metadata. The actual git branch construction is about as fast as it can be
using the current git plumbing.

This commit was sponsored by Peter Hogg.
2014-02-18 18:22:20 -04:00
Joey Hess
9633c67842 filter branches (incomplete)
Promosing work toward metadata driven filter branches. A few methods
to construct them are stubbed out; all the data types and pure code
seems good.

This commit was sponsored by Walter Somerville.
2014-02-16 17:39:54 -04:00
Joey Hess
61ecf76644 unbreak the build 2014-02-12 14:34:01 -04:00
Joey Hess
029a1c431a
remove windows --git-dir unix style path hack
This is no longer necessary, at least with msysgit 1.8.5.2.msysgit.0.
Its root cause may have been fixed by other recent git path fixes.
It was causing the webapp to fail to make repos on other drives.
2014-02-11 16:12:22 -04:00
Joey Hess
c95d0cf7a8 Windows: Fix handling of absolute unix-style git repository paths.
Note that on Windows a remote with a path like /home/foo/bar
is interpreted by git as being some screwy relative path (relative to what
exactly seems ill-defined -- it seemed relative to C:\Program Files\Git\ in
my tests!) So no attempt has been made to handle such a path sanely, just not
to crash when encountering it.

Note that "C:\\foo" </> "/home/foo/bar" yields /home/foo/bar even though
that is not absolute! I don't know what to make of all this,
except that I will be very happy when this crock of **** vanishes from
the face of the earth.
2014-02-08 15:39:04 -04:00
Joey Hess
92edee0b04 remove workaround
This was needed when absNormPath was not being used on Windows, since path
normalization includes removing ./
2014-02-08 14:47:57 -04:00
Joey Hess
a44e01c29c --in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}" 2014-02-06 12:43:56 -04:00
Joey Hess
ed7c61914c assistant: Run the periodic git gc in batch mode. 2014-01-22 17:11:41 -04:00
Joey Hess
78ead70ea4 repair: Check git version at run time. 2014-01-21 13:22:48 -04:00
Joey Hess
4e19e87921 repair: Fix bug in packed refs file exploding code that caused a .gitrefs directory to be created instead of .git/refs 2014-01-15 16:34:18 -04:00
Joey Hess
5e6e89f423 repair: Support old git versions from before git fsck --no-dangling was implemented. 2014-01-13 18:10:45 -04:00
Joey Hess
858eb26303 Avoid looping if long-running git cat-file or git hash-object crashes and keeps crashing when restarted. 2014-01-01 21:42:25 -04:00
Joey Hess
49aad120b9 Windows: Fix bug in direct mode merge code that could cause files in subdirectories to go missing. 2013-12-31 16:39:11 -04:00
Richard Hartmann
974fe009bf Another round of s/amoung/among/ 2013-12-19 12:30:53 -04:00
Joey Hess
c99d6a8151 assistant: Fix OSX-specific bug that caused the startup scan to try to follow symlinks to other directories, and add their contents to the annex. 2013-12-18 15:05:29 -04:00
Joey Hess
625076f9a5 status: Ignore new files that are gitignored. 2013-12-12 14:01:24 -04:00
Joey Hess
e6c4f550d8 repair: Remove damaged git-annex sync branches. 2013-12-10 16:17:49 -04:00
Joey Hess
b37323d857 update 2013-12-10 15:48:24 -04:00
Joey Hess
c0ce3269e9 accidentially committed wrong version of file 2013-12-10 15:45:22 -04:00
Joey Hess
ce045a51af Improve repair of git-annex index file.
Fixes a test case I received where a corrupted repo was repaired, but the
git-annex branch was not. The root of the problem was that the
MissingObject returned by the repair code was not necessarily a complete
set of all objects that might have been deleted during the repair.

So, stop trying to return that at all, and instead make the index file
checking code explicitly verify that each object the index uses is present.
2013-12-10 15:40:01 -04:00
Joey Hess
c717905d15 work around msysgit very strange behavior on ./ or .\ at start of path
Seems that verify_path() rejects such a path on Windows, but I cannot see
why. Git bug?
2013-12-04 23:49:18 -04:00
Joey Hess
4882a611e5 assistant: Batch jobs are now run with ionice and nocache, when those commands are available. 2013-12-01 14:53:15 -04:00
Joey Hess
03932212ec Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.

Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)
2013-12-01 13:59:45 -04:00
Joey Hess
6edac746f0 merge improved fsck types from git-repair and some associated changes 2013-11-30 14:29:11 -04:00
Joey Hess
0980f3dae6 Fix bug that broke switching between local repositories in the webapp when they use the new guarded direct mode.
git treats eg ~/annex as a bare git repository located in ~/.annex/.git
if ~/annex/.git/config has core.bare=true.
2013-11-22 23:27:15 -04:00
Joey Hess
d490bbb891 make runRepairOf run preRepair
This may be a little late, since a fsck has already been done,
but it can't hurt.
2013-11-21 20:13:55 -04:00
Joey Hess
7d682dd844 merge from git-repair 2013-11-21 20:07:44 -04:00
Joey Hess
ff2b0a9df6 merge from git-repair 2013-11-21 00:43:30 -04:00
Joey Hess
8217e97d88 merge from git-repair 2013-11-20 19:34:30 -04:00
Joey Hess
e80d935b53 merge from git-repair 2013-11-20 19:16:42 -04:00
Joey Hess
8a466247ed merge from git-repair 2013-11-20 18:45:22 -04:00
Joey Hess
7dbb702edd merge from git-repair 2013-11-20 18:31:00 -04:00
Joey Hess
ef34316c45 fix repair failure that occurred when index was corrupted, and other objects too
In this case, the index problem prevented fsck from finding the other
problems.
2013-11-19 17:16:33 -04:00
Joey Hess
b1ed98636b merge with git-repair 2013-11-19 17:08:57 -04:00
Joey Hess
b245aa40df moving git-repair to its own package 2013-11-18 13:24:55 -04:00
Joey Hess
eab4470440 better handling of missing index file 2013-11-13 14:39:26 -04:00
Joey Hess
13108b7196 assistant: Notice on startup when the index file is corrupt, and auto-repair. 2013-11-13 14:27:17 -04:00
Joey Hess
5e7e0c7dc0 repair: Handle case where index file is corrupt, but all objects are ok. 2013-11-13 13:41:02 -04:00
Joey Hess
958312885f webapp: Improve UI around remote that have no annex.uuid set, either because setup of them is incomplete, or because the remote git repository is not a git-annex repository.
Complicated by such repositories potentially being repos that should have
an annex.uuid, but it failed to be gotten, perhaps due to the past ssh repo
setup bugs. This is handled now by an Upgrade Repository button.
2013-11-07 18:02:00 -04:00
Joey Hess
59ecc804cd add new status command
This works for both direct and indirect mode.

It may need some performance tuning.

Note that unlike git status, it only shows the status of the work tree, not
the status of the index. So only one status letter, not two .. and since
files that have been added and not yet committed do not differ between the
work tree and the index, they are not shown. Might want to add display of
the index vs the last commit eventually.

This commit was sponsored by an unknown bitcoin contributor, whose
contribution as been going up lately! ;)
2013-11-07 14:07:25 -04:00
Joey Hess
3802f2f270 work around lack of receive.denyCurrentBranch in direct mode
Now that direct mode sets core.bare=true, git's normal prohibition about
pushing into the currently checked out branch doesn't work.

A simple fix for this would be an update hook which blocks the pushes..
but git hooks must be executable, and git-annex needs to be usable on eg,
FAT, which lacks x bits.

Instead, enabling direct mode switches the branch (eg master) to a special
purpose branch (eg annex/direct/master). This branch is not pushed when
syncing; instead any changes that git annex sync commits get written to
master, and it's pushed (along with synced/master) to the remote.

Note that initialization has been changed to always call setDirect,
even if it's just setDirect False for indirect mode. This is needed because
if the user has just cloned a direct mode repo, that nothing has synced
with before, it may have no master branch, and only a annex/direct/master.
Resulting in that branch being checked out locally too. Calling setDirect False
for indirect mode moves back out of this branch, to a new master branch,
and ensures that a manual "git push" doesn't push changes directly to
the annex/direct/master of the remote. (It's possible that the user
makes a commit w/o using git-annex and pushes it, but nothing I can do
about that really.)

This commit was sponsored by Jonathan Harrington.
2013-11-05 21:08:31 -04:00
Joey Hess
cf34e59c8c factor out update 2013-11-05 18:20:52 -04:00
Joey Hess
4510819215 v5 for direct mode, with automatic upgrade
This includes storing the current state of the HEAD ref, which git annex
sync is going to need, but does not make sync use it.
2013-11-05 17:05:03 -04:00
Joey Hess
04768e44b2 automatically set and unset core.bare when switching to/from direct mode 2013-11-05 15:41:24 -04:00
Joey Hess
0edd9ec03a refactored hook setup 2013-11-05 15:29:56 -04:00
Joey Hess
c2862d9585 pass -c option on to all git commands run
The -c option now not only modifies the git configuration seen by
git-annex, but it is passed along to every git command git-annex runs.

This was easy to plumb through because gitCommandLine is already used to
construct every git command line, to add --git-dir and --work-tree
2013-11-05 13:38:37 -04:00
Joey Hess
58db042033 map: Work when there are gcrypt remotes. 2013-11-04 14:14:44 -04:00
Joey Hess
7ed8e87a34 assistant: Support repairing git remotes that are locally accessible
(eg, on removable drives)

gcrypt remotes are not yet handled.

This commit was sponsored by Sören Brunk.
2013-10-27 15:38:59 -04:00
Joey Hess
0036139b33 wire git repair into webapp 2013-10-23 14:43:58 -04:00
Joey Hess
1ab2ad86c7 minor 2013-10-23 13:19:37 -04:00
Joey Hess
435ea52f3c repair command: add handling of git-annex branch and index 2013-10-23 13:00:45 -04:00
Joey Hess
d5eb85acf4 add repair command 2013-10-23 12:21:59 -04:00
Joey Hess
d345e5b52f add git fsck to cronner, and UI for repository repair (not yet wired up) 2013-10-22 16:02:52 -04:00
Joey Hess
44bb9a808f clean warnings 2013-10-22 14:52:17 -04:00
Joey Hess
ff3f654cbe make git fsck batch-capable 2013-10-22 14:49:41 -04:00
Joey Hess
3e61749d08 index file recovery 2013-10-22 12:58:04 -04:00
Joey Hess
2fb08acda5 add reflog 2013-10-21 16:41:46 -04:00
Joey Hess
18487c779f corrupt branch resetting (but not yet reflog walking) 2013-10-21 16:20:54 -04:00
Joey Hess
fcd91be6f0 implemented removal of corrupt tracking branches
Oh, git, you made this so hard. Not determining if a branch pointed to some
corrupt object, that was easy, but dealing with corrupt branches using git
plumbing is a PITA.
2013-10-21 15:28:06 -04:00
Joey Hess
6d8250c255 avoid redundant fsck when no changes are made 2013-10-20 19:42:17 -04:00
Joey Hess
4f871f89ba git-recover-repository 1/2 done 2013-10-20 17:50:51 -04:00
Joey Hess
f482de1b76 remove workaround for bug in git 1.8.4r0 2013-10-20 15:23:06 -04:00
Joey Hess
edbf177628 fix lsTreeFiles to use --full-tree
This makes it show the full tree, not just the current directory,
and enables --full-name, which yields TopFilePaths.
2013-10-18 15:50:26 -04:00
Joey Hess
c979e0ea62 fix 2013-10-17 19:51:16 -04:00
Joey Hess
c116383b5d fix 2013-10-17 19:49:44 -04:00
Joey Hess
81c4259a0d fix 2013-10-17 19:41:00 -04:00
Joey Hess
16243b9972 missing import 2013-10-17 19:39:22 -04:00
Joey Hess
e93206e294 Windows: Deal with strange msysgit 1.8.4 behavior of not understanding DOS formatted paths for --git-dir and --work-tree. 2013-10-17 19:35:57 -04:00
Joey Hess
aff125ddab try working around windows xargs problem 2013-10-17 15:56:56 -04:00
Joey Hess
d785432f78 use TopFilePath for DiffTree and LsTree 2013-10-17 14:51:19 -04:00
Joey Hess
82ff37520f fix off-by-one 2013-10-16 12:14:14 -04:00
Joey Hess
bac078742d Deal with git check-attr -z output format change in git 1.8.5.
I have not actually tested with 1.8.5, which is not yet relesaed, but
git.git commit f7cd8c50b9ab83e084e8f52653ecc8d90665eef2 changes -z
to also apply to output, without regards to back-compat. (But with pretty
good reasons.)

New code should work with both versions, by fingerprinting for NULs and
newlines.
2013-10-15 16:05:27 -04:00
Joey Hess
f1295b5141 fix windows build 2013-10-02 20:26:00 -04:00
Joey Hess
1536ebfe47 Disable receive.denyNonFastForwards when setting up a gcrypt special remote
gcrypt needs to be able to fast-forward the master branch. If a git
repository is set up with git init --shared --bare, it gets that set, and
pushing to it will then fail, even when it's up-to-date.
2013-10-01 15:23:48 -04:00
Joey Hess
57d49a6d04 remove *>=> and >=*> ; use <$$> instead
I forgot I had <$$> hidden away in Utility.Applicative.
It allows doing the same kind of currying as does >=*>
and I found using it made the code more readable for me.

(*>=> was not used)
2013-09-27 19:58:48 -04:00
Joey Hess
e864c8d033 blind enabling gcrypt repos on rsync.net
This pulls off quite a nice trick: When given a path on rsync.net, it
determines if it is an encrypted git repository that the user has
the key to decrypt, and merges with it. This is works even when
the local repository had no idea that the gcrypt remote exists!

(As previously done with local drives.)

This commit sponsored by Pedro Côrte-Real
2013-09-27 16:21:56 -04:00
Joey Hess
1550759220 enabling rsync.net gcrypt repos
Still need to detect when the user is trying to create a repo
that already exists, and jump to the enabling code.
2013-09-26 23:47:30 -04:00
Joey Hess
735ed3b822 prep for enabling remotre gcrypt repos in webapp 2013-09-26 17:26:13 -04:00
Joey Hess
3192b059b5 add back lost check that git-annex-shell supports gcrypt 2013-09-24 17:51:12 -04:00
Joey Hess
7390f08ef9 Use cryptohash rather than SHA for hashing.
This is a massive win on OSX, which doesn't have a sha256sum normally.

Only use external hash commands when the file is > 1 mb,
since cryptohash is quite close to them in speed.

SHA is still used to calculate HMACs. I don't quite understand
cryptohash's API for those.

Used the following benchmark to arrive at the 1 mb number.

1 mb file:

benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
  2 (2.0%) high mild
  1 (1.0%) high severe

2 mb file:

benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)

import Crypto.Hash
import Data.ByteString.Lazy as L
import Criterion.Main
import Common

testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk

main = defaultMain
        [ bgroup "sha256"
                [ bench "internal" $ whnfIO internal
                , bench "external" $ whnfIO external
                ]
        ]

sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy

internal :: IO String
internal = show . sha256 <$> L.readFile testfile

external :: IO String
external = do
	s <- readProcess "sha256sum" [testfile]
        return $ fst $ separate (== ' ') s
2013-09-22 20:06:02 -04:00
Joey Hess
006cf7976f more completely solve catKey memory leak
Done using a mode witness, which ensures it's fixed everywhere.

Fixing catFileKey was a bear, because git cat-file does not provide a
nice way to query for the mode of a file and there is no other efficient
way to do it. Oh, for libgit2..

Note that I am looking at tree objects from HEAD, rather than the index.
Because I cat-file cannot show a tree object for the index.
So this fix is technically incomplete. The only cases where it matters
are:

1. A new large file has been directly staged in git, but not committed.
2. A file that was committed to HEAD as a symlink has been staged
   directly in the index.

This could be fixed a lot better using libgit2.
2013-09-19 16:41:21 -04:00
Joey Hess
f26c996dc6 interface to parse git tree objects 2013-09-19 15:58:35 -04:00
Joey Hess
eb42bde19a sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory. 2013-09-19 14:48:42 -04:00
Joey Hess
e8e209f4e5 better probing for gcrypt repositories using new --check option
Now can tell if a repo uses gcrypt or not, and whether it's decryptable
with the current gpg keys.

This closes the hole that undecryptable gcrypt repos could have before been
combined into the repo in encrypted mode.
2013-09-19 12:53:24 -04:00
Joey Hess
8062f6337f webapp: support adding existing gcrypt special remotes from removable drives
When adding a removable drive, it's now detected if the drive contains
a gcrypt special remote, and that's all handled nicely. This includes
fetching the git-annex branch from the gcrypt repo in order to find
out how to set up the special remote.

Note that gcrypt repos that are not git-annex special remotes are not
supported. It will attempt to detect such a gcrypt repo and refuse
to use it. (But this is hard to do any may fail; see
https://github.com/blake2-ppc/git-remote-gcrypt/issues/6)

The problem with supporting regular gcrypt repos is that we don't know
what the gcrypt.participants setting is intended to be for the repo.
So even if we can decrypt it, if we push changes to it they might not be
visible to other participants.

Anyway, encrypted sneakernet (or mailnet) is now fully possible with the
git-annex assistant! Assuming that the gpg key distribution is handled
somehow, which the assistant doesn't yet help with.

This commit was sponsored by Navishkar Rao.
2013-09-18 15:55:31 -04:00
Joey Hess
6c35038643 gcrypt: Ensure that signing key is set to one of the participants keys.
Otherwise gcrypt will fail to pull, since it requires this to be the case.

This needs a patched gcrypt, which is in my forked version.
2013-09-17 16:06:29 -04:00
Joey Hess
ab9dd6d8a0 sync: Fix bug that caused direct mode mappings to not be updated when merging files into the tree on Windows. 2013-09-13 13:49:28 -04:00
Joey Hess
7c1a9cdeb9 partially complete gcrypt remote (local send done; rest not)
This is a git-remote-gcrypt encrypted special remote. Only sending files
in to the remote works, and only for local repositories.

Most of the work so far has involved making initremote work. A particular
problem is that remote setup in this case needs to generate its own uuid,
derivied from the gcrypt-id. That required some larger changes in the code
to support.

For ssh remotes, this will probably just reuse Remote.Rsync's code, so
should be easy enough. And for downloading from a web remote, I will need
to factor out the part of Remote.Git that does that.

One particular thing that will need work is supporting hot-swapping a local
gcrypt remote. I think it needs to store the gcrypt-id in the git config of the
local remote, so that it can check it every time, and compare with the
cached annex-uuid for the remote. If there is a mismatch, it can change
both the cached annex-uuid and the gcrypt-id. That should work, and I laid
some groundwork for it by already reading the remote's config when it's
local. (Also needed for other reasons.)

This commit was sponsored by Daniel Callahan.
2013-09-07 18:38:00 -04:00
Joey Hess
dad34e0ea8 add getParticipantList
Note that it needs to look at global git config, since git-remote-gcrypt
will see any setting there as a fallback.
2013-09-05 16:34:13 -04:00
Joey Hess
a48a4e2f8a automatically derive an annex-uuid from a gcrypt-uuids 2013-09-05 16:02:39 -04:00
Joey Hess
6cdac3a003 sync, assistant: Force push of the git-annex branch.
Necessary to ensure it gets pushed to remotes after being rewritten by forget.
See inline rationalles for why I think this is safe!
2013-08-29 14:27:53 -04:00
guilhem
f754779c02 Unused: bugfix
Detect staged files that are not in the working tree.
2013-08-26 13:50:09 -04:00
guilhem
f15fda60ed Speed up the 'unused' command.
Instead of populating the second-level Bloom filter with every key
referenced in every Git reference, consider only those which differ
from what's referenced in the index.

Incidentaly, unlike with its old behavior, staged
modifications/deletion/... will now be detected by 'unused'.

Credits to joeyh for the algorithm. :-)
2013-08-25 21:02:13 -04:00
guilhem
b4a32c7506 Unescape characters in 'file://...' URIs.
That allows, in Git remotes, such URIs to contain spaces or UTF-8
characters. Closes http://git-annex.branchable.com/bugs/Unable_to_use_remotes_with_space_in_the_path/ .
2013-08-22 11:33:16 -04:00
Joey Hess
6fd2935a5a unused: Pay attention to symlinks that are not yet staged in the index. 2013-08-22 10:20:03 -04:00
Joey Hess
a3224ce35b avoid more build warnings on Windows 2013-08-04 14:05:36 -04:00
Joey Hess
b191d5c595 gitignore support for the assistant and watcher
Requires git 1.8.4 or newer. When it's installed, a background
git check-ignore process is run, and used to efficiently check ignores
whenever a new file is added.

Thanks to Adam Spiers, for getting the necessary support into git for this.

A complication is what to do about files that are gitignored but have
been checked into git anyway. git commands assume the ignore has been
overridden in this case, and not need any more overriding to commit a
changed version.

However, for the assistant to do the same, it would have to run git ls-files
to check if the ignored file is in git. This is somewhat expensive. Or it
could use the running git-cat-file process to query the file that way,
but that requires transferring the whole file content over a pipe, so it
can be quite expensive too, for files that are not git-annex
symlinks.

Now imagine if the user knows that a file or directory tree will be getting
frequent changes, and doesn't want the assistant to sync it, so gitignores
it. The assistant could overload the system with repeated ls-files checks!

So, I've decided that the assistant will not automatically commit changes
to files that are gitignored. This is a tradeoff. Hopefully it won't be a
problem to adjust .gitignore settings to not ignore files you want the
assistant to autocommit, or to manually git annex add files that are listed
in .gitignore.

(This could be revisited if git-annex gets access to an interface to check
the content of the index w/o forking a git command. This could be libgit2,
or perhaps a separate git cat-file --batch-check process, so it wouldn't
need to ship over the whole file content.)

This commit was sponsored by Francois Marier. Thanks!
2013-08-02 20:37:03 -04:00
Joey Hess
672cfc3923 better git version checking 2013-08-02 18:32:26 -04:00