Commit graph

54 commits

Author SHA1 Message Date
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
adc5ca70a8 pre-commit: Block partial commit of unlocked annexed file, since that left a typechange staged in index
I had hoped that the git devs could change git's handling of partial
commits to not use a false index file, but seems not.

So, this relies on some git internals to detect that case. The test suite
has a test case added to catch it if changes to git break it.

This commit was sponsored by Paul Tagliamonte.
2014-11-10 15:36:24 -04:00
Joey Hess
59f88558d5 doh't use "def" for command definitions, it conflicts with Data.Default.def 2014-10-14 14:20:10 -04:00
Joey Hess
b61c6bc2ff hlint 2014-10-09 15:46:05 -04:00
Joey Hess
d279180266 reorganize and refactor lock code
Added a convenience Utility.LockFile that is not a windows/posix
portability shim, but still manages to cut down on the boilerplate around
locking.

This commit was sponsored by Johan Herland.
2014-08-20 16:45:58 -04:00
Joey Hess
092041fab0 Ensure that all lock fds are close-on-exec, fixing various problems with them being inherited by child processes such as git commands.
(With the exception of daemon pid locking.)

This fixes at part of #758630. I reproduced the assistant locking eg, a
removable drive's annex journal lock file and forking a long-running
git-cat-file process that inherited that lock.

This did not affect Windows.

Considered doing a portable Utility.LockFile layer, but git-annex uses
posix locks in several special ways that have no direct Windows equivilant,
and it seems like it would mostly be a complication.

This commit was sponsored by Protonet.
2014-08-20 11:37:02 -04:00
Joey Hess
c784ef4586 unify exception handling into Utility.Exception
Removed old extensible-exceptions, only needed for very old ghc.

Made webdav use Utility.Exception, to work after some changes in DAV's
exception handling.

Removed Annex.Exception. Mostly this was trivial, but note that
tryAnnex is replaced with tryNonAsync and catchAnnex replaced with
catchNonAsync. In theory that could be a behavior change, since the former
caught all exceptions, and the latter don't catch async exceptions.

However, in practice, nothing in the Annex monad uses async exceptions.
Grepping for throwTo and killThread only find stuff in the assistant,
which does not seem related.

Command.Add.undo is changed to accept a SomeException, and things
that use it for rollback now catch non-async exceptions, rather than
only IOExceptions.
2014-08-07 22:03:29 -04:00
Joey Hess
7dc6804154 unannex, uninit: Avoid committing after every file is unannexed, for massive speedup.
pre-commit hook lock added, so unannex can prevent the hook from running
in a confusing state.

This commit was sponsored by Fredrik Hammar
2014-03-21 14:41:05 -04:00
Joey Hess
d0fce426c4 pre-commit-annex hook script to automatically extract metadata from lots of types of files
Using the extract(1) program to do the heavy lifting.

Decided to make git-annex run pre-commit-annex when committing. Since
git-annex pre-commit also runs it, it'll be run when git commit is run too,
via the pre-commit hook. This basically gives back the pre-commit hook
that git-annex took away. The implementation avoids repeatedly looking
for the hook script when the assistant is running and committing
repeatedly; only checks if the hook is available once.

To make the script simpler, made git-annex metadata -s field?=value
only set a field when it's not already got a value.

This commit was sponsored by bak.
2014-03-02 20:11:58 -04:00
Joey Hess
1435c4f149 factor out new module 2014-02-22 13:35:50 -04:00
Joey Hess
39ebfa1a2e pre-commit: Update metadata when committing changes to annexed files within a view.
So the user can now switch to a view and then move files around within it
to manage metadata. For example, moving a file into a new directory
when in the tags=* view adds a tag to it.

Implementation is fairly efficient. One diff-index, which is no more
expensive than the first stage of a git commit, followed by possibly
some cat-file --batch traffic to find the key (when deleting a file).
Very similar to what's done in direct mode when committing. And like
direct mode when updating the WC after a merge, it has to buffer the
diff-tree values in order to make 2 passes over them.

When not in a view, pre-commit now does one extra git symbolic-ref,
which is tiny overhead.

This commit was sponsored by Andrew Eskridge.
2014-02-19 14:17:58 -04:00
Joey Hess
cce69eee4d avoid using function named that conflicts with name used in newer version of process library 2014-01-29 13:44:53 -04:00
Joey Hess
34c8af74ba fix inversion of control in CommandSeek (no behavior changes)
I've been disliking how the command seek actions were written for some
time, with their inversion of control and ugly workarounds.

The last straw to fix it was sync --content, which didn't fit the
Annex [CommandStart] interface well at all. I have not yet made it take
advantage of the changed interface though.

The crucial change, and probably why I didn't do it this way from the
beginning, is to make each CommandStart action be run with exceptions
caught, and if it fails, increment a failure counter in annex state.
So I finally remove the very first code I wrote for git-annex, which
was before I had exception handling in the Annex monad, and so ran outside
that monad, passing state explicitly as it ran each CommandStart action.

This was a real slog from 1 to 5 am.

Test suite passes.

Memory usage is lower than before, sometimes by a couple of megabytes, and
remains constant, even when running in a large repo, and even when
repeatedly failing and incrementing the error counter. So no accidental
laziness space leaks.

Wall clock speed is identical, even in large repos.

This commit was sponsored by an anonymous bitcoiner.
2014-01-20 04:57:36 -04:00
Joey Hess
03932212ec Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.

Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)
2013-12-01 13:59:45 -04:00
Joey Hess
7206dcb376 update for DiffTree change
This actually fixes a bug; if pre-commit was run in a subdir, it would pass
relative files when updating the associated file maps, and so the maps
wouldn't update.

I don't think this bug happened in practice, due to the way pre-commit is
called by the hook. It happened to chdir to the top of the work tree.
2013-10-17 14:52:12 -04:00
Joey Hess
b405295aee hlint
test suite still passes
2013-09-25 03:09:06 -04:00
Joey Hess
006cf7976f more completely solve catKey memory leak
Done using a mode witness, which ensures it's fixed everywhere.

Fixing catFileKey was a bear, because git cat-file does not provide a
nice way to query for the mode of a file and there is no other efficient
way to do it. Oh, for libgit2..

Note that I am looking at tree objects from HEAD, rather than the index.
Because I cat-file cannot show a tree object for the index.
So this fix is technically incomplete. The only cases where it matters
are:

1. A new large file has been directly staged in git, but not committed.
2. A file that was committed to HEAD as a symlink has been staged
   directly in the index.

This could be fixed a lot better using libgit2.
2013-09-19 16:41:21 -04:00
Joey Hess
eb42bde19a sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory. 2013-09-19 14:48:42 -04:00
guilhem
f15fda60ed Speed up the 'unused' command.
Instead of populating the second-level Bloom filter with every key
referenced in every Git reference, consider only those which differ
from what's referenced in the index.

Incidentaly, unlike with its old behavior, staged
modifications/deletion/... will now be detected by 'unused'.

Credits to joeyh for the algorithm. :-)
2013-08-25 21:02:13 -04:00
Joey Hess
cfd3b16fe1 add section metadata to all commands
Not yet used .. mindless train work.
2013-03-24 18:28:21 -04:00
Joey Hess
547d7745fb pre-commit: Update direct mode mappings.
Making the pre-commit hook look at git diff-index to find changed direct
mode files and update the mappings works pretty well.

One case where it does not work is when a file is git annex added, and then
git rmed, and then this is committed. That's a no-op commit, so the hook
probably doesn't even run, and it certianly never notices that the file
was deleted, so the mapping will still have the original filename in it.

For this and other reasons, it's important that the mappings still be
treated as possibly inconsistent.

Also, the assistant now allows the pre-commit hook to run when in direct
mode, so the mappings also get updated there.
2013-02-06 12:44:19 -04:00
Joey Hess
7272179979 avoid running pre-commit hook in direct mode
The code that handles committing unlocked files in indirect mode did
something unexpected and data lossy.
2013-01-17 14:11:01 -04:00
Joey Hess
f12202f771 optimize pre-commit in direct mode 2013-01-06 16:56:55 -04:00
Joey Hess
20fafc6a2d avoid pre-commit in direct mode
It was a no-op until my recent change that made lookupFile work in direct
mode.
2013-01-05 16:06:20 -04:00
Joey Hess
60ab3d84e1 added ifM and nuked 11 lines of code
no behavior changes
2012-03-14 17:43:34 -04:00
Joey Hess
cbaebf538a rework git check-attr interface
Now gitattributes are looked up, efficiently, in only the places that
really need them, using the same approach used for cat-file.

The old CheckAttr code seemed very fragile, in the way it streamed files
through git check-attr.
I actually found that cad8824852
was still deadlocking with ghc 7.4, at the end of adding a lot of files.
This should fix that problem, and avoid future ones.

The best part is that this removes withAttrFilesInGit and withNumCopies,
which were complicated Seek methods, as well as simplfying the types
for several other Seek methods that had a Backend tupled in.
2012-02-13 23:52:21 -04:00
Joey Hess
b327227ba5 better limiting of start actions to only run whenAnnexed
Mostly only refactoring, but this does remove one redundant stat of the
symlink by copy.
2011-11-10 23:45:14 -04:00
Joey Hess
f97c783283 clean up check selection code
This new approach allows filtering out checks from the default set that are
not appropriate for a command, rather than having to list every check
that is appropriate. It also reduces some boilerplate.

Haskell does not define Eq for functions, so I had to go a long way around
with each check having a unique id. Meh.
2011-10-29 15:19:05 -04:00
Joey Hess
b955238ec7 Fail if --from or --to is passed to commands that do not support them. 2011-10-27 18:56:54 -04:00
Joey Hess
5b74b130a3 refactored and generalized pre-command sanity checking 2011-10-27 16:31:35 -04:00
Joey Hess
5ff04bf2af tweak 2011-09-15 16:59:52 -04:00
Joey Hess
35145202d2 remove command type definitions
These were a mistake, they make the type signatures harder to read and
less flexible. The CommandSeek, CommandStart, CommandPerform, and
CommandCleanup types were a good idea, but composing them with the
parameters expected is going too far.
2011-09-15 16:50:49 -04:00
Joey Hess
9fe3c6d211 clean up params in usage display 2011-09-15 14:33:37 -04:00
Joey Hess
869cb82f49 remove unnecessary imports 2011-06-01 11:53:43 -04:00
Joey Hess
038da52bdd Somewhat sped up git commit of modifications to unlocked files.
Avoid git reset here too, so I no longer need to care that it's much more
expensive than seems wise (but I asked the git list about that anyway).

It's not necessary to reset the staged file content from the index, as
the `git add` of the the symlink will replace it anyway.

`git commit` of unlocked files is still slow, since git still has to shove
their entire content into the index, only to have it be thrown away. So it's
still better to use `git annex add`
2011-05-31 16:08:37 -04:00
Joey Hess
56bc3e95ca refactor some boilerplate 2011-05-15 02:02:46 -04:00
Joey Hess
bc51387e6d Periodically flush git command queue, to avoid boating memory usage too much.
Since the queue is flushed in between subcommand actions being run,
there should be no issues with actions that expect to queue up some stuff
and have it run after they do other stuff. So I didn't have to audit for
such assumptions.
2011-04-07 13:59:31 -04:00
Joey Hess
140a351fc5 avoid version check before running version and upgrade commands
There are two types of commands; those that access the repository and those
that don't. Sorted.
2011-03-19 18:58:49 -04:00
Joey Hess
bc5c54c987 symlink touching fun
When adding files to the annex, the symlinks pointing at the annexed
content are made to have the same mtime as the original file. While git
does not preserve that information, this allows a tool like metastore to be
used with annexed files.
2011-03-14 23:00:23 -04:00
Joey Hess
72d2684016 Rethink filename encoding handling for display. Since filename encoding may or may not match locale settings, any attempt to decode filenames will fail for some files. So instead, do all output in binary mode. 2011-03-12 15:30:17 -04:00
Joey Hess
fcdc4797a9 use ShellParam type
So, I have a type checked safe handling of filenames starting with dashes,
throughout the code.
2011-02-28 16:18:55 -04:00
Joey Hess
5a50a7cf13 update unicode FilePath handling
Based on http://hackage.haskell.org/trac/ghc/ticket/3307 ,
whether FilePath contains decoded unicode varies by OS.
So, add a configure check for it.

Also, renamed showFile to filePathToString
2011-02-11 15:37:37 -04:00
Michael Kenney
285fb2bb08 Fixed missing import of Messages module 2011-02-10 21:06:00 -04:00
Joey Hess
fe55b4644e Fix display of unicode filenames.
Internally, the filenames are stored as un-decoded unicode.
I tried decoding them, but then haskell tries to access the wrong files.
Hmm.

So, I've unhappily chosen option "B", which is to decode filenames before
they are displayed.
2011-02-10 14:21:44 -04:00
Joey Hess
a89a6f2114 refactor in preparation for adding a git-annex-shell command 2010-12-30 15:06:26 -04:00
Joey Hess
6a5be9d53c rename some stuff and prepare to break out more into Command/* 2010-12-30 14:19:16 -04:00
Joey Hess
92e5d28ca8 precommit: Optimise to avoid calling git-check-attr more than once. 2010-11-28 14:21:30 -04:00
Joey Hess
eeae910242 finished hlinting 2010-11-22 17:51:55 -04:00
Joey Hess
da0de293d1 refactor param seeking 2010-11-11 18:54:52 -04:00
Joey Hess
ce62f5abf1 rework command dispatching for add and pre-commit
Both subcommands do two different operations on different sets of files, so
allowing a subcommand to perform a list of operations cleans things up.
2010-11-11 17:58:55 -04:00