Commit graph

135 commits

Author SHA1 Message Date
Joey Hess
67e817c6a1 New annex.largefiles setting, which configures which files git annex add and the assistant add to the annex.
I would have sort of liked to put this in .gitattributes, but it seems
it does not support multi-word attribute values. Also, making this a single
config setting makes it easy to only parse the expression once.

A natural next step would be to make the assistant `git add` files that
are not annex.largefiles. OTOH, I don't think `git annex add` should
`git add` such files, because git-annex command line tools are
not in the business of wrapping git command line tools.
2013-03-29 16:17:13 -04:00
Joey Hess
cfd3b16fe1 add section metadata to all commands
Not yet used .. mindless train work.
2013-03-24 18:28:21 -04:00
Joey Hess
06046a0d2b finish fast direct mode rename handling. wow, it's fast 2013-03-11 14:14:45 -04:00
Joey Hess
40df015d90 remove Eq instance for InodeCache
There are two types of equality here, and which one is right varies,
so this forces me to consider and choose between them.

Based on this, I learned that the commit in git anex sync was
always doing a strong comparison, even when in a repository where
the inodes had changed. Fixed that.
2013-03-11 02:57:48 -04:00
Joey Hess
cbd53b4a8c Makefile now builds using cabal, taking advantage of cabal's automatic detection of appropriate build flags.
The only thing lost is ./ghci

Speed: make fast used to take 20 seconds here, when rebuilding from
touching Command/Unused.hs. With cabal, it's 29 seconds.
2013-02-27 02:39:22 -04:00
Joey Hess
52902c0945 make adding modified files work on crippled filesystems 2013-02-20 14:12:55 -04:00
Joey Hess
af1da07302 Direct mode: Fix support for adding a modified file.
Adding a file that is already annexed, but has been modified, was broken in
direct mode.

This fix makes the new content be added. It does have the problem that
re-running `git annex add` will checksum and re-add the content repeatedly,
until it's committed. This happens because the key associated with the file
does not change until the new one gets committed, so it keeps thinking the
file has changed.
2013-02-20 13:37:46 -04:00
Joey Hess
d7c93b8913 fully support core.symlinks=false in all relevant symlink handling code
Refactored annex link code into nice clean new library.

Audited and dealt with calls to createSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:  ( liftIO $ createSymbolicLink linktarget file
  only when core.symlinks=true
Assistant/WebApp/Configurators/Local.hs:                createSymbolicLink link link
  test if symlinks can be made
Command/Fix.hs: liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/FromKey.hs:     liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/Indirect.hs:                    liftIO $ createSymbolicLink l f
  refuses to run if core.symlinks=false
Init.hs:                createSymbolicLink f f2
  test if symlinks can be made
Remote/Directory.hs:    go [file] = catchBoolIO $ createSymbolicLink file f >> return True
  fast key linking; catches failure to make symlink and falls back to copy
Remote/Git.hs:          liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True
  ditto
Upgrade/V1.hs:                          liftIO $ createSymbolicLink link f
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to readSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:		( liftIO $ catchMaybeIO $ readSymbolicLink file
  only when core.symlinks=true
Assistant/Threads/Watcher.hs:		ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file))
  code that fixes real symlinks when inotify sees them
  It's ok to not fix psdueo-symlinks.
Assistant/Threads/Watcher.hs:		mlink <- liftIO (catchMaybeIO $ readSymbolicLink file)
  ditto
Command/Fix.hs:	stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do
  command only works in indirect mode
Upgrade/V1.hs:	getsymlink = takeFileName <$> readSymbolicLink file
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to isSymbolicLink.
(Typically used with getSymbolicLinkStatus, but that is just used because
getFileStatus is not as robust; it also works on pseudolinks.)
Remaining calls are all safe, because:

Assistant/Threads/SanityChecker.hs:                             | isSymbolicLink s -> addsymlink file ms
  only handles staging of symlinks that were somehow not staged
  (might need to be updated to support pseudolinks, but this is
  only a belt-and-suspenders check anyway, and I've never seen the code run)
Command/Add.hs:         if isSymbolicLink s || not (isRegularFile s)
  avoids adding symlinks to the annex, so not relevant
Command/Indirect.hs:                            | isSymbolicLink s -> void $ flip whenAnnexed f $
  only allowed on systems that support symlinks
Command/Indirect.hs:            whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do
  ditto
Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f
  used to find unlocked files, only relevant in indirect mode
Utility/FSEvents.hs:                    | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s
Utility/FSEvents.hs:                                            | Files.isSymbolicLink s ->
Utility/INotify.hs:                             | Files.isSymbolicLink s ->
Utility/INotify.hs:                     checkfiletype Files.isSymbolicLink addSymlinkHook f
Utility/Kqueue.hs:              | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change
  all above are lower-level, not relevant

Audited and dealt with calls to isSymLink.
Remaining calls are all safe, because:

Annex/Direct.hs:			| isSymLink (getmode item) =
  This is looking at git diff-tree objects, not files on disk
Command/Unused.hs:		| isSymLink (LsTree.mode l) = do
  This is looking at git ls-tree, not file on disk
Utility/FileMode.hs:isSymLink :: FileMode -> Bool
Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode
  low-level

Done!!
2013-02-17 16:43:14 -04:00
Joey Hess
7ce30b534f add: Improved detection of files that are modified while being added.
In indirect mode, now checks the inode cache to detect changes to a file.
Note that a file can still be changed if a process has it open for write,
after landing in the annex.

In direct mode, some checking of the inode cache was done before, but
from a much later point, so fewer modifications could be detected. Now it's
as good as indirect mode.

On crippled filesystems, no lock down is done before starting to add a
file, so checking the inode cache is the only protection we have.
2013-02-14 16:54:36 -04:00
Joey Hess
a52f8f382b split out Utility.InodeCache 2013-02-14 16:17:40 -04:00
Joey Hess
47477b2807 crippled filesystem support, probing and initial support
git annex init probes for crippled filesystems, and sets direct mode, as
well as `annex.crippledfilesystem`.

Avoid manipulating permissions of files on crippled filesystems.
That would likely cause an exception to be thrown.

Very basic support in Command.Add for cripped filesystems; avoids the lock
down entirely since doing it needs both permissions and hard links.
Will make this better soon.
2013-02-14 14:15:26 -04:00
Joey Hess
43b4b7d43a can now build Android targeted binary
Various things that don't work on Android are just ifdefed out.

* the webapp (needs template haskell for arm)
* --include and --exclude globbing (needs libpcre, which is not ported;
  probably I'll make it use the pure haskell glob library instead)
* annex.diskreserve checking (missing sys/statvfs.h)
* timestamp preservation support (yawn)
* S3
* WebDAV
* XMPP

The resulting 17mb binary has been tested on Android, and it is able to,
at least, print its usage message.
2013-02-10 15:48:38 -04:00
Joey Hess
b19c2e6122 assistant: Fix location log when adding new file in direct mode. 2013-02-05 13:41:48 -04:00
Joey Hess
f51ad2a00c assistant: Avoid committer crashing if a file is deleted at the wrong instant. 2013-01-14 15:02:13 -04:00
Joey Hess
248090064d addurl in direct mode 2013-01-06 17:34:44 -04:00
Joey Hess
858ad6783b add works in direct mode
Also, changed sync to no longer automatically add files in direct mode.
That was only necessary before because add didn't work.
2013-01-06 17:24:22 -04:00
Joey Hess
15ecce2bfd squelch warning 2013-01-05 15:09:43 -04:00
Joey Hess
bf1981f60e committer: Fix a file handle leak. 2013-01-05 13:42:31 -04:00
Joey Hess
92287f6905 ensure that direct mode file is not modified while generating its key 2012-12-29 15:32:29 -04:00
Joey Hess
e872c3f648 convert notBareRepo to a CommandCheck
This avoids some small overhead by only running the check once per command;
it also ensures that, even if the command doesn't find anything to run on,
it still fails to run when in a bare repo.
2012-12-29 14:45:19 -04:00
Joey Hess
2ce736ac50 block all commands that don't work in direct mode
I left status working in direct mode, although it doesn't show correct
stats for known annex keys.
2012-12-29 14:28:19 -04:00
Joey Hess
c3a35eb857 add a guard against using git annex add in direct mode repo
Currently, it deletes files when run in one, so until I get a chance to fix
it, block foot shooting.
2012-12-24 14:54:36 -04:00
Joey Hess
c6d2bbe402 assistant adding of files in direct mode 2012-12-24 13:37:29 -04:00
Joey Hess
915cd7f676 comment 2012-12-19 12:50:24 -04:00
Joey Hess
ebd576ebcb where indentation 2012-11-12 01:05:04 -04:00
Joey Hess
e0fdfb2e70 maintain set of files pendingAdd
Kqueue needs to remember which files failed to be added due to being open,
and retry them. This commit gets the data in place for such a retry thread.

Broke KeySource out into its own file, and added Eq and Ord instances
so it can be stored in a Set.
2012-06-20 16:31:46 -04:00
Joey Hess
57cf65eb6d fix kevent symlink creation 2012-06-19 02:40:21 -04:00
Joey Hess
3dac81d345 remove newly created tmp file before linking 2012-06-15 22:19:12 -04:00
Joey Hess
e32dda07ca better temp file handling 2012-06-15 22:16:00 -04:00
Joey Hess
1bae56e4a0 tweak 2012-06-15 22:06:59 -04:00
Joey Hess
aae0ba1995 fixed the double commits problem 2012-06-10 18:41:05 -04:00
Joey Hess
0a11b35d89 extend Git.Queue to be able to queue more than simple git commands
While I was in there, I noticed and fixed a bug in the queue size
calculations. It was never encountered only because Queue.add was
only ever run with 1 file in the list.
2012-06-07 15:19:44 -04:00
Joey Hess
b819f644ad close the git add race
There's a race adding a new file to the annex: The file is moved to the
annex and replaced with a symlink, and then we git add the symlink. If
someone comes along in the meantime and replaces the symlink with
something else, such as a new large file, we add that instead. Which could
be bad..

This race is fixed by avoiding using git add, instead the symlink is
directly staged into the index.

It would be nice to make `git annex add` use this same technique.
I have not done so yet because it currently runs git update-index once per
file, which would slow does `git annex add`. A future enhancement would be
to extend the Git.Queue to include the ability to run update-index with
a list of Streamers.
2012-06-06 14:29:10 -04:00
Joey Hess
993e6459a3 factor out nukeFile 2012-06-06 13:13:13 -04:00
Joey Hess
723eb19bbf split out utility functions 2012-06-06 13:07:30 -04:00
Joey Hess
c981ccc077 add: Prevent (most) modifications from being made to a file while it is being added to the annex.
Anything that tries to open the file for write, or delete the file,
or replace it with something else, will not affect the add.

Only if a process has the file open for write before add starts
can it still change it while (or after) it's added to the annex.
(fsck will catch this later of course)
2012-06-05 20:28:34 -04:00
Joey Hess
d3cee987ca separate source of content from the filename associated with the key when generating a key
This already made migrate's code a lot simpler.
2012-06-05 19:51:03 -04:00
Joey Hess
60ab3d84e1 added ifM and nuked 11 lines of code
no behavior changes
2012-03-14 17:43:34 -04:00
Joey Hess
dc9049373e cleanup 2012-03-06 14:12:15 -04:00
Joey Hess
cbaebf538a rework git check-attr interface
Now gitattributes are looked up, efficiently, in only the places that
really need them, using the same approach used for cat-file.

The old CheckAttr code seemed very fragile, in the way it streamed files
through git check-attr.
I actually found that cad8824852
was still deadlocking with ghc 7.4, at the end of adding a lot of files.
This should fix that problem, and avoid future ones.

The best part is that this removes withAttrFilesInGit and withNumCopies,
which were complicated Seek methods, as well as simplfying the types
for several other Seek methods that had a Backend tupled in.
2012-02-13 23:52:21 -04:00
Joey Hess
8047bba5b9 add: If interrupted, add can leave files converted to symlinks but not yet added to git. Running the add again will now clean up this situtation. 2011-12-07 16:53:53 -04:00
Joey Hess
da9cd315be add support for using hashDirLower in addition to hashDirMixed
Supporting multiple directory hash types will allow converting to a
different one, without a flag day.

gitAnnexLocation now checks which of the possible locations have a file.
This means more statting of files. Several places currently use
gitAnnexLocation and immediately check if the returned file exists;
those need to be optimised.
2011-11-28 22:43:51 -04:00
Joey Hess
6869e6023e support .git/annex on a different disk than the rest of the repo
The only fully supported thing is to have the main repository on one disk,
and .git/annex on another. Only commands that move data in/out of the annex
will need to copy it across devices.

There is only partial support for putting arbitrary subdirectories of
.git/annex on different devices. For one thing, but this can require more
copies to be done. For example, when .git/annex/tmp is on one device, and
.git/annex/journal on another, every journal write involves a call to
mv(1). Also, there are a few places that make hard links between various
subdirectories of .git/annex with createLink, that are not handled.

In the common case without cross-device, the new moveFile is actually
faster than renameFile, avoiding an unncessary stat to check that a file
(not a directory) is being moved. Of course if a cross-device move is
needed, it is as slow as mv(1) of the data.
2011-11-28 16:17:55 -04:00
Joey Hess
bf460a0a98 reorder repo parameters last
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.

In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.

This also provides more opportunities to use monadic and applicative
combinators.
2011-11-08 16:27:20 -04:00
Joey Hess
3d2a9f8405 cleanup 2011-10-31 17:22:55 -04:00
Joey Hess
ef5330120c bare cleanup 2011-10-29 19:30:48 -04:00
Joey Hess
f97c783283 clean up check selection code
This new approach allows filtering out checks from the default set that are
not appropriate for a command, rather than having to list every check
that is appropriate. It also reduces some boilerplate.

Haskell does not define Eq for functions, so I had to go a long way around
with each check having a unique id. Meh.
2011-10-29 15:19:05 -04:00
Joey Hess
b955238ec7 Fail if --from or --to is passed to commands that do not support them. 2011-10-27 18:56:54 -04:00
Joey Hess
5b74b130a3 refactored and generalized pre-command sanity checking 2011-10-27 16:31:35 -04:00
Joey Hess
1a29b5b52e reorganize log modules
no code changes
2011-10-15 16:21:08 -04:00
Joey Hess
6a6ea06cee rename 2011-10-05 16:02:51 -04:00
Joey Hess
cfe21e85e7 rename 2011-10-04 00:59:08 -04:00
Joey Hess
ff21fd4a65 factor out Annex exception handling module 2011-10-04 00:34:04 -04:00
Joey Hess
8ef2095fa0 factor out common imports
no code changes
2011-10-03 23:29:48 -04:00
Joey Hess
5ff04bf2af tweak 2011-09-15 16:59:52 -04:00
Joey Hess
35145202d2 remove command type definitions
These were a mistake, they make the type signatures harder to read and
less flexible. The CommandSeek, CommandStart, CommandPerform, and
CommandCleanup types were a good idea, but composing them with the
parameters expected is going too far.
2011-09-15 16:50:49 -04:00
Joey Hess
9fe3c6d211 clean up params in usage display 2011-09-15 14:33:37 -04:00
Joey Hess
203148363f split groups of related functions out of Utility 2011-08-22 16:14:12 -04:00
Joey Hess
737b5d14c9 moved files around 2011-08-20 16:11:42 -04:00
Joey Hess
dede05171b addurl: --fast can be used to avoid immediately downloading the url.
The tricky part about this is that to generate a key, the file must be
present already. Worked around by adding (back) an URL key type, which
is used for addurl --fast.
2011-08-06 14:57:22 -04:00
Joey Hess
6c396a256c finished hlint pass 2011-07-15 12:47:14 -04:00
Joey Hess
ded2591124 unannex: Clean up use of git commit -a.
This was more complex than would be expected. unannex has to use git commit -a
since it's removing files from git; git commit filelist won't do.

Allow commands to be added to the Git queue that have no associated files,
and run such commands once.
2011-07-14 17:15:37 -04:00
Joey Hess
40c6ba99f5 add: Be even more robust to avoid ever leaving the file seemingly deleted.
A failure at any point after the file is annexed will result in an undo
that puts the original file back into place and wipes the location log.
2011-07-07 21:30:51 -04:00
Joey Hess
67dcc1f171 add: Avoid a failure mode that resulted in the file seemingly being deleted (content put in the annex but no symlink present). 2011-07-07 19:29:36 -04:00
Joey Hess
9f1577f746 remove unused backend machinery
The only remaining vestiage of backends is different types of keys. These
are still called "backends", mostly to avoid needing to change user interface
and configuration. But everything to do with storing keys in different
backends was gone; instead different types of remotes are used.

In the refactoring, lots of code was moved out of odd corners like
Backend.File, to closer to where it's used, like Command.Drop and
Command.Fsck. Quite a lot of dead code was removed. Several data structures
became simpler, which may result in better runtime efficiency. There should
be no user-visible changes.
2011-07-05 19:57:46 -04:00
Joey Hess
cdbcd6f495 add web special remote
Generalized LocationLog to PresenceLog, and use a presence log to record
urls for the web special remote.
2011-07-01 15:30:42 -04:00
Joey Hess
b3aaf980e4 --force will cause add, etc, to operate on ignored files. 2011-06-29 11:42:00 -04:00
Joey Hess
56bc3e95ca refactor some boilerplate 2011-05-15 02:02:46 -04:00
Joey Hess
bc51387e6d Periodically flush git command queue, to avoid boating memory usage too much.
Since the queue is flushed in between subcommand actions being run,
there should be no issues with actions that expect to queue up some stuff
and have it run after they do other stuff. So I didn't have to audit for
such assumptions.
2011-04-07 13:59:31 -04:00
Joey Hess
6634b6a6b8 imcomplete attempt at supporting lutimes(3) for BSD compat 2011-03-20 14:09:24 -04:00
Joey Hess
140a351fc5 avoid version check before running version and upgrade commands
There are two types of commands; those that access the repository and those
that don't. Sorted.
2011-03-19 18:58:49 -04:00
Joey Hess
83a9bb624b fix error throwing 2011-03-15 11:50:40 -04:00
Joey Hess
bc5c54c987 symlink touching fun
When adding files to the annex, the symlinks pointing at the annexed
content are made to have the same mtime as the original file. While git
does not preserve that information, this allows a tool like metastore to be
used with annexed files.
2011-03-14 23:00:23 -04:00
Joey Hess
fcdc4797a9 use ShellParam type
So, I have a type checked safe handling of filenames starting with dashes,
throughout the code.
2011-02-28 16:18:55 -04:00
Joey Hess
e7b557ef5d got rid of Core module
Most of it was to do with managing annexed Content, so put there
2011-01-16 16:05:05 -04:00
Joey Hess
a78b0555e1 New migrate subcommand can be used to switch files to using a different backend, safely and with no duplication of content. 2011-01-08 15:54:14 -04:00
Joey Hess
a89a6f2114 refactor in preparation for adding a git-annex-shell command 2010-12-30 15:06:26 -04:00
Joey Hess
6a5be9d53c rename some stuff and prepare to break out more into Command/* 2010-12-30 14:19:16 -04:00
Joey Hess
92e5d28ca8 precommit: Optimise to avoid calling git-check-attr more than once. 2010-11-28 14:21:30 -04:00
Joey Hess
eeae910242 finished hlinting 2010-11-22 17:51:55 -04:00
Joey Hess
da0de293d1 refactor param seeking 2010-11-11 18:54:52 -04:00
Joey Hess
fb824f7eb0 use -- before filenames when running git add, git rm, etc 2010-11-10 14:15:21 -04:00
Joey Hess
1d32d902c9 Annexed file contents are now made unwritable and put in unwriteable directories, to avoid them accidentially being removed or modified. (Thanks Josh Triplett for the idea.) 2010-11-08 19:26:37 -04:00
Joey Hess
070e8530c1 refactoring, no code changes really 2010-11-08 15:15:21 -04:00
Joey Hess
0eae5b806c broke subcommands out into separate modules 2010-11-02 19:04:24 -04:00