Commit graph

86 commits

Author SHA1 Message Date
Joey Hess
b89efc79f6 add --force overrides annex.largefiles 2013-03-29 16:20:15 -04:00
Joey Hess
67e817c6a1 New annex.largefiles setting, which configures which files git annex add and the assistant add to the annex.
I would have sort of liked to put this in .gitattributes, but it seems
it does not support multi-word attribute values. Also, making this a single
config setting makes it easy to only parse the expression once.

A natural next step would be to make the assistant `git add` files that
are not annex.largefiles. OTOH, I don't think `git annex add` should
`git add` such files, because git-annex command line tools are
not in the business of wrapping git command line tools.
2013-03-29 16:17:13 -04:00
Joey Hess
cfd3b16fe1 add section metadata to all commands
Not yet used .. mindless train work.
2013-03-24 18:28:21 -04:00
Joey Hess
06046a0d2b finish fast direct mode rename handling. wow, it's fast 2013-03-11 14:14:45 -04:00
Joey Hess
40df015d90 remove Eq instance for InodeCache
There are two types of equality here, and which one is right varies,
so this forces me to consider and choose between them.

Based on this, I learned that the commit in git anex sync was
always doing a strong comparison, even when in a repository where
the inodes had changed. Fixed that.
2013-03-11 02:57:48 -04:00
Joey Hess
cbd53b4a8c Makefile now builds using cabal, taking advantage of cabal's automatic detection of appropriate build flags.
The only thing lost is ./ghci

Speed: make fast used to take 20 seconds here, when rebuilding from
touching Command/Unused.hs. With cabal, it's 29 seconds.
2013-02-27 02:39:22 -04:00
Joey Hess
52902c0945 make adding modified files work on crippled filesystems 2013-02-20 14:12:55 -04:00
Joey Hess
af1da07302 Direct mode: Fix support for adding a modified file.
Adding a file that is already annexed, but has been modified, was broken in
direct mode.

This fix makes the new content be added. It does have the problem that
re-running `git annex add` will checksum and re-add the content repeatedly,
until it's committed. This happens because the key associated with the file
does not change until the new one gets committed, so it keeps thinking the
file has changed.
2013-02-20 13:37:46 -04:00
Joey Hess
d7c93b8913 fully support core.symlinks=false in all relevant symlink handling code
Refactored annex link code into nice clean new library.

Audited and dealt with calls to createSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:  ( liftIO $ createSymbolicLink linktarget file
  only when core.symlinks=true
Assistant/WebApp/Configurators/Local.hs:                createSymbolicLink link link
  test if symlinks can be made
Command/Fix.hs: liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/FromKey.hs:     liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/Indirect.hs:                    liftIO $ createSymbolicLink l f
  refuses to run if core.symlinks=false
Init.hs:                createSymbolicLink f f2
  test if symlinks can be made
Remote/Directory.hs:    go [file] = catchBoolIO $ createSymbolicLink file f >> return True
  fast key linking; catches failure to make symlink and falls back to copy
Remote/Git.hs:          liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True
  ditto
Upgrade/V1.hs:                          liftIO $ createSymbolicLink link f
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to readSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:		( liftIO $ catchMaybeIO $ readSymbolicLink file
  only when core.symlinks=true
Assistant/Threads/Watcher.hs:		ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file))
  code that fixes real symlinks when inotify sees them
  It's ok to not fix psdueo-symlinks.
Assistant/Threads/Watcher.hs:		mlink <- liftIO (catchMaybeIO $ readSymbolicLink file)
  ditto
Command/Fix.hs:	stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do
  command only works in indirect mode
Upgrade/V1.hs:	getsymlink = takeFileName <$> readSymbolicLink file
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to isSymbolicLink.
(Typically used with getSymbolicLinkStatus, but that is just used because
getFileStatus is not as robust; it also works on pseudolinks.)
Remaining calls are all safe, because:

Assistant/Threads/SanityChecker.hs:                             | isSymbolicLink s -> addsymlink file ms
  only handles staging of symlinks that were somehow not staged
  (might need to be updated to support pseudolinks, but this is
  only a belt-and-suspenders check anyway, and I've never seen the code run)
Command/Add.hs:         if isSymbolicLink s || not (isRegularFile s)
  avoids adding symlinks to the annex, so not relevant
Command/Indirect.hs:                            | isSymbolicLink s -> void $ flip whenAnnexed f $
  only allowed on systems that support symlinks
Command/Indirect.hs:            whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do
  ditto
Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f
  used to find unlocked files, only relevant in indirect mode
Utility/FSEvents.hs:                    | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s
Utility/FSEvents.hs:                                            | Files.isSymbolicLink s ->
Utility/INotify.hs:                             | Files.isSymbolicLink s ->
Utility/INotify.hs:                     checkfiletype Files.isSymbolicLink addSymlinkHook f
Utility/Kqueue.hs:              | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change
  all above are lower-level, not relevant

Audited and dealt with calls to isSymLink.
Remaining calls are all safe, because:

Annex/Direct.hs:			| isSymLink (getmode item) =
  This is looking at git diff-tree objects, not files on disk
Command/Unused.hs:		| isSymLink (LsTree.mode l) = do
  This is looking at git ls-tree, not file on disk
Utility/FileMode.hs:isSymLink :: FileMode -> Bool
Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode
  low-level

Done!!
2013-02-17 16:43:14 -04:00
Joey Hess
7ce30b534f add: Improved detection of files that are modified while being added.
In indirect mode, now checks the inode cache to detect changes to a file.
Note that a file can still be changed if a process has it open for write,
after landing in the annex.

In direct mode, some checking of the inode cache was done before, but
from a much later point, so fewer modifications could be detected. Now it's
as good as indirect mode.

On crippled filesystems, no lock down is done before starting to add a
file, so checking the inode cache is the only protection we have.
2013-02-14 16:54:36 -04:00
Joey Hess
a52f8f382b split out Utility.InodeCache 2013-02-14 16:17:40 -04:00
Joey Hess
47477b2807 crippled filesystem support, probing and initial support
git annex init probes for crippled filesystems, and sets direct mode, as
well as `annex.crippledfilesystem`.

Avoid manipulating permissions of files on crippled filesystems.
That would likely cause an exception to be thrown.

Very basic support in Command.Add for cripped filesystems; avoids the lock
down entirely since doing it needs both permissions and hard links.
Will make this better soon.
2013-02-14 14:15:26 -04:00
Joey Hess
43b4b7d43a can now build Android targeted binary
Various things that don't work on Android are just ifdefed out.

* the webapp (needs template haskell for arm)
* --include and --exclude globbing (needs libpcre, which is not ported;
  probably I'll make it use the pure haskell glob library instead)
* annex.diskreserve checking (missing sys/statvfs.h)
* timestamp preservation support (yawn)
* S3
* WebDAV
* XMPP

The resulting 17mb binary has been tested on Android, and it is able to,
at least, print its usage message.
2013-02-10 15:48:38 -04:00
Joey Hess
b19c2e6122 assistant: Fix location log when adding new file in direct mode. 2013-02-05 13:41:48 -04:00
Joey Hess
f51ad2a00c assistant: Avoid committer crashing if a file is deleted at the wrong instant. 2013-01-14 15:02:13 -04:00
Joey Hess
248090064d addurl in direct mode 2013-01-06 17:34:44 -04:00
Joey Hess
858ad6783b add works in direct mode
Also, changed sync to no longer automatically add files in direct mode.
That was only necessary before because add didn't work.
2013-01-06 17:24:22 -04:00
Joey Hess
15ecce2bfd squelch warning 2013-01-05 15:09:43 -04:00
Joey Hess
bf1981f60e committer: Fix a file handle leak. 2013-01-05 13:42:31 -04:00
Joey Hess
92287f6905 ensure that direct mode file is not modified while generating its key 2012-12-29 15:32:29 -04:00
Joey Hess
e872c3f648 convert notBareRepo to a CommandCheck
This avoids some small overhead by only running the check once per command;
it also ensures that, even if the command doesn't find anything to run on,
it still fails to run when in a bare repo.
2012-12-29 14:45:19 -04:00
Joey Hess
2ce736ac50 block all commands that don't work in direct mode
I left status working in direct mode, although it doesn't show correct
stats for known annex keys.
2012-12-29 14:28:19 -04:00
Joey Hess
c3a35eb857 add a guard against using git annex add in direct mode repo
Currently, it deletes files when run in one, so until I get a chance to fix
it, block foot shooting.
2012-12-24 14:54:36 -04:00
Joey Hess
c6d2bbe402 assistant adding of files in direct mode 2012-12-24 13:37:29 -04:00
Joey Hess
915cd7f676 comment 2012-12-19 12:50:24 -04:00
Joey Hess
ebd576ebcb where indentation 2012-11-12 01:05:04 -04:00
Joey Hess
e0fdfb2e70 maintain set of files pendingAdd
Kqueue needs to remember which files failed to be added due to being open,
and retry them. This commit gets the data in place for such a retry thread.

Broke KeySource out into its own file, and added Eq and Ord instances
so it can be stored in a Set.
2012-06-20 16:31:46 -04:00
Joey Hess
57cf65eb6d fix kevent symlink creation 2012-06-19 02:40:21 -04:00
Joey Hess
3dac81d345 remove newly created tmp file before linking 2012-06-15 22:19:12 -04:00
Joey Hess
e32dda07ca better temp file handling 2012-06-15 22:16:00 -04:00
Joey Hess
1bae56e4a0 tweak 2012-06-15 22:06:59 -04:00
Joey Hess
aae0ba1995 fixed the double commits problem 2012-06-10 18:41:05 -04:00
Joey Hess
0a11b35d89 extend Git.Queue to be able to queue more than simple git commands
While I was in there, I noticed and fixed a bug in the queue size
calculations. It was never encountered only because Queue.add was
only ever run with 1 file in the list.
2012-06-07 15:19:44 -04:00
Joey Hess
b819f644ad close the git add race
There's a race adding a new file to the annex: The file is moved to the
annex and replaced with a symlink, and then we git add the symlink. If
someone comes along in the meantime and replaces the symlink with
something else, such as a new large file, we add that instead. Which could
be bad..

This race is fixed by avoiding using git add, instead the symlink is
directly staged into the index.

It would be nice to make `git annex add` use this same technique.
I have not done so yet because it currently runs git update-index once per
file, which would slow does `git annex add`. A future enhancement would be
to extend the Git.Queue to include the ability to run update-index with
a list of Streamers.
2012-06-06 14:29:10 -04:00
Joey Hess
993e6459a3 factor out nukeFile 2012-06-06 13:13:13 -04:00
Joey Hess
723eb19bbf split out utility functions 2012-06-06 13:07:30 -04:00
Joey Hess
c981ccc077 add: Prevent (most) modifications from being made to a file while it is being added to the annex.
Anything that tries to open the file for write, or delete the file,
or replace it with something else, will not affect the add.

Only if a process has the file open for write before add starts
can it still change it while (or after) it's added to the annex.
(fsck will catch this later of course)
2012-06-05 20:28:34 -04:00
Joey Hess
d3cee987ca separate source of content from the filename associated with the key when generating a key
This already made migrate's code a lot simpler.
2012-06-05 19:51:03 -04:00
Joey Hess
60ab3d84e1 added ifM and nuked 11 lines of code
no behavior changes
2012-03-14 17:43:34 -04:00
Joey Hess
dc9049373e cleanup 2012-03-06 14:12:15 -04:00
Joey Hess
cbaebf538a rework git check-attr interface
Now gitattributes are looked up, efficiently, in only the places that
really need them, using the same approach used for cat-file.

The old CheckAttr code seemed very fragile, in the way it streamed files
through git check-attr.
I actually found that cad8824852
was still deadlocking with ghc 7.4, at the end of adding a lot of files.
This should fix that problem, and avoid future ones.

The best part is that this removes withAttrFilesInGit and withNumCopies,
which were complicated Seek methods, as well as simplfying the types
for several other Seek methods that had a Backend tupled in.
2012-02-13 23:52:21 -04:00
Joey Hess
8047bba5b9 add: If interrupted, add can leave files converted to symlinks but not yet added to git. Running the add again will now clean up this situtation. 2011-12-07 16:53:53 -04:00
Joey Hess
da9cd315be add support for using hashDirLower in addition to hashDirMixed
Supporting multiple directory hash types will allow converting to a
different one, without a flag day.

gitAnnexLocation now checks which of the possible locations have a file.
This means more statting of files. Several places currently use
gitAnnexLocation and immediately check if the returned file exists;
those need to be optimised.
2011-11-28 22:43:51 -04:00
Joey Hess
6869e6023e support .git/annex on a different disk than the rest of the repo
The only fully supported thing is to have the main repository on one disk,
and .git/annex on another. Only commands that move data in/out of the annex
will need to copy it across devices.

There is only partial support for putting arbitrary subdirectories of
.git/annex on different devices. For one thing, but this can require more
copies to be done. For example, when .git/annex/tmp is on one device, and
.git/annex/journal on another, every journal write involves a call to
mv(1). Also, there are a few places that make hard links between various
subdirectories of .git/annex with createLink, that are not handled.

In the common case without cross-device, the new moveFile is actually
faster than renameFile, avoiding an unncessary stat to check that a file
(not a directory) is being moved. Of course if a cross-device move is
needed, it is as slow as mv(1) of the data.
2011-11-28 16:17:55 -04:00
Joey Hess
bf460a0a98 reorder repo parameters last
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.

In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.

This also provides more opportunities to use monadic and applicative
combinators.
2011-11-08 16:27:20 -04:00
Joey Hess
3d2a9f8405 cleanup 2011-10-31 17:22:55 -04:00
Joey Hess
ef5330120c bare cleanup 2011-10-29 19:30:48 -04:00
Joey Hess
f97c783283 clean up check selection code
This new approach allows filtering out checks from the default set that are
not appropriate for a command, rather than having to list every check
that is appropriate. It also reduces some boilerplate.

Haskell does not define Eq for functions, so I had to go a long way around
with each check having a unique id. Meh.
2011-10-29 15:19:05 -04:00
Joey Hess
b955238ec7 Fail if --from or --to is passed to commands that do not support them. 2011-10-27 18:56:54 -04:00
Joey Hess
5b74b130a3 refactored and generalized pre-command sanity checking 2011-10-27 16:31:35 -04:00