Commit graph

143 commits

Author SHA1 Message Date
Joey Hess
e880d0d22c replace (Key, Backend) with Key
Only fsck and reinject and the test suite used the Backend, and they can
look it up as needed from the Key. This simplifies the code and also speeds
it up.

There is a small behavior change here. Before, all commands would warn when
acting on an annexed file with an unknown backend. Now, only fsck and
reinject show that warning.
2014-04-17 18:03:39 -04:00
Joey Hess
b63276309e clean up cleanup action enumeration 2014-03-13 19:06:26 -04:00
Joey Hess
a1432bce2f Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects.
This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO
of temp object files is too annoying (and if you don't want to keep
partially transferred objects across reboots).

.git/annex/misctmp must be on the same filesystem as the git work tree,
since files are moved to there in a way that will not work cross-device,
as well as symlinked into there.

I first wanted to put the tmp objects in .git/annex/objects/tmp, but
that would pose transition problems on upgrade when partially transferred
objects existed.

git annex info does not currently show the size of .git/annex/misctemp,
since it should stay small. It would also be ok to make something clean it
out, periodically.
2014-02-26 16:52:56 -04:00
Joey Hess
3f6e4b8c7c fix all remaining -Wall warnings on Windows 2014-02-25 14:48:50 -04:00
Joey Hess
1428390300 tweak wording 2014-02-20 16:00:41 -04:00
Joey Hess
9edc3a735d fsck: Refuse to do anything if more than one of --incremental, --more, and --incremental-schedule are given, since it's not clear which option should win. 2014-02-20 15:56:45 -04:00
Joey Hess
134fdefb8c fsck: When run with --all or --unused, while .gitattributes annex.numcopies cannot be honored since it's operating on keys instead of files, make it honor the global numcopies setting, and the annex.numcopies git config setting. 2014-02-20 14:45:17 -04:00
Joey Hess
8952ccec1b windows: fix fsck --incremental to not crash
Although it is still not incremental.
2014-02-13 12:40:10 -04:00
Joey Hess
7b19c7d25b cleanup thanks to Utility.PID 2014-02-11 15:39:51 -04:00
Joey Hess
1572c460e8 avoid using openFile when withFile can be used
Potentially fixes some FD leak if an action on an opened file handle fails
for some reason. There have been some hard to reproduce reports of
git-annex leaking FDs, and this may solve them.
2014-02-03 10:19:06 -04:00
Joey Hess
1669e80e85 Windows: Avoid using unix-compat's rename, which refuses to rename directories.
Opened a bug about this: https://github.com/jystic/unix-compat/issues/10
2014-01-29 15:19:03 -04:00
Joey Hess
86ffeb73d1 reorganize some files and imports 2014-01-26 16:25:55 -04:00
Joey Hess
f7cdc40f7b reorg 2014-01-21 18:08:56 -04:00
Joey Hess
0ef282a116 numcopies cleanup, part 2
This includes several bug fixes.
2014-01-21 17:25:39 -04:00
Joey Hess
b40df4f0d0 reorganize numcopies code (no behavior changes)
Move stuff into Logs.NumCopies. Add a NumCopies newtype.

Better names for various serialization classes that are specific to one
thing or another.
2014-01-21 16:08:59 -04:00
Joey Hess
34c8af74ba fix inversion of control in CommandSeek (no behavior changes)
I've been disliking how the command seek actions were written for some
time, with their inversion of control and ugly workarounds.

The last straw to fix it was sync --content, which didn't fit the
Annex [CommandStart] interface well at all. I have not yet made it take
advantage of the changed interface though.

The crucial change, and probably why I didn't do it this way from the
beginning, is to make each CommandStart action be run with exceptions
caught, and if it fails, increment a failure counter in annex state.
So I finally remove the very first code I wrote for git-annex, which
was before I had exception handling in the Annex monad, and so ran outside
that monad, passing state explicitly as it ran each CommandStart action.

This was a real slog from 1 to 5 am.

Test suite passes.

Memory usage is lower than before, sometimes by a couple of megabytes, and
remains constant, even when running in a large repo, and even when
repeatedly failing and incrementing the error counter. So no accidental
laziness space leaks.

Wall clock speed is identical, even in large repos.

This commit was sponsored by an anonymous bitcoiner.
2014-01-20 04:57:36 -04:00
Joey Hess
011b8bc7ec pull in Win32-extras, to be able to get current process id in Windows
Fixed up a number of things that had worked around there not being a way to
get that.

Most notably, transfer info files on windows now include the process id,
since no locking is currently done. This means the file format varies
between windows and unix.
2013-12-11 00:15:10 -04:00
Joey Hess
12bc989d2d better name for continuation 2013-12-01 15:52:30 -04:00
Joey Hess
d48b00ebed Direct mode .git/annex/objects directories are no longer left writable
Because that allowed writing to symlinks of files that are not present,
which followed the link and put bad content in an object location.

fsck: Fix up .git/annex/object directory permissions.

This commit was sponsored by an anonymous bitcoin donor.
2013-11-15 14:52:03 -04:00
Joey Hess
2b6747b6a2 update for Duration type change 2013-10-08 17:36:55 -04:00
Joey Hess
b405295aee hlint
test suite still passes
2013-09-25 03:09:06 -04:00
Joey Hess
65fe2314be fsck: Fix detection and fixing of present direct mode files that are wrongly represented as standin symlinks on crippled filesystems. 2013-09-13 12:50:29 -04:00
Joey Hess
0f921307e7 mirror: New command, makes two repositories contain the same set of files.
This is a simple approach for setting up a mirroring repository.

It will work with any type of remotes.

Mirror --from is more expensive than mirror --to in general.
OTOH, mirror --from will get the file from any remote that has it, not only
the named mirror remote. And if the named mirror remote is not the fastest
available remote with a file, that can speed things up.

It would be possible to make the assistant or watch command do a more
dynamic mirroring, that didn't need to scan every time.
2013-08-20 15:46:35 -04:00
Joey Hess
93f2371e09 get rid of __WINDOWS__, use mingw32_HOST_OS
The latter is harder for me to remember, but avoids build failures in code
used by the configure program.
2013-08-02 12:27:32 -04:00
Joey Hess
04d07f2c1f --unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy. 2013-07-03 15:26:59 -04:00
Joey Hess
def7cb706f Add --all option, and support it for fsck 2013-07-03 13:12:53 -04:00
Joey Hess
a35bdcb3f2 fsck: Ensures that direct mode is used for files when it's enabled.
A common failure mode for direct mode has been for files to end up still
stored in indirect mode. While I hope that doesn't happen anymore, fsck
should deal with it.
2013-06-24 16:26:00 -04:00
Joey Hess
64f8819ae4 fix build 2013-06-17 21:30:52 -04:00
Joey Hess
9ef09587dc fsck: Avoid getting confused by Windows path separators 2013-06-17 21:18:43 -04:00
Joey Hess
98be446d02 remove workaround for old bug that was only in one release
It's causing some problem on windows, see
http://git-annex.branchable.com/bugs/windows_port_-_repo_can__39__t_pull_newly_added_files_/#comment-45df9748bba687d95e3c96b3877ea925
And only affected WORM backend, and for one release well over a year ago,
so could well be bitrotted.
2013-06-17 20:51:36 -04:00
Joey Hess
21d5489bd3 typo 2013-05-19 14:46:48 -04:00
Joey Hess
abe8d549df fix permission damage (thanks, Windows) 2013-05-11 23:54:25 -04:00
Joey Hess
3c7e30a295 git-annex now builds on Windows (doesn't work) 2013-05-11 15:03:00 -05:00
Joey Hess
763cbda14f fixup #if 0 stubs to use #ifndef mingw32_HOST_OS
That's needed in files used to build the configure program.
For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type.
I may search and replace it to use the mingw32_HOST_OS thing later.
2013-05-10 16:57:21 -05:00
Joey Hess
6c74a42cc6 stub out POSIX stuff 2013-05-10 16:29:59 -05:00
Joey Hess
22afcdf2a5 fsck: Check content of direct mode files (only when the inode cache thinks they are unmodified).
I wrote this earlier, but it never worked because it was looking at the
.git/annex/object content, which is not there..
2013-04-16 16:20:30 -04:00
Joey Hess
9e11699c76 connect existing meters to the transfer log for downloads
Most remotes have meters in their implementations of retrieveKeyFile
already. Simply hooking these up to the transfer log makes that information
available. Easy peasy.

This is particularly valuable information for encrypted remotes, which
otherwise bypass the assistant's polling of temp files, and so don't have
good progress bars yet.

Still some work to do here (see progressbars.mdwn changes), but this
is entirely an improvement from the lack of progress bars for encrypted
downloads.
2013-04-11 17:32:31 -04:00
Joey Hess
f1b0a4b404 Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories.
* since this is a crippled filesystem anyway, git-annex doesn't use
  symlinks on it
* so there's no reason to use the mixed case hash directories that we're
  stuck using to avoid breaking everyone's symlinks to the content
* so we can do what is already done for all bare repos, and make non-bare
  repos on crippled filesystems use the all-lower case hash directories
* which are, happily, all 3 letters long, so they cannot conflict with
  mixed case hash directories
* so I was able to 100% fix this and even resuming `git annex add` in the
  test case will recover and it will all just work.
2013-04-04 15:46:33 -04:00
Joey Hess
cfd3b16fe1 add section metadata to all commands
Not yet used .. mindless train work.
2013-03-24 18:28:21 -04:00
Joey Hess
a3eac50fe9 bugfix: drop --from an unavailable remote no longer updates the location log, incorrectly, to say the remote does not have the key.
The comments correctly noted that the remote could drop the key and
yet False be returned due to some problem that occurred afterwards.
For example, if it's a network remote, it could drop the key just
as the network goes down, and so things timeout and a nonzero exit
from ssh is propigated through and False returned.

However... Most of the time, this scenario will not have happened.
False will mean the remote was not available or could not drop the key
at all.

So, instead of assuming the worst, just trust the status we have.

If we get it wrong, and the scenario above happened, our location
log will think the remote has the key. But the remote's location
log (assuming it has one) will know it dropped it, and the next sync
will regain consistency.

For a special remote, with no location log, our location log will be wrong,
but this is no different than the situation where someone else dropped
the key from the remote and we've not synced with them. The standard
paranoia about not trusting the location log to be the last word about
whether a remote has a key will save us from these situations. Ie,
if we try to drop the file, we'll actively check the remote,
and determine the inconsistency then.
2013-03-10 19:15:53 -04:00
Joey Hess
921f29c004 two types of byName
Clean up from 9769235d6b.
In some cases, looking up a remote by name even though it has no UUID is
desirable. This includes git annex sync, which can operate on remotes
without an annex, and XMPP pairing, which runs addRemote (with calls
byName) before the UUID of the XMPP remote has been configured in git.
2013-03-05 15:43:56 -04:00
Joey Hess
d7c93b8913 fully support core.symlinks=false in all relevant symlink handling code
Refactored annex link code into nice clean new library.

Audited and dealt with calls to createSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:  ( liftIO $ createSymbolicLink linktarget file
  only when core.symlinks=true
Assistant/WebApp/Configurators/Local.hs:                createSymbolicLink link link
  test if symlinks can be made
Command/Fix.hs: liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/FromKey.hs:     liftIO $ createSymbolicLink link file
  command only works in indirect mode
Command/Indirect.hs:                    liftIO $ createSymbolicLink l f
  refuses to run if core.symlinks=false
Init.hs:                createSymbolicLink f f2
  test if symlinks can be made
Remote/Directory.hs:    go [file] = catchBoolIO $ createSymbolicLink file f >> return True
  fast key linking; catches failure to make symlink and falls back to copy
Remote/Git.hs:          liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True
  ditto
Upgrade/V1.hs:                          liftIO $ createSymbolicLink link f
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to readSymbolicLink.
Remaining calls are all safe, because:

Annex/Link.hs:		( liftIO $ catchMaybeIO $ readSymbolicLink file
  only when core.symlinks=true
Assistant/Threads/Watcher.hs:		ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file))
  code that fixes real symlinks when inotify sees them
  It's ok to not fix psdueo-symlinks.
Assistant/Threads/Watcher.hs:		mlink <- liftIO (catchMaybeIO $ readSymbolicLink file)
  ditto
Command/Fix.hs:	stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do
  command only works in indirect mode
Upgrade/V1.hs:	getsymlink = takeFileName <$> readSymbolicLink file
  v1 repos could not be on a filesystem w/o symlinks

Audited and dealt with calls to isSymbolicLink.
(Typically used with getSymbolicLinkStatus, but that is just used because
getFileStatus is not as robust; it also works on pseudolinks.)
Remaining calls are all safe, because:

Assistant/Threads/SanityChecker.hs:                             | isSymbolicLink s -> addsymlink file ms
  only handles staging of symlinks that were somehow not staged
  (might need to be updated to support pseudolinks, but this is
  only a belt-and-suspenders check anyway, and I've never seen the code run)
Command/Add.hs:         if isSymbolicLink s || not (isRegularFile s)
  avoids adding symlinks to the annex, so not relevant
Command/Indirect.hs:                            | isSymbolicLink s -> void $ flip whenAnnexed f $
  only allowed on systems that support symlinks
Command/Indirect.hs:            whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do
  ditto
Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f
  used to find unlocked files, only relevant in indirect mode
Utility/FSEvents.hs:                    | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s
Utility/FSEvents.hs:                                            | Files.isSymbolicLink s ->
Utility/INotify.hs:                             | Files.isSymbolicLink s ->
Utility/INotify.hs:                     checkfiletype Files.isSymbolicLink addSymlinkHook f
Utility/Kqueue.hs:              | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change
  all above are lower-level, not relevant

Audited and dealt with calls to isSymLink.
Remaining calls are all safe, because:

Annex/Direct.hs:			| isSymLink (getmode item) =
  This is looking at git diff-tree objects, not files on disk
Command/Unused.hs:		| isSymLink (LsTree.mode l) = do
  This is looking at git ls-tree, not file on disk
Utility/FileMode.hs:isSymLink :: FileMode -> Bool
Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode
  low-level

Done!!
2013-02-17 16:43:14 -04:00
Joey Hess
47477b2807 crippled filesystem support, probing and initial support
git annex init probes for crippled filesystems, and sets direct mode, as
well as `annex.crippledfilesystem`.

Avoid manipulating permissions of files on crippled filesystems.
That would likely cause an exception to be thrown.

Very basic support in Command.Add for cripped filesystems; avoids the lock
down entirely since doing it needs both permissions and hard links.
Will make this better soon.
2013-02-14 14:15:26 -04:00
Joey Hess
672f8b5b83 fsck: Detect and fix consistency errors in direct mode mapping files. 2013-01-19 14:11:23 -04:00
Joey Hess
0da2507fd6 improve direct mode fsck
An earlier commit (mislabeled) made direct mode fsck check file checksums.
While it's expected for files to change at any time in direct mode, and so
fsck cannot complain every time there's a checksum mismatch, it is possible
for it to detect when a file does not *seem* to have changed, then check
its checksum, and so detect disk corruption or other problems.

This commit improves that, by checking a second time, if the checksum
fails, that the file is still not modified, before taking action. This way,
a direct mode file can be modified while being fscked.
2013-01-08 15:07:00 -04:00
Joey Hess
174867b846 blog for yesterday 2013-01-08 12:41:09 -04:00
Joey Hess
fc63e3b660 fix a stupid typo that made fsck loop when it found bad content
Thank goodness for test suites!
2013-01-07 13:01:53 -04:00
Joey Hess
9d3e571f77 support fsck in direct mode 2013-01-06 15:42:49 -04:00
Joey Hess
2ce736ac50 block all commands that don't work in direct mode
I left status working in direct mode, although it doesn't show correct
stats for known annex keys.
2012-12-29 14:28:19 -04:00
Joey Hess
0d50a6105b whitespace fixes 2012-12-13 00:45:27 -04:00