Commit graph

1027 commits

Author SHA1 Message Date
Joey Hess
b6b34f4916
automatic conflict resolution for v6 unlocked files
Several tricky parts:

* When the conflict is just between the same key being locked and unlocked,
  the unlocked version wins, and the file is not renamed in this case.

* Need to update associated file map when conflict resolution renames
  an unlocked file.

* git merge runs the smudge filter on the conflicting file, and actually
  overwrites the file with the same content it had before, and so
  invalidates its inode cache. This makes it difficult to know when it's
  safe to remove such files as conflict cruft, without going so far as to
  compare their entire contents.

  Dealt with this by preventing the smudge filter from populating the file
  when a merge is run. However, that also prevents the smudge filter being
  run for non-conflicting files, so eg moving a file won't put its new
  content into place.

* Ideally, if a merge or a merge conflict resolution renames an unlocked
  file, the file in the work tree can just be moved, rather than copying
  the content to a new worktree file.

  Merge conflict resolution attempts this, but due to git merge's behavior
  of running smudge filters, what actually seems to happen is that the old
  worktree file with the content is deleted and rewritten as a pointer
  file, so it doesn't get reused.

So, this is probably not as efficient as it optimally could be.
If that becomes a problem, could look into running the merge in a separate
worktree and updating the real worktree more efficiently, similarly to the
direct mode merge. However, the direct mode merge had a lot of bugs, and
I'd rather not use that more error-prone method unless really needed.
2015-12-29 15:41:09 -04:00
Joey Hess
da5d25a844
clean build warning on windows 2015-12-28 13:06:36 -04:00
Joey Hess
4224fae71f
optimise read and write for Keys database (untested)
Writes are optimised by queueing up multiple writes when possible.
The queue is flushed after the Annex monad action finishes. That makes it
happen on program termination, and also whenever a nested Annex monad action
finishes.

Reads are optimised by checking once (per AnnexState) if the database
exists. If the database doesn't exist yet, all reads return mempty.

Reads also cause queued writes to be flushed, so reads will always be
consistent with writes (as long as they're made inside the same Annex monad).
A future optimisation path would be to determine when that's not necessary,
which is probably most of the time, and avoid flushing unnecessarily.

Design notes for this commit:

- separate reads from writes
- reuse a handle which is left open until program
  exit or until the MVar goes out of scope (and autoclosed then)
- writes are queued
  - queue is flushed periodically
  - immediate queue flush before any read
  - auto-flush queue when database handle is garbage collected
  - flush queue on exit from Annex monad
    (Note that this may happen repeatedly for a single database connection;
    or a connection may be reused for multiple Annex monad actions,
    possibly even concurrent ones.)
- if database does not exist (or is empty) the handle
  is not opened by reads; reads instead return empty results
- writes open the handle if it was not open previously
2015-12-23 19:18:52 -04:00
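The queue-and-flush design described in 4224fae71f can be sketched roughly as follows. This is a minimal illustration, not the git-annex code: `Db`, `queueWrite`, `flushQueue`, and `readKey` are hypothetical names, and an in-memory association list stands in for the real database.

```haskell
import Data.IORef

-- Hypothetical stand-in for a database handle: a committed store plus a
-- queue of pending writes. Both lists keep the newest entry first.
data Db = Db
  { committed :: IORef [(String, String)]
  , pending   :: IORef [(String, String)]
  }

newDb :: IO Db
newDb = Db <$> newIORef [] <*> newIORef []

-- Writes are only queued; the store is not touched yet.
queueWrite :: Db -> String -> String -> IO ()
queueWrite db k v = modifyIORef' (pending db) ((k, v) :)

-- Move all queued writes into the store and empty the queue.
flushQueue :: Db -> IO ()
flushQueue db = do
  ws <- atomicModifyIORef' (pending db) (\q -> ([], q))
  modifyIORef' (committed db) (ws ++)

-- Reads flush first, so they are consistent with earlier writes.
readKey :: Db -> String -> IO (Maybe String)
readKey db k = do
  flushQueue db
  lookup k <$> readIORef (committed db)
```

Flushing on every read is the simple, always-correct choice; skipping the flush when the queue cannot affect the read is the future optimisation the message mentions.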
Joey Hess
d82b110da8
Merge branch 'master' into smudge 2015-12-21 17:12:46 -04:00
Joey Hess
b6ac443b60
fix build warnings under ghc 7.10
Caused by AMP.. Since I've finally upgraded my dev laptop to 7.10,
I may start missing imports that are not needed with it but are with older
versions..
2015-12-19 17:42:45 -04:00
Joey Hess
029111b89a
Merge branch 'master' into smudge 2015-12-16 13:07:46 -04:00
Joey Hess
25bc6ea6d8
bring back some deleted functions that git-repair uses 2015-12-15 20:42:35 -04:00
Joey Hess
96dd0f4ebe
improve temp dir security
http://bugs.debian.org/807341

* Fix insecure temporary permissions when git-annex repair is used in
  a corrupted git repository.

  Other calls to withTmpDir didn't leak any potentially private data,
  but repair clones the git repository to a temp directory which is made
  using the user's umask. Thus, it might expose a git repo that is
  otherwise locked down.

* Fix potential denial of service attack when creating temp dirs.

  Since withTmpDir used easily predictable temporary directory names,
  an attacker could create foo.0, foo.1, etc and, as long as the attacker
  managed to keep ahead of withTmpDir, could prevent it from ever returning.

  I'd rate this as a low utility DOS attack. Most attackers in a position
  to do this could just fill up the disk /tmp is on to prevent anything
  from writing temp files. And few parts of git-annex use withTmpDir
  anyway, so DOS potential is quite low.

Examined all callers of withTmpDir and satisfied myself that
switching to mkdtmp and so getting a mode 700 temp dir wouldn't break any
of them.

Note that withTmpDirIn continues to not force temp dir to 700.
But it's only used for temp directories inside .git/annex/wherever/
so that is not a problem.

Also re-audited all other uses of temp files and dirs in git-annex.
2015-12-15 20:21:48 -04:00
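A rough sketch of the mkdtemp approach from 96dd0f4ebe, assuming a POSIX system and the `unix`, `directory`, and `filepath` packages. `withPrivateTmpDir` and `isPrivateDir` are hypothetical names for illustration, not git-annex's withTmpDir.

```haskell
import Control.Exception (bracket)
import System.Directory (getTemporaryDirectory, removeDirectoryRecursive)
import System.FilePath ((</>))
import System.Posix.Files
  (accessModes, fileMode, getFileStatus, intersectFileModes, ownerModes)
import System.Posix.Temp (mkdtemp)

-- Run an action in a fresh temp directory, removing it afterwards.
-- mkdtemp appends random characters to the template, so the name is not
-- predictable, and it creates the directory with mode 700.
withPrivateTmpDir :: String -> (FilePath -> IO a) -> IO a
withPrivateTmpDir template action = do
  base <- getTemporaryDirectory
  bracket (mkdtemp (base </> template)) removeDirectoryRecursive action

-- True when only the owner has any permission bits on the directory.
isPrivateDir :: FilePath -> IO Bool
isPrivateDir dir = do
  st <- getFileStatus dir
  pure (fileMode st `intersectFileModes` accessModes == ownerModes)
```

For example, `withPrivateTmpDir "repair" isPrivateDir` would be expected to yield True on a typical Linux system.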
Joey Hess
ce73a96e4e
use InodeCache when dropping a key to see if a pointer file can be safely reset
The Keys database can hold multiple inode caches for a given key. One for
the annex object, and one for each pointer file, which may not be hard
linked to it.

Inode caches for a key are recorded when its content is added to the annex,
but only if it has known pointer files. This is to avoid the overhead of
maintaining the database when not needed.

When the smudge filter outputs a file's content, the inode cache is not
updated, because git's smudge interface doesn't let us write the file. So,
dropping will fall back to doing an expensive verification then. Ideally,
git's interface would be improved, and then the inode cache could be
updated then too.
2015-12-09 17:54:54 -04:00
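The check described in ce73a96e4e might look something like this in outline. It is a sketch under assumptions: the `InodeCache` fields, `Verdict`, and `checkPointerFile` are hypothetical simplifications, not the types in the Keys database.

```haskell
-- Hypothetical simplified inode cache: just the fields compared here.
data InodeCache = InodeCache
  { icInode :: Integer
  , icSize  :: Integer
  , icMtime :: Integer
  } deriving (Eq, Show)

data Verdict = SafeToReset | NeedsFullVerification
  deriving (Eq, Show)

-- A pointer file whose current inode cache matches one recorded for the
-- key is taken to be unmodified. Anything else falls back to the
-- expensive verification of the file's content.
checkPointerFile :: [InodeCache] -> InodeCache -> Verdict
checkPointerFile recorded current
  | current `elem` recorded = SafeToReset
  | otherwise               = NeedsFullVerification
```

A file written by the smudge filter has no recorded cache, so in this sketch it always takes the `NeedsFullVerification` path.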
Joey Hess
969d54f914
cleanup 2015-12-06 16:36:35 -04:00
Joey Hess
4591569607
avoid looping trying to make temp dir when the name is too long
Only loop when directory creation fails due to the directory existing
already.
2015-12-06 16:29:36 -04:00
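The retry rule from 4591569607 can be illustrated like this. `makeNumberedDir` is a hypothetical helper, assuming the `directory` package; it is not the withTmpDir implementation.

```haskell
import Control.Exception (try)
import System.Directory (createDirectory)
import System.IO.Error (isAlreadyExistsError)

-- Try base.0, base.1, ... until a directory can be created.
-- Only an "already exists" failure moves on to the next name; any other
-- error (such as a name that is too long) is rethrown instead of looping.
makeNumberedDir :: FilePath -> IO FilePath
makeNumberedDir base = go (0 :: Int)
  where
    go n = do
      let dir = base ++ "." ++ show n
      r <- try (createDirectory dir)
      case r of
        Right () -> pure dir
        Left e
          | isAlreadyExistsError e -> go (n + 1)
          | otherwise -> ioError e
```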
Joey Hess
a0fcb8ec93
generalize catchHardwareFault to catchIOErrorType 2015-12-06 16:26:38 -04:00
Joey Hess
394b66be13
import Data.Time.Format to ensure its Read instance for LocalTime is available
Seems that Utility.SafeCommand loaded something that indirectly got that
instance loaded on unix, but not on Windows recently.
2015-11-21 13:36:30 -04:00
Joey Hess
04e150abb3
use intercalate instead of MissingH's join
The two functions are identical.
2015-11-17 17:27:24 -04:00
Joey Hess
689bdae03a
reorg quickcheck to a separate module 2015-11-17 15:49:22 -04:00
Joey Hess
1244eb3770
refactor 2015-11-16 20:27:01 -04:00
Joey Hess
7943442dff
Display progress meter in -J mode when copying from a local git repo, to a local git repo, and from a remote git repo.
Had everything available, just didn't combine the progress meter with the
other places progress is sent to update it. (And to a remote repo already
did show progress.)

Most special remotes should already display progress meters with -J,
same as without it. One exception to this is the web, since it relies on
wget/curl progress display without -J. Still todo..
2015-11-16 19:32:30 -04:00
Joey Hess
c670a0642c
fix warning 2015-11-16 15:37:27 -04:00
Joey Hess
e2b4861bff
store abspath to the lock file
Avoids problems if the program chdirs
2015-11-16 15:25:04 -04:00
Joey Hess
b0626230b7
fix use of hifalutin terminology 2015-11-16 14:37:31 -04:00
Joey Hess
be86081ff4
avoid crashing in checkDaemon when fcntl locking is not supported
Instead, just assume the daemon isn't running. Since the pid file locking
fails on such a filesystem, we know it's not running.
2015-11-16 14:34:30 -04:00
Joey Hess
2e44da5c46
clean up side lock files when we're done with them
There's a potential race, but it's detected and just results in the other
process failing to take the side lock, so possibly retrying one second
later on. The race window is quite narrow so the extra delay is minor.

Left the side lock files mode 666 because an interruption can leave a side
lock file created by another user for a shared repository. When this
happens, the non-owning user can't delete it (+t) but can still lock it,
and so the code falls back to acting as it did before this commit.
2015-11-16 11:36:11 -04:00
Joey Hess
8efd3d71c8
starting to get a handle on how to detect that mad gleam in lustre's eye 2015-11-13 16:18:44 -04:00
Joey Hess
70bfe218f5
one more try to get sane behavior out of lustre 2015-11-13 15:51:45 -04:00
Joey Hess
389c6c7d37
fixed a fd double-close 2015-11-13 15:43:09 -04:00
Joey Hess
b0155d9093
also compare lock file contents to double-check link worked
And it closes the tmp file before this. I don't know if this will help
avoid lustre's craziness, but it can't hurt..
2015-11-13 15:20:52 -04:00
Joey Hess
1aba23ab4e
use /tmp for sidelock file when no /dev/shm 2015-11-13 14:49:30 -04:00
Joey Hess
60a9c7f5c6
require the side lock be held to take pidlock
This is less portable, since currently sidelocks rely on /dev/shm.
But, I've seen crazy lustre inconsistencies that make me not trust the
link() method at all, so what can you do.
2015-11-13 14:44:53 -04:00
Joey Hess
85345abe8b
avoid over-long filenames for side lock files 2015-11-13 14:04:29 -04:00
Joey Hess
c2cbe5619b
add stat check
I have a strace taken on a lustre filesystem on which link() returned 0,
but didn't actually succeed, since the file already existed.

One of the linux man pages recommended using link followed by checking like
this. I was reading it yesterday, but cannot find it now.
2015-11-13 13:22:45 -04:00
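The link-then-stat check from c2cbe5619b, sketched with the `unix` package. `linkAndVerify` is a hypothetical name, and this is only the general technique (link, then check the link count), not the git-annex pid lock code.

```haskell
import Control.Exception (IOException, try)
import System.Posix.Files (createLink, getFileStatus, linkCount, removeLink)

-- Try to take lockFile by hard-linking tmpFile to it, then verify with
-- stat rather than trusting the result of link(). tmpFile must already
-- exist and have no other hard links. It is removed afterwards either way.
linkAndVerify :: FilePath -> FilePath -> IO Bool
linkAndVerify tmpFile lockFile = do
  _ <- try (createLink tmpFile lockFile) :: IO (Either IOException ())
  st <- getFileStatus tmpFile
  removeLink tmpFile
  -- A link count of 2 means lockFile really is a second name for tmpFile.
  pure (linkCount st == 2)
```

The result of link() is deliberately ignored here, since the point of the stat check is that it cannot be trusted on such filesystems.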
Joey Hess
88d94e674c
clean up temp file 2015-11-13 12:52:24 -04:00
Joey Hess
e31a51c5bb
better lock dropping order 2015-11-13 12:36:37 -04:00
Joey Hess
cd22340c99
generalize to MonadIO 2015-11-12 18:03:49 -04:00
Joey Hess
aa4192aea6
pid locking configuration and abstraction layer for git-annex
(not actually used anywhere yet)
2015-11-12 17:50:34 -04:00
Joey Hess
77b490bfba
add timeout for pid lock waiting 2015-11-12 17:12:54 -04:00
Joey Hess
7bd9e33b84
refactor 2015-11-12 16:35:15 -04:00
Joey Hess
0f25a7365a
module for PidLocks in LockPool 2015-11-12 16:31:34 -04:00
Joey Hess
e7552e4cee
make LockPool's LockHandle be able to support multiple different types of file locks 2015-11-12 16:28:11 -04:00
Joey Hess
710d1eeeac
module for pid lock files with atomic stale lock file takeover when possible 2015-11-12 15:39:49 -04:00
Joey Hess
08bb3b1b1d
quvi may output utf-8 encoded data when the configured locale doesn't support that; avoid crashing on such invalid encoding. 2015-11-09 12:19:23 -04:00
Joey Hess
8b09e9306a
merge from propellor 2015-10-28 00:18:01 -04:00
Joey Hess
268800d590
Symlink timestamp preservation code uses functions from unix-2.7.0 when available, which should be more portable. 2015-10-21 02:22:18 -04:00
Joey Hess
b9c6a56b0e
Use statvfs on OSX.
Fixes a recent-ish build warning about 64 bit vs non.

This is the method used by the disk-free-space library, and I tested it to
yield the same results on even 10 tb drives on OSX -- so it's getting 64
bit values.
2015-10-19 17:09:06 -04:00
Joey Hess
45c9440cf9
refactor 2015-10-15 10:34:19 -04:00
Joey Hess
18c7b993bd
comment typo 2015-10-12 16:32:52 -04:00
Joey Hess
fb4a745c9b
fix export list to work on windows 2015-10-12 15:08:17 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
f52d4b684d
export FileMode type 2015-10-08 14:26:21 -04:00
Joey Hess
c8fad345f2
add tryLockShared 2015-10-08 13:40:23 -04:00
Joey Hess
9461019e9a
open lock file ReadOnly when taking shared lock
It's only necessary to open a file for write when taking an exclusive lock.
2015-10-08 13:34:49 -04:00
Joey Hess
933fef6ae0 Merge branch 'winprocfix' 2015-10-04 15:46:25 -04:00
Joey Hess
06f1f03e7a Ported disk free space checking code to work on Solaris.
On Solaris, using f_bsize provided a value that is apparently much larger
than the real block size. The solaris docs for statvfs say
f_bsize is the "preferred" file system block size, and I guess the
filesystem prefers larger blocks, but uses smaller ones or something.
The docs also say that f_frsize is the "fundamental" block size.

Switched to using f_frsize on Linux and kFreeBSD too, since I guess
f_bsize could in theory vary the same way there too. Assuming that Solaris
is not violating the posix spec, I guess the linux man page for statvfs
is not as well written and I misunderstood it.
2015-10-02 16:31:15 -04:00
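The arithmetic behind 06f1f03e7a, as a small sketch. POSIX specifies the statvfs block counts in units of f_frsize. The `StatVfs` record and `freeBytes` are hypothetical names that only mirror the C struct fields discussed above.

```haskell
-- Hypothetical record mirroring the statvfs fields discussed above.
data StatVfs = StatVfs
  { fBsize  :: Integer -- "preferred" block size
  , fFrsize :: Integer -- "fundamental" block size
  , fBavail :: Integer -- free blocks available to unprivileged users
  }

-- Block counts are in units of f_frsize, so that is the multiplier.
freeBytes :: StatVfs -> Integer
freeBytes s = fBavail s * fFrsize s
```

With f_frsize = 4096 and f_bavail = 1000 this gives 4096000 bytes. Multiplying by a larger f_bsize (say 1048576) would overstate the free space by the ratio of the two block sizes.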
Joey Hess
cdbce512bd deal with more backward-compatibility-breaking renamings in conduit
This is the kind of annoying thing that makes me not want to use a library.
conduitManagerSettings was a perfectly fine name and could have been kept
forever.
2015-10-02 15:18:54 -04:00
Joey Hess
9e3ac97608 avoid deprecation warnings when built with http-client >= 0.4.18
Since I want git-annex to keep building on debian stable, I need to still
support the old http-client, which required explicit calls to
closeManager, or use of withManager to get Managers to close at appropriate
times. This is not needed in the new version, and so they added a
deprecation warning. IMHO much too early, because look at the mess I had to
go through to avoid that deprecation warning while supporting both
versions..
2015-10-01 13:48:56 -04:00
Joey Hess
69d37bd894 fix bug in back-compat ifdef 2015-09-23 13:09:08 -04:00
Joey Hess
017c00c581 redundant import 2015-09-22 12:31:54 -04:00
Joey Hess
1dcb86498e Improve ~/.ssh/config modification code to not add trailing spaces to lines it cannot parse.
"Host\n" is a valid line, and actually gets parsed ok, but this also holds
for other lines that it fails to parse for some reason.
2015-09-22 12:06:10 -04:00
Joey Hess
0ebde659bf assistant: When updating ~/.ssh/config, preserve any symlinks. 2015-09-21 12:39:13 -04:00
Joey Hess
f77a873a15 improve comment 2015-09-15 13:12:21 -04:00
Joey Hess
16947ef654 Fix bug in combination of preferred and required content settings. When one was set to the empty string and the other set to some expression, this bug caused all files to be wanted, instead of only files matching the expression.
Avoid: MAny `MOr` otherexpression
Which matches anything.
2015-09-15 12:50:14 -04:00
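Why MAny `MOr` otherexpression matches anything (16947ef654) can be seen in a toy matcher. The `Matcher` type below is a hypothetical cut-down version, and `combineWanted` is only one way the combination could avoid the problem; it is an assumption, not necessarily the fix that was committed.

```haskell
-- Hypothetical cut-down matcher type.
data Matcher a
  = MAny                        -- matches everything (an empty setting)
  | MOr (Matcher a) (Matcher a)
  | MOp (a -> Bool)             -- a single test

matches :: Matcher a -> a -> Bool
matches MAny      _ = True
matches (MOr l r) x = matches l x || matches r x
matches (MOp f)   x = f x

-- Combine two settings, skipping a side that is empty instead of
-- OR-ing it in, so that MAny does not swallow the other expression.
combineWanted :: Matcher a -> Matcher a -> Matcher a
combineWanted MAny other = other
combineWanted other MAny = other
combineWanted l r = l `MOr` r
```

Here `matches (MAny `MOr` MOp even) 3` is True even though 3 is odd, while `matches (combineWanted MAny (MOp even)) 3` is False.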
Joey Hess
ca33921bf2 I've been not documenting these import Preludes used to deal with the AMP transition 2015-09-15 11:32:47 -04:00
Simon Jakobi
b468890f3e Silence redundant import warning with base-4.8.* 2015-09-15 11:32:23 -04:00
Joey Hess
2d2e94798f merge hlint changes from propellor 2015-09-13 13:39:48 -04:00
Joey Hess
0390efae8c support gpg.program
When gpg.program is configured, it's used to get the command to run for
gpg. Useful on systems that have only a gpg2 command or want to use it
instead of the gpg command.
2015-09-09 18:06:49 -04:00
Joey Hess
19dbe2a611 webapp: Fix support for entering password when setting up a ssh remote. 2015-09-03 11:03:08 -07:00
Joey Hess
86e638567a Fix Windows build to work with ghc 7.10
It was failing at link time, some problem with terminatePID.
Re-implemented that to not use a C wrapper function, which cleared up the
problem. Removed old EvilLinker hack which must have been related to the
same problem.

Note that I have not tested this with older ghc's. In
f11f7520b5 I mention having tried this
approach before, and getting segfaults.. So, who knows. It seems to work
fine with ghc 7.10 at least.
2015-09-01 14:51:14 -07:00
Joey Hess
7be58b5e11 make sync --no-content be accepted
It's the default, but this is a step toward changing that default later..
2015-08-20 17:21:14 -04:00
Joey Hess
0f5d6c09ac importfeed --relaxed: Avoid hitting the urls of items in the feed. 2015-08-19 12:24:55 -04:00
Joey Hess
edd1ea54e4 package qualify imports
needed for "make fast" to work
2015-08-14 17:23:25 -04:00
Joey Hess
4665fc9e84 add debug logging of process exits
This is mostly to be able to see how long a command took to run. Also exit
code may be useful.

Unfortunately, I can't put the command name in there, because it's not
available at this point, and it would be a much larger change to wrap the
ProcessHandle data type to add that. However, it's generally pretty obvious
which process exited from context.
2015-08-13 13:12:44 -04:00
Joey Hess
e953be11af avoid throwing exception when String is not encoded using the filesystem encoding
Since _encodeFilePath generates a String that doesn't use the filesystem
encoding, when this exception is caught, we know we already have such a
String, and can just return it as-is.
2015-08-12 10:57:48 -04:00
Joey Hess
4e4e11849a fix test suite fail in LANG=C
This was caused by 23e9d3bb77

an Arbitrary String is not necessarily encoded using the filesystem
encoding, and in a non-utf8 locale, encodeBS throws an exception on such a
string. All I could think to do is limit test data to ascii.

This shouldn't be a problem in practice, because all the Strings in
git-annex that are not generated by Arbitrary should be loaded in a way
that does apply the filesystem encoding.
2015-08-12 10:36:51 -04:00
Joey Hess
23e9d3bb77 Fix setting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale.
Oh boy, not again. So, another place that the filesystem encoding needs to
be applied. Yay.

In passing, I changed decodeBS so if a NUL is embedded in the input, the
resulting FilePath doesn't get truncated at that NUL. This was needed to
make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I
didn't see any where this wouldn't make sense. When a FilePath is used to
operate on the filesystem, it'll get truncated at a NUL anyway, whereas if
a String is being used for something else, it might conceivably have a NUL
in it, and we wouldn't want it to get truncated when going through
decodeBS.
(NB: There may be a speed impact from this change.)
2015-08-11 18:40:59 -04:00
Joey Hess
0ec9bc2200 Added support for SHA3 hashed keys (in 8 varieties), when git-annex is built using the cryptonite library.
While cryptohash has SHA3 support, it has not been updated for the final
version of the spec. Note that cryptonite has not been ported to all arches
that cryptohash builds on yet.
2015-08-06 15:02:25 -04:00
Joey Hess
3cff287b26 proxy: Fix behavior when run in subdirectory of git repo.
This fixes a reversion introduced by relative path changes back last winter.

The root cause is simplifyPath "../foo" was incorrectly yielding "foo".

absPathFrom seems quite horrible. Probably most things that use it should
use </> instead.
2015-08-04 14:58:21 -04:00
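The root cause named in 3cff287b26 is easy to reproduce in a toy path simplifier. The sketch below handles relative, '/'-separated paths only, and `simplifyRelative` and `splitDirs` are hypothetical names, not git-annex's simplifyPath. The point is that a leading ".." has nothing to cancel against and so must be kept.

```haskell
import Data.List (intercalate)

-- Split a path on '/' without any special cases.
splitDirs :: String -> [String]
splitDirs s = case break (== '/') s of
  (seg, [])       -> [seg]
  (seg, _ : rest) -> seg : splitDirs rest

-- Simplify a relative path: drop "." and empty segments, and let ".."
-- cancel the previous segment only when that segment is a real name.
simplifyRelative :: String -> String
simplifyRelative = intercalate "/" . reverse . foldl step [] . splitDirs
  where
    step acc "." = acc
    step acc ""  = acc
    step (p : ps) ".." | p /= ".." = ps
    step acc seg = seg : acc
```

With this, "a/../b" becomes "b", while "../foo" stays "../foo" rather than collapsing to "foo".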
Joey Hess
88e4fe6093 remove unused imports 2015-08-03 15:58:12 -04:00
Joey Hess
ea765ec022 windows build warning fixes 2015-08-03 15:54:29 -04:00
Joey Hess
6ca08f02a4 remove unused imports 2015-08-03 15:49:35 -04:00
Joey Hess
bf3e9945fc fix build warning when building with yesod 1.2 and newer yesod-core 2015-08-03 15:42:44 -04:00
Joey Hess
631557aa60 Revert "fix build warning when building with yesod 1.2"
This reverts commit 160b0ac824.
2015-08-03 15:40:04 -04:00
Joey Hess
160b0ac824 fix build warning when building with yesod 1.2 2015-08-03 13:13:57 -04:00
Joey Hess
8a547e82b1 addidential debugging 2015-08-03 11:27:53 -04:00
Joey Hess
d986d24494 analysis; forwarded 2015-08-03 11:27:27 -04:00
Joey Hess
730cc3feb5 wire tasty's option parser into the main program option parser
This makes bash completion work for git-annex test, and is
generally cleaner.
2015-07-13 13:20:10 -04:00
Joey Hess
b59b8be737 generalize parseDuration so it can be used in the ReadM monad 2015-07-08 16:08:26 -04:00
Joey Hess
4018e5f6f1 better method for running tasty's optparse as a subcommand 2015-07-08 00:39:19 -04:00
Joey Hess
625303226d import: Fix failure of cross-device import on Windows.
As well as import, 2 other places ran "mv" manually, so changed them to use
moveFile as well.
2015-07-07 14:48:23 -04:00
Joey Hess
78ef8912f8 avoid "Defined but not used" warning on android 2015-07-02 15:24:33 -04:00
Joey Hess
adba0595bd use bloom filter in second pass of sync --all --content
This is needed because when preferred content matches on files,
the second pass would otherwise want to drop all keys. Using a bloom filter
avoids this, and in the case of a false positive, a key will be left
undropped that preferred content would allow dropping. Chances of that
happening are a mere 1 in 1 million.
2015-06-16 18:50:13 -04:00
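A toy version of the two-pass idea in adba0595bd. This is a sketch under assumptions: the hash functions, the 1024-bit size, and the names `Bloom`, `insertB`, `elemB`, and `secondPass` are all made up for illustration, so the false positive rate here is nothing like the 1 in 1 million quoted above.

```haskell
import Data.Bits (setBit, testBit)
import Data.Char (ord)

-- A bloom filter as a bitset packed into an Integer.
newtype Bloom = Bloom Integer

bloomBits :: Int
bloomBits = 1024

-- Two simple illustrative hashes (a real library would choose better).
hashes :: String -> [Int]
hashes s = [h 31, h 131]
  where
    h m = foldl (\acc c -> (acc * m + ord c) `mod` bloomBits) 7 s

emptyBloom :: Bloom
emptyBloom = Bloom 0

insertB :: String -> Bloom -> Bloom
insertB s (Bloom b) = Bloom (foldl setBit b (hashes s))

-- May return True for a key never inserted (a false positive), but
-- never returns False for one that was inserted.
elemB :: String -> Bloom -> Bool
elemB s (Bloom b) = all (testBit b) (hashes s)

-- Second pass: only consider keys the first pass did not see.
secondPass :: Bloom -> [String] -> [String]
secondPass seen = filter (\k -> not (elemB k seen))
```

A false positive only means a key is skipped in the second pass, which matches the failure mode the message describes: a key left undropped.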
Joey Hess
a0a8127956 instance Hashable Key for bloomfilter 2015-06-16 18:37:41 -04:00
Joey Hess
a6c56fb459 improve url parsing more
Now can handle eg, "http://[::1]/download/cdrom-fontzip[foo]", where
the first [] needs to stay unescaped, but the rest have to be escaped.
2015-06-14 13:54:24 -04:00
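The rule from a6c56fb459 (leave [] alone in the host part, escape it in the path) can be sketched as below. `escapePathBrackets` and `splitScheme` are hypothetical helpers that handle only the brackets; other illegal characters, such as the spaces in the archive.org url, are left alone.

```haskell
import Data.List (isPrefixOf)

-- Split "scheme://rest" into ("scheme://", "rest"). When there is no
-- "://", the scheme part is empty and the whole input is returned as rest.
splitScheme :: String -> (String, String)
splitScheme s = go "" s
  where
    go acc r
      | "://" `isPrefixOf` r = (reverse acc ++ "://", drop 3 r)
      | otherwise = case r of
          []     -> ("", s)
          c : cs -> go (c : acc) cs

-- Percent-escape '[' and ']' everywhere after the authority part, so an
-- IPv6 literal such as [::1] in the host stays intact.
escapePathBrackets :: String -> String
escapePathBrackets url = scheme ++ authority ++ concatMap esc path
  where
    (scheme, rest)    = splitScheme url
    (authority, path) = break (== '/') rest
    esc '[' = "%5B"
    esc ']' = "%5D"
    esc c   = [c]
```

Applied to the url in the message, this yields "http://[::1]/download/cdrom-fontzip%5Bfoo%5D".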
Joey Hess
829007d629 Improve url parsing to handle some urls containing illegal [] characters in their paths.
Ie, "https://archive.org/download/zoom-2/Zoom - Release 2 (1996)(Active Software)[!].iso"
2015-06-14 13:39:44 -04:00
Joey Hess
256b86b948 oh foo, I didn't mean to include this in the prev commit 2015-06-11 16:43:59 -04:00
Joey Hess
5c960601aa 4 ns optimisation of repeated calls to hasDifference on the same Differences
I want this as fast as possible, so it can be added to code paths without
slowing them down.

Avoid the set lookup, and rely on laziness,
drops runtime from 14.37 ns to 11.03 ns according to this criterion benchmark:

import Criterion.Main
import qualified Types.Difference as New
import qualified Types.DifferenceOld as Old

main :: IO ()
main = defaultMain
	[ bgroup "hasDifference"
		[ bench "new" $ whnf (New.hasDifference New.OneLevelObjectHash) new
		, bench "old" $ whnf (Old.hasDifference Old.OneLevelObjectHash) old
		]
	]
  where
	s = "fromList [ObjectHashLower, OneLevelObjectHash, OneLevelBranchHash]"
	new = New.readDifferences s
	old = Old.readDifferences s

A little bit of added boilerplate, but I suppose it's worth it to not
need to worry about set lookup overhead. Note that adding more differences
would slow down the old implementation; the new implementation will run
the same speed.
2015-06-11 16:34:35 -04:00
Joey Hess
b8f044660f two more breakages introduced when removing the Params constructor 2015-06-03 13:02:33 -04:00
Joey Hess
6688f52577 fix bug introduced in recent Params removal 2015-06-02 16:28:05 -04:00
Joey Hess
8be78783b0 Revert "When listing DBus services, also list activatable services."
This reverts commit ef0e3ac22e.

Sebastian thinks best to revert this:

It seems to me the reason I needed to look at activatable sockets
might actually be a networkd bug, and I was in error about patch 0001.
On my machines (without DHCP), networkd quits after configuring the
links. I thought this had to do with network activation, but that was
probably mistaken. This was obscured by my testing the change by doing
systemctl stop/start on networkd; now that I actually unplugged the
network cable, I noticed no DBus messages are triggered by this on
this machine. Your test case might have had a similar problem
(networkd quitting on idle). Might be related to [1].

On another machine (with DHCP) networkd remains active all the time,
and patch 0002 works there. You might want to revert 0001, though:
Suppose someone’s running no manager at all, so that polling would be
required. Because networkd is still listed as activable, we would
refrain from polling – by mistake, because networkd doesn’t seem to
actually go active if we listen on its bus, and it’s listed as
activable even when it’s not configured. Connectivity-related messages
will come in when stopping/starting the service, but not when
unplugging the cable.
2015-06-02 14:38:24 -04:00
Sebastian Reuße
ef0e3ac22e When listing DBus services, also list activatable services. 2015-06-02 12:51:13 -04:00
Joey Hess
eb33569f9d remove Params constructor from Utility.SafeCommand
This removes a bit of complexity, and should make things faster
(avoids tokenizing Params string), and probably involve less garbage
collection.

In a few places, it was useful to use Params to avoid needing a list,
but that is easily avoided.

Problems noticed while doing this conversion:

	* Some uses of Params "oneword" which was entirely unnecessary
	  overhead.
	* A few places that built up a list of parameters with ++
	  and then used Params to split it!

Test suite passes.
2015-06-01 13:52:23 -04:00
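The overhead removed in eb33569f9d can be shown with a cut-down parameter type. This is a hypothetical simplification: the real type in Utility.SafeCommand has more to it, and splitting with `words` is an assumption about how the tokenizing worked.

```haskell
-- Hypothetical, simplified version of the parameter type before the change.
data CommandParam
  = Param String   -- exactly one argument, passed through untouched
  | Params String  -- several arguments packed into one string

-- Expanding Params means tokenizing its string on every call.
toCommand :: [CommandParam] -> [String]
toCommand = concatMap expand
  where
    expand (Param s)  = [s]
    expand (Params s) = words s
```

So `[Params "-a -b", Param "x y"]` and `[Param "-a", Param "-b", Param "x y"]` expand to the same argument list, and `Params "oneword"` does a split that can only ever return its input.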
Joey Hess
ade6ed2d71 AMP hack 2015-05-31 16:54:07 -04:00