Commit graph

781 commits

Author SHA1 Message Date
Joey Hess
27eaa6f410
avoid making post-merge-conflict-resolution commit when no conflicts were resolved
sync, merge, assistant: When git merge failed for a reason other than a
conflicted merge, such as a crippled filesystem not allowing particular
characters in filenames, git-annex would make a merge commit that could
omit such files or otherwise be bad. Fixed by aborting the whole merge
process when git merge fails for any reason other than a merge conflict.
2015-10-15 14:22:46 -04:00
Joey Hess
9e90c033d3
Changed drop ordering when using git annex sync --content or the assistant, to drop from remotes first and from the local repo last. This works better with the behavior changes to drop in many cases. 2015-10-14 12:33:02 -04:00
Joey Hess
1ff7610118
fix windows build 2015-10-12 15:48:59 -04:00
Joey Hess
f9adb905fc
Avoid unncessary write to the location log when a file is unlocked and then added back with unchanged content.
Implemented with no additional overhead of compares etc.

This is safe to do for presence logs because of their locality of change;
a given repo's presence logs are only ever changed in that repo, or in a
repo that has just been actively changing the content of that repo.

So, we don't need to worry about a split-brain situation where there'd
be disagreement about the location of a key in a repo. And so, it's ok to
not update the timestamp when that's the only change that would be made
due to logging presence info.
2015-10-12 14:46:47 -04:00
Joey Hess
fa9333e99f
use action, not sideAction
sideAction is for things not generally related to the current action being
performed. And, it adds a newline after the side action. This was not the
right thing to use for stuff like "checksum", where doing a checksum is
part of the git annex get process, and indeed we want it to display
"(checksum...) ok"
2015-10-11 13:29:44 -04:00
Joey Hess
3b89d5a20c
implement lockContent for ssh remotes 2015-10-09 16:55:41 -04:00
Joey Hess
e392ec112f
also generate a drop safety proof for move --from remote 2015-10-09 16:16:03 -04:00
Joey Hess
6a72045707
fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal 2015-10-09 15:48:02 -04:00
Joey Hess
1043880432
improve message when drop failed due to no locked copy 2015-10-09 15:14:25 -04:00
Joey Hess
b021321aae
rename constructor 2015-10-09 15:01:33 -04:00
Joey Hess
45e1a7c361
verify local copy of content with locking 2015-10-09 14:57:32 -04:00
Joey Hess
4c6095b6f5
content locking during drop working for local git remotes
Only ssh remotes lack locking now
2015-10-09 13:12:58 -04:00
Joey Hess
ceb5819538
finish and use lockContent interface 2015-10-09 12:36:04 -04:00
Joey Hess
cf79dffa4c
improve drop proof code 2015-10-09 11:09:46 -04:00
Joey Hess
f57ac29be1
refactor 2015-10-09 10:30:22 -04:00
Joey Hess
7f5958eec2
TrustedCopy is good enough to allow dropping
By definition, a trusted repository is trusted to always have its location
tracking log accurate. Thus, it should never be in a position where content
is being dropped from it concurrently, as that would result in the location
tracking log not being accurate.
2015-10-08 18:34:48 -04:00
Joey Hess
e4a33967a1
try harder to verify until at least one VerifiedCopyLock is obtained
This avoids a failure where eg, we start with RecentlyVerifiedCopies
for all remotes, and so didn't do any active verification, which is
required.

Also, dedup the list of VerifiedCopies when checking if we have enough,
in case 2 copies of a UUID slip in.
2015-10-08 18:20:36 -04:00
Joey Hess
b17f5da6c9
require 1 locked copy while dropping from local or a remote
See doc/bugs/concurrent_drop--from_presence_checking_failures.mdwn for
discussion about why 1 locked copy is all we can require, and how this
fixes concurrent dropping bugs.

Note that, since nothing yet generates a VerifiedCopyLock yet, this commit
breaks dropping temporarily.
2015-10-08 18:11:39 -04:00
Joey Hess
c75c79864d
support invalidating existing VerifiedCopys 2015-10-08 17:58:32 -04:00
Joey Hess
90f7c4b6a2
add VerifiedCopy data type
There should be no behavior changes in this commit, it just adds a more
expressive data type and adjusts code that had been passing around a [UUID]
or sometimes a Maybe Remote to instead use [VerifiedCopy].

Although, since some functions were taking two different [UUID] lists,
there's some potential for me to have gotten it horribly wrong.
2015-10-08 16:55:11 -04:00
Joey Hess
beedf1da25
unused import 2015-10-08 14:59:34 -04:00
Joey Hess
9cb9dab69b
I think this comment is stale/confusing; remove 2015-10-08 14:51:44 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
2def1d0a23 other 80% of avoding verification when hard linking to objects in shared repo
In c6632ee5c8, it actually only handled
uploading objects to a shared repository. To avoid verification when
downloading objects from a shared repository, was a lot harder.

On the plus side, if the process of downloading a file from a remote
is able to verify its content on the side, the remote can indicate this
now, and avoid the extra post-download verification.

As of yet, I don't have any remotes (except Git) using this ability.
Some more work would be needed to support it in special remotes.

It would make sense for tahoe to implicitly verify things downloaded from it;
as long as you trust your tahoe server (which typically runs locally),
there's cryptographic integrity. OTOH, despite bup being based on shas,
a bup repo under an attacker's control could have the git ref used for an
object changed, and so a bup repo shouldn't implicitly verify. Indeed,
tahoe seems unique in being trustworthy enough to implicitly verify.
2015-10-02 14:35:12 -04:00
Joey Hess
7c7fe895f9 disabling verification also disables size verification
It's not expensive to do size verification, but let's be consistent and
turn it off too.
2015-10-02 12:38:02 -04:00
Joey Hess
c6632ee5c8 avoid verification when hard linking to objects in shared repository
Such a repository is implicitly trusted, so there's no point.
2015-10-02 12:36:03 -04:00
Joey Hess
2fb3722ce9 Do verification of checksums of annex objects downloaded from remotes.
* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
2015-10-01 15:56:39 -04:00
Joey Hess
b72d3fbeba rename function 2015-10-01 14:18:57 -04:00
Joey Hess
807ba6a903 refactor 2015-10-01 14:07:06 -04:00
Joey Hess
dc2f1f09b7 Improve robustness of direct mode merge, avoiding a crash if the index file is missing.
I couldn't find a good way to make an *empty* index file (zero byte file
won't do), so I punted and just don't make index.lock when there's no index
yet. This means some other git process could race and write an index file
at the same time as the merge is ongoing, in theory. Only happens in new
repos though.
2015-09-22 13:00:18 -04:00
Joey Hess
b88739f0d0 avoid auto-enabling a remote that's already enabled 2015-09-14 15:34:15 -04:00
Joey Hess
c919489c3e avoid autoenable of dead special remotes 2015-09-14 15:28:14 -04:00
Joey Hess
9cfb96c53d Special remotes configured with autoenable=true will be automatically enabled when git-annex init is run. 2015-09-14 14:49:48 -04:00
Joey Hess
97962591d6 init: Fix reversion in detection of repo made with git clone --shared 2015-09-09 13:56:37 -04:00
Joey Hess
c242e248e8 Fix reversion in init when ran as root, introduced in version 5.20150731. 2015-08-19 12:36:17 -04:00
Joey Hess
0f5d6c09ac importfeed --relaxed: Avoid hitting the urls of items in the feed. 2015-08-19 12:24:55 -04:00
Joey Hess
23e9d3bb77 Fix setting/setting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale.
Oh boy, not again. So, another place that the filesystem encoding needs to
be applied. Yay.

In passing, I changed decodeBS so if a NUL is embedded in the input, the
resulting FilePath doesn't get truncated at that NUL. This was needed to
make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I
didn't see any where this wouldn't make sense. When a FilePath is used to
operate on the filesystem, it'll get truncated at a NUL anyway, whereas if
a String is being used for something else, it might conceivably have a NUL
in it, and we wouldn't want it to get truncated when going through
decodeBS.
(NB: There may be a speed impact from this change.)
2015-08-11 18:40:59 -04:00
Joey Hess
f7d7995172 clean 2015-08-04 17:07:45 -04:00
Joey Hess
3c971c414e sshopts is never going to be null; the concat of it may be 2015-08-04 16:53:38 -04:00
Joey Hess
a6374b7a3d typo 2015-08-04 15:44:46 -04:00
Joey Hess
f041a65c33 Windows: Fix bug that caused git-annex sync to fail due to missing environment variable.
I think that the problem was caused by windows not having a concept of an
env var that is set, but to the empty string. So, GIT_ANNEX_SSHOPTION
got set to "" and was not seen as set at all.

Easy fix, which also makes git-annex sync a little faster is to not set
GIT_SSH, when GIT_ANNEX_SSHOPTION has no options. Might as well let git use
ssh per usual in this case, no need to run git-annex as the proxy ssh
command..
2015-08-04 15:27:48 -04:00
Joey Hess
6c15cdfcb8 proxy: Fix proxy git commit of non-annexed files in direct mode.
* proxy: Fix proxy git commit of non-annexed files in direct mode.
* proxy: If a non-proxied git command, such as git revert
  would normally fail because of unstaged files in the work tree,
  make the proxied command fail the same way.
2015-08-04 14:01:59 -04:00
Joey Hess
ea765ec022 windows build warning fixes 2015-08-03 15:54:29 -04:00
Joey Hess
9dfe03dbcd Improve shutdown due to --time-limit, especially for fsck
* Perform a clean shutdown when --time-limit is reached.
  This includes running queued git commands, and cleanup actions normally
  run when a command is finished.
* fsck: Commit incremental fsck database when --time-limit is reached.
  Previously, some of the last files fscked did not make it into the
  database when using --time-limit.

Note that this changes Annex.addCleanup hooks, to run after --time-limit
expires. Fsck was using such a hook to clean up after a
--incremental-schedule, and that shouldn't run when --time-limit exipires
it. So, instead, moved that cleanup code to be run by cleanupIncremental.
Resulted in some data type juggling.
2015-07-31 16:01:54 -04:00
Joey Hess
b30324fec7 init: Detect when the filesystem is crippled such that it ignores attempts to remove the write bit from a file, and enable direct mode. Seen with eg, NTFS fuse on linux. 2015-07-30 14:06:17 -04:00
Joey Hess
267f397d82 avoid calling copy when file DNE
This avoids an ugly warning when running git annex fsck --from a rsync
remote in a repo in direct mode.
2015-07-30 13:40:17 -04:00
Joey Hess
24800b1bf1 Only look at reflogs for relevant branches, not for git-annex branches
This speeds it up quite a bit.. May still be too slow in large repos.
2015-07-07 17:36:30 -04:00
Joey Hess
b11d2f5a8a unused: --used-refspec can now be configured to look at refs in the reflog. This provides a way to not consider old versions of files to be unused after they have reached a specified age, when the old refs in the reflog expire.
May be slow.
2015-07-07 17:13:50 -04:00
Joey Hess
f7dc20595e refactor ls-tree params
All in one place to avoid bugs like 174da80ddc
2015-07-06 14:21:43 -04:00
Joey Hess
174da80ddc bugfix: Pass --full-tree when using git ls-files to get a list of files on the git-annex branch, so it works when run in a subdirectory. This bug affected git-annex unused, and potentially also transitions running code and other things. 2015-07-06 14:09:54 -04:00
Joey Hess
adba0595bd use bloom filter in second pass of sync --all --content
This is needed because when preferred content matches on files,
the second pass would otherwise want to drop all keys. Using a bloom filter
avoids this, and in the case of a false positive, a key will be left
undropped that preferred content would allow dropping. Chances of that
happening are a mere 1 in 1 million.
2015-06-16 18:50:13 -04:00
Joey Hess
a0a8127956 instance Hashable Key for bloomfilter 2015-06-16 18:37:41 -04:00
Joey Hess
8b74aec3ea Increased the default annex.bloomaccuracy from 1000 to 10000000
This makes git annex unused use around 48 mb more memory than it did before,
but the massive increase in accuracy makes this worthwhile for all but the
smallest systems.

Also, I want to use the bloom filter for sync --all --content, to avoid
dropping files that the preferred content doesn't want, and 1/1000
false positives would be far too many in that use case, even if it were
acceptable for unused.

Actual memory use numbers:

1000: 21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k
1000000: 21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k
10000000: 21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k

Based on these numbers, 10 million seemed a better pick than 1 million.
2015-06-16 18:12:00 -04:00
Joey Hess
8c46ea22c2 Added new "anything" preferred content expression, which matches all versions of all files. 2015-06-16 17:03:34 -04:00
Joey Hess
0a998032ed Fix bug that prevented enumerating locally present objects in repos tuned with annex.tune.objecthash1=true
Need to walk 1 level of subdirs less in this case.

The git-annex branch traversal code didn't have a similar bug.
2015-06-11 15:15:05 -04:00
Joey Hess
de3bd11a2c import --clean-duplicates: Fix bug that didn't count local or trusted repo's copy of a file as one of the necessary copies to allow removing it from the import location. 2015-06-03 13:15:38 -04:00
Joey Hess
d28e8fbfd5 get --incomplete: New option to resume any interrupted downloads. 2015-06-02 14:20:38 -04:00
Joey Hess
eb33569f9d remove Params constructor from Utility.SafeCommand
This removes a bit of complexity, and should make things faster
(avoids tokenizing Params string), and probably involve less garbage
collection.

In a few places, it was useful to use Params to avoid needing a list,
but that is easily avoided.

Problems noticed while doing this conversion:

	* Some uses of Params "oneword" which was entirely unnecessary
	  overhead.
	* A few places that built up a list of parameters with ++
	  and then used Params to split it!

Test suite passes.
2015-06-01 13:52:23 -04:00
Joey Hess
a6d54e49a0 sync, remotedaemon: Pass configured ssh-options even when annex.sshcaching is disabled. 2015-05-30 22:01:52 -04:00
Joey Hess
83b262f1b6 fix windows build 2015-05-22 13:54:54 -04:00
Joey Hess
167539a354 better memoize core.sharedrepository handling
It was memoized, but that was not used consistently. Move it to
Types.GitConfig so it will auto-memoize.
2015-05-19 15:04:24 -04:00
Joey Hess
b47c9fd587 honor core.sharedRepository settings in lockContent
The content file may not be owned by the user running git-annex, in which
case, setting the owner write bit was not enough to let lockContent
act on the file. However, with some core.sharedRepository configs, the file
should be writable by the user's group. So, the thing to do is to call
thawContent on it.
2015-05-19 14:53:19 -04:00
Joey Hess
f4e2093760 fix inAnnexSafe result for direct file that is being dropped
It was returning Just False in this situation, which differed from indirect
mode behavior. I don't think this led to any actual problems; things that
checked if the file being dropped was present just failed to fail, and
instead reported it wasn't present, possibly incorrectly.

Hmm, it's possible that this could have made git annex fsck --from remote
update the location log wrongly, if a remote was in direct mode, and was in
the middle of trying to drop a key, and the drop later failed.
2015-05-19 14:26:07 -04:00
Joey Hess
1312e721ed convert lockContent to use new LockPools
Also cleaned up the code, avoiding creating a lock file if we're going to
open it for create later anyway.

And, if there's an exception while preparing to lock the file, but not at
the point of actually taking the lock, throw an exception, instead of
silently not locking and pretending to succeed.

And, on Windows, always use lock file, even if the repo somehow got into
indirect mode (maybe with cygwin git..)
2015-05-19 14:12:23 -04:00
Joey Hess
ecb0d5c087 use lock pools throughout git-annex
The one exception is in Utility.Daemon. As long as a process only
daemonizes once, which seems reasonable, and as long as it avoids calling
checkDaemon once it's already running as a daemon, the fcntl locking
gotchas won't be a problem there.

Annex.LockFile has it's own separate lock pool layer, which has been
renamed to LockCache. This is a persistent cache of locks that persist
until closed.

This is not quite done; lockContent stil needs to be converted.
2015-05-19 14:09:52 -04:00
Joey Hess
7ebf234616 Stale transfer lock and info files will be cleaned up automatically when get/unused/info commands are run.
Deleting lock files is tricky, tricky stuff. I think I got it right!
2015-05-12 20:11:23 -04:00
Joey Hess
7299bbb639 don't clean up transfer lock file when retrying transfer
This affected callers that used forwardRetry; if the 1st attempt failed it
would clean up the transfer lock before retrying.
2015-05-12 19:43:24 -04:00
Joey Hess
8c2dd7d8ee Fix an unlikely race that could result in two transfers of the same key running at once.
As discussed in bug report.
2015-05-12 19:39:28 -04:00
Joey Hess
e25ecab7dd convert to using Utility.Lockfile for transfer lock files
Should be no behavior changes, just simplified code.

The only actual difference is it doesn't truncate the lock file.
I think that was a holdover from when transfer info was written to the lock
file.
2015-05-12 19:36:16 -04:00
Joey Hess
61ccf95004 Avoid accumulating transfer failure log files unless the assistant is being used.
Only the assistant uses these, and only the assistant cleans them up, so
make only git annex transferkeys write them,

There is one behavior change from this. If glacier is being used, and a
manual git annex get --from glacier fails because the file isn't available
yet, the assistant will no longer later see that failed transfer file and
retry the get. Hope no-one depended on that old behavior.
2015-05-12 15:53:38 -04:00
Joey Hess
a812d598ef Take space that will be used by running downloads into account when checking annex.diskreserve. 2015-05-12 15:20:22 -04:00
Joey Hess
e27b97d364 Merge branch 'master' into concurrentprogress
Conflicts:
	Command/Fsck.hs
	Messages.hs
	Remote/Directory.hs
	Remote/Git.hs
	Remote/Helper/Special.hs
	Types/Remote.hs
	debian/changelog
	git-annex.cabal
2015-05-12 13:23:22 -04:00
Joey Hess
64a4553e0b rename traverse to walk since Data.Traversable is imported by default in ghc 7.10 2015-05-10 16:43:09 -04:00
Joey Hess
08308dc9b3 fix build warning with ghc 7.10 2015-05-10 15:28:13 -04:00
Joey Hess
9f3e51dd51 move nubbing into function whose algo needs a nubbed list 2015-04-30 14:11:59 -04:00
Joey Hess
38c458b407 refactor 2015-04-30 14:02:56 -04:00
Joey Hess
5948c148fb Make repo init more robust.
The setDifferences that got added to initialize turns out to make a git
commit, and before ensureCommit has been used. Thus, repo init can fail
when the system has a broken hostname etc.

Move the ensureCommit to the very first thing to avoid this kind of breakage.
2015-04-20 14:01:41 -04:00
Joey Hess
3a078ab357 When a key's size is unknown, still check the annex.diskreserve, and avoid getting content if the disk is too full.
We can't check if there's enough disk space to download the content,
but we *can* check if there's certainly not enough!
2015-04-17 21:29:15 -04:00
Joey Hess
86a2f9dc4d Merge branch 'master' into concurrentprogress
Conflicts:
	debian/changelog
2015-04-14 15:35:15 -04:00
Joey Hess
2b79e6fe08 a few hlints 2015-04-11 00:10:34 -04:00
Joey Hess
9971c82ead refactor 2015-04-10 17:53:58 -04:00
Joey Hess
8077ccbd54 get, move, copy, mirror: Concurrent downloads and uploads are now supported!
This works, and seems fairly robust. Clean get of 20 files at -J3. At -J10,
there are some messages about ssh multiplexing, probably due to a race
spinning up the ssh connection cacher. But, it manages to get all the files
ok regardless.

The progress bars are a scrambled mess though, due to bugs in
ascii-progress, which I've already filed. Particularly this one:
https://github.com/yamadapc/haskell-ascii-progress/issues/8
2015-04-10 17:08:07 -04:00
Joey Hess
0880c8319e simplify and make more atomic 2015-04-10 15:16:17 -04:00
Joey Hess
ce0a82f493 contentlocationn: New plumbing command. 2015-04-09 15:34:47 -04:00
Joey Hess
b99b8d5d4c followup to bug I cannot reproduce, and analysis based presumptive fix 2015-04-09 14:03:44 -04:00
Joey Hess
42e46a8701 avoid using --literal-pathspecs with git older than 1.8.1 which added it
Windows is still building with an older git.
2015-04-06 13:46:11 -04:00
Joey Hess
1d57f142f1 Merge branch 'concurrentprogress' 2015-04-04 15:01:00 -04:00
Joey Hess
2343f99c85 well along the way to fully quiet --quiet
Came up with a generic way to filter out progress messages while keeping
errors, for commands that use stderr for both.

--json mode will disable command outputs too.
2015-04-04 14:34:03 -04:00
Joey Hess
ff2eeaf054 avoid progress bar for url download with --quiet 2015-04-03 20:38:56 -04:00
Joey Hess
bd110516c0 init: Improve fifo test to detect NFS systems that support fifos but not well enough for sshcaching.
ssh tries to hard link a fifo, and if not, complains:

muxserver_listen: link mux listener .git/annex/ssh/SHARD1@iabak.archiveteam.org.QK8zOCbtNebI7q54 => .git/annex/ssh/SHARD1@iabak.archiveteam.org: Operation not permitted
2015-04-03 14:57:10 -04:00
Joey Hess
0a6933771d cleanup 2015-03-30 19:55:35 -04:00
Joey Hess
15d45186cc use --literal-pathspecs globally, as a better way to avoid globbing
This might be overkill; I only know I need it in ls-files, but other git
commands can also do their own globbing, it turns out, and I am pretty sure
I never want them too when git-annex is using them as plumbing.

Test suite still passes and it looks ok.
2015-03-30 19:44:13 -04:00
Joey Hess
5be536e523 Fix bug introduced in the last release that broke git-annex sync when git-annex was installed from the standalone tarball.
This was introduced by commit 450ee53ab6

However, the same problem could affect other calls to programPath,
specifically some on the assistant. So, I fixed it at a deeper level.
2015-03-27 12:55:18 -04:00
Joey Hess
3af4691978 Improve error message when --in @date is used and there is no reflog for the git-annex branch. 2015-03-26 11:15:15 -04:00
Joey Hess
798da6cf2e Added a post-update-annex hook, which is run after the git-annex branch is updated. Needed for git update-server-info.
See https://github.com/datalad/datalad/issues/1#issuecomment-84094406
2015-03-20 14:52:58 -04:00
Joey Hess
cf903d5a3c fixup annex link target calculation when submodules are used in filesystems not supporting symlinks 2015-03-04 16:08:41 -04:00
Joey Hess
e322826e33 Submodules are now supported by git-annex!
Seems to work, but still experimental until it's been tested more.

When repositories are on filesystems not supporting symlinks, the .git dir
symlink trick cannot be used. Since we're going to be in direct mode
anyway, the .git dir symlink is not strictly needed.

However, I have not fixed the code that creates new annex symlinks to
handle this case -- the committed symlinks will be wrong.

git annex sync happens to currently fail in a submodule using direct mode,
because there's no HEAD ref. That also needs to be dealt with to get
this fully working in crippled filesystems.

Leaving http://github.com/datalad/datalad/issues/44 open until these issues
are dealt with.
2015-03-02 16:43:44 -04:00
Joey Hess
450ee53ab6 When re-execing git-annex, use current program location, rather than ~/.config/git-annex/program, when possible.
Most of the time, there will be no discreprancy between programPath and
readProgramFile.

But, the programFile might have been written by an old version of git-annex
that is still installed, while a newer one is currently running. In this
case, we want to run the same one that's currently running.

This is especially important for things like the GIT_SSH=git-annex used for
ssh connection caching.

The only code that still uses readProgramFile directly is the upgrade code,
which needs to know where the standalone git-annex was installed, in order to
upgrade it.
2015-02-28 17:23:13 -04:00
Joey Hess
b9275b65f9 make programPath return FilePath not Maybe FilePath
Looking at the few current callers, it's ok to have programPath throw an
exception, in the unusual case where it cannot find git-annex.
2015-02-28 16:59:52 -04:00
Joey Hess
afb3e3e472 avoid crash when starting fsck --incremental when one is already running
Turns out sqlite does not like having its database deleted out from
underneath it. It might suffice to empty the table, but I would rather
start each fsck over with a new database, so I added a lock file, and
running incremental fscks use a shared lock.

This leaves one concurrency bug left; running two concurrent fsck --more
will lead to: "SQLite3 returned ErrorBusy while attempting to perform step."
and one or both will fail. This is a concurrent writers problem.
2015-02-17 13:30:24 -04:00
Joey Hess
15107d2c5a propigate ssh-options everywhere ssh caching is used
* sync: Use the ssh-options git config when doing git pull and push.
* remotedaemon: Use the ssh-options git config.

Note that the rename env var means that if a new git-annex calls an old one
for git-annex ssh, or a new calls an old, nothing much will go wrong;
just ssh caching won't happen.
2015-02-12 16:14:53 -04:00
Joey Hess
5be7ba7ee5 The ssh-options git config is now used by gcrypt, rsync, and ddar special remotes that use ssh as a transport. 2015-02-12 15:44:10 -04:00
Joey Hess
7fce85adac Improve race recovery code when committing to git-annex branch. 2015-02-09 18:34:48 -04:00
Joey Hess
b94eb9b22c relFile does not have to be relative; rename to currFile 2015-02-06 16:03:02 -04:00
Joey Hess
c8163ce29a use a Set 2015-01-28 18:17:10 -04:00
Joey Hess
b0575c621f implement annex.tune.branchhash1
I hope this doesn't impact speed much -- it does have to pull out a value
from Annex state every time it accesses the branch now.

The test case I dropped has never caught any problems that I can remember,
and would have been rather difficult to convert.
2015-01-28 17:17:26 -04:00
Joey Hess
009bd050c1 implement annex.tune.objecthashlower
Split out Annex.DirHashes which never really belonged in Locations.
2015-01-28 16:52:08 -04:00
Joey Hess
e8c376e0ad import Data.Default in Common 2015-01-28 16:11:28 -04:00
Joey Hess
ba3825441c rework Differences data type
Eliminated complexity and future proofed. The most important change is that
all functions over Difference are now total; any Difference that can be
expressed should be handled. Avoids needs for sanity checking of inputs,
and version skew with the future.

Also, the difference.log now serializes a [Difference], not a Differences.
This saves space and keeps it simpler.

Note that [Difference] might contain conflicting differences (eg,
[Version5, Version6]. In this case, one of them needs to consistently win
over the others, probably based on Ord.
2015-01-28 13:50:02 -04:00
Joey Hess
70736d2b41 Repository tuning parameters can now be passed when initializing a repository for the first time.
* init: Repository tuning parameters can now be passed when initializing a
  repository for the first time. For details, see
  http://git-annex.branchable.com/tuning/
* merge: Refuse to merge changes from a git-annex branch of a repo
  that has been tuned in incompatable ways.
2015-01-27 17:38:06 -04:00
Joey Hess
f50b6779f9 Fix default repository description created by git annex init, which got broken by the relative path changes in the last release. 2015-01-22 14:59:57 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
068aaf943b on second thought, InodeCache should use getFileSize
This is necessary for interop between inode caches created on unix and
windows. Which is more important than supporting inodecaches for large keys
with the wrong size, which are broken anyway.

There should be no slowdown from this change, except on Windows.
2015-01-20 19:35:50 -04:00
Joey Hess
4f657aa14e add getFileSize, which can get the real size of a large file on Windows
Avoid using fileSize which maxes out at just 2 gb on Windows.
Instead, use hFileSize, which doesn't have a bounded size.
Fixes support for files > 2 gb on Windows.

Note that the InodeCache code only needs to compare a file size,
so it doesn't matter it the file size wraps. So it has been
left as-is. This was necessary both to avoid invalidating existing inode
caches, and because the code passed FileStatus around and would have become
more expensive if it called getFileSize.

This commit was sponsored by Christian Dietrich.
2015-01-20 17:09:24 -04:00
Joey Hess
6035f94666 Windows: Fix running of the pre-commit-annex hook. 2015-01-20 14:48:16 -04:00
Joey Hess
f4de021a54 convert parentDir to be based on takeDirectory, but fixed for trailing / 2015-01-09 14:26:52 -04:00
Joey Hess
3bab5dfb1d revert parentDir change
Reverts 965e106f24

Unfortunately, this caused breakage on Windows, and possibly elsewhere,
because parentDir and takeDirectory do not behave the same when there is a
trailing directory separator.
2015-01-09 13:11:56 -04:00
Joey Hess
184ad45b42 Merge branch 'master' into relativepaths 2015-01-06 21:10:01 -04:00
Joey Hess
d7f1449b2b fix view generation code to work when run in a subdirectory; no longer needs to setCurrentDirectory to top of repo 2015-01-06 21:01:05 -04:00
Joey Hess
858d776352 Merge branch 'master' into relativepaths
Conflicts:
	Locations.hs
	debian/changelog
2015-01-06 19:00:01 -04:00
Joey Hess
965e106f24 made parentDir return a Maybe FilePath; removed most uses of it
parentDir is less safe than takeDirectory, especially when working
with relative FilePaths. It's really only useful in loops that
want to terminate at /

This commit was sponsored by Audric SCHILTKNECHT.
2015-01-06 18:55:56 -04:00
Joey Hess
8a1c5956eb absolute path to index file; test suite passes
There are still known problems; for example git annex view a=b fails when
run in a subdir of the repo.
2015-01-06 17:34:02 -04:00
Joey Hess
d8a2f658dd direct mode merge relative path trickiness
This fixes 9 test suite failures. There are some tricky things going on
with the paths to the index file, and git's working directory, which
are hard to get right with relative paths. So, I switched back to absolute
here, at least for now.

Only 2 test suite failures remain on this branch, but there are other
potential problems the test suite doesn't catch. Including some calls to
setCurrentDirectory -- I was wrong and git-annex does do that in a few
places, like when generating a view.
2015-01-06 17:18:12 -04:00
Joey Hess
cd865c3b8f Switch to using relative paths to the git repository.
This allows the git repository to be moved while git-annex is running in
it, with fewer problems.

On Windows, this avoids some of the problems with the absurdly small
MAX_PATH of 260 bytes. In particular, git-annex repositories should
work in deeper/longer directory structures than before. See
http://git-annex.branchable.com/bugs/__34__git-annex:_direct:_1_failed__34___on_Windows/

There are several possible ways this change could break git-annex:

1. If it changes its working directory while it's running, that would
   be Bad News. Good news everyone! git-annex never does so. It would also
   break thread safety, so all such things were stomped out long ago.

2. parentDir "." -> "" which is not a valid path. I had to fix one
   instace of this, and I should probably wipe all calls to parentDir out
   of the git-annex code base; it was never a good idea.

3. Things like relPathDirToFile require absolute input paths,
   and code assumes that the git repo path is absolute and passes it to it
   as-is. In the case of relPathDirToFile, I converted it to not make
   this assumption.

Currently, the test suite has 16 failures.
2015-01-06 16:19:41 -04:00
Joey Hess
a4cf80f460 Windows: Fix handling of views of filenames containing '%' 2014-12-30 17:48:04 -04:00
Joey Hess
402bfff665 fix test case on windows
"a:" is an absolute path, so viewedfile test cannot be run on it.
2014-12-30 16:04:06 -04:00
Joey Hess
33f1062bc3 Revert "temporary debugging code for windows autobuilder test suite failure"
This reverts commit 0d9fbd18c1.
2014-12-30 15:18:38 -04:00
Joey Hess
0d9fbd18c1 temporary debugging code for windows autobuilder test suite failure 2014-12-30 15:17:51 -04:00
Joey Hess
c9a3e80d32 fixed all remaining build warnings on Windows 2014-12-29 17:30:20 -04:00
Joey Hess
7e422269a6 move dummy uuids to Annex.UUID 2014-12-17 13:57:52 -04:00
Joey Hess
7ae16bb6f7 Revert "let url claims optionally include a suggested filename"
This reverts commit 85df9c30e9.

Putting filename in the claim was a bad idea.
2014-12-11 14:09:57 -04:00
Joey Hess
85df9c30e9 let url claims optionally include a suggested filename 2014-12-11 12:47:57 -04:00
Joey Hess
6ecd3ff421 diffdriver: New git-annex command, to make git external diff drivers work with annexed files.
Closes https://github.com/datalad/datalad/issues/18
2014-11-24 16:14:06 -04:00
Joey Hess
864086a956 proxy: for all your direct mode repository munging needs
This allows bypassing the direct mode guard in a safe way to do all sorts
of things including git revert, git mv, git checkout ...

This commit was sponsored by the WikiMedia Foundation.
2014-11-12 15:51:46 -04:00
Joey Hess
5ccc2a2d7c no longer used imports 2014-11-06 14:18:38 -04:00
Joey Hess
334f366979 Remove fixup code for bad bare repositories created by versions 5.20131118 through 5.20131127. That fixup code would accidentially fire when --git-dir was incorrectly pointed at the working tree of a git-annex repository, resulting in data loss. Closes: #768093 2014-11-04 18:04:19 -04:00
Joey Hess
0f6aaf8012 Windows: Fix crash when user.name is not set in git config. 2014-10-31 16:14:12 -04:00
Joey Hess
4edfda59c0 fix windows build 2014-10-16 15:48:30 -04:00
Joey Hess
1e59df083d Use haskell setenv library to clean up several ugly workarounds for inability to manipulate the environment on windows.
Didn't know that this library existed!

This includes making git-annex not re-exec itself on start on windows, and
making the test suite on Windows run tests without forking.
2014-10-15 20:33:52 -04:00
Joey Hess
db9121ecee vicfg: Deleting configurations now resets to the default, where before it has no effect.
Added a Default instance for TrustLevel, and was able to use that to clear
up several other parts of the code too.

This commit was sponsored by Stephan Schulz
2014-10-14 14:15:07 -04:00
Joey Hess
9fd95d9025 indent with tabs not spaces
Found these with:
git grep "^  " $(find -type  f -name \*.hs) |grep -v ':  where'

Unfortunately there is some inline hamlet that cannot use tabs for
indentation.

Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm
leaving it as-is.
2014-10-09 15:09:26 -04:00
Joey Hess
7b50b3c057 fix some mixed space+tab indentation
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.

Done as a background task while listening to episode 2 of the Type Theory
podcast.
2014-10-09 15:09:11 -04:00
Joey Hess
0598412e5c Fix transfer lock file FD leak that could occur when two separate git-annex processes were both working to perform the same set of transfers. 2014-09-11 13:53:26 -04:00
Joey Hess
b874f84086 New annex.hardlink setting. Closes: #758593
* New annex.hardlink setting. Closes: #758593
* init: Automatically detect when a repository was cloned with --shared,
  and set annex.hardlink=true, as well as marking the repository as
  untrusted.

Had to reorganize Logs.Trust a bit to avoid a cycle between it and
Annex.Init.
2014-09-05 13:44:09 -04:00
Joey Hess
6eb5c3f479 Do not preserve permissions and acls when copying files from one local git repository to another. Timestamps are still preserved as long as cp --preserve=timestamps is supported.
This avoids cp -a overriding the default mode acls that the user might have
set in a git repository.

With GNU cp, this behavior change should not be a breaking change, because
git-anex also uses rsync sometimes in the same situation, and has only ever
preserved timestamps when using rsync.

Systems without GNU cp will no longer use cp -a, but instead just cp.
So, timestamps will no longer be preserved. Preserving timestamps when
copying between repos is not guaranteed anyway.

Closes: #729757
2014-08-26 17:10:25 -07:00
Joey Hess
2b234634f6 fix imports for windows 2014-08-23 16:27:24 -07:00
Joey Hess
aebcc395ff use types to enforce that removeAnnex can only be called inside lockContent
This fixed one bug where it needed to be and wasn't (in Assistant.Unused).
And also found one place where lockContent was used unnecessarily (by
drop --from remote).

A few other places like uninit probably don't really need to lockContent,
but it doesn't hurt to do call it anyway.

This commit was sponsored by David Wagner.
2014-08-20 20:13:47 -04:00
Joey Hess
1994771215 more lock file refactoring
Also fixes a test suite failures introduced in recent commits, where
inAnnexSafe failed in indirect mode, since it tried to open the lock file
ReadWrite. This is why the new checkLocked opens it ReadOnly.

This commit was sponsored by Chad Horohoe.
2014-08-20 18:58:14 -04:00
Joey Hess
e386e26ef2 avoid trying to create a content file in order to lock it
The nice refactoring in ec7dd0446a
highlighted a bug in lockContent -- when the content is not present,
this incorrectly created an empty lock file, using the same filename
as the content file.

This seems like it could result in empty objects, which fsck would detect
and complain about. Both drop and move --to call lockContent, as does
Remote.Git.dropKey -- I think we got lucky and this bug didn't show up
because both all of those only operate on files that are present. So
this bug could only manifest if there was a race, and a file's content
was dropped at just the wrong time, just as another process was about to
drop it. (And then only if the other process's dropping failed, otherwise
it'd delete the empty object file.)

Hmm, move --from also called lockContent. Unnecessarily, since the content
is not being removed from the local annex. In this case, the combination of
the 2 bugs could result in an empty lock file being written, and then if
the download of the content failed, left in the object directory as the
content.

This commit also optimises lockContent, avoiding an unncessary
doesFileExist test and instead just catching the exception that's thrown
when the file doesn't exist.

This commit was sponsored by Justine Lam.
2014-08-20 17:25:30 -04:00
Joey Hess
ec7dd0446a more lock file refactoring 2014-08-20 17:03:04 -04:00