Commit graph

257 commits

Author SHA1 Message Date
Joey Hess
bfb4095c13
Improve behavior when a just added http remote is not available during uuid probe. Do not mark it as annex-ignore, so it will be tried again later. 2016-05-03 12:53:42 -04:00
Joey Hess
850d0da699
Fix duplicate progress meter display when downloading from a git remote over http with -J. 2016-04-19 13:10:56 -04:00
Joey Hess
2d7e46ea98
fix drop hang reported by musicmatze
Fix hang when dropping content needs to lock the content on a ssh remote,
which occurred when the remote has git-annex version 5.20151019 or newer.

Analysis: `race` runs 2 threads at once, and the hGetLine finishes first.
So, it tries to cancel the waitForProcess, but unfortunately that is making
a foreign call and so cannot be canceled. The remote git-annex-shell
is waiting for a line on stdin before it will exit. Deadlock.

This only occurred sometimes; I reproduced it going from darkstar to
elephant, but not from darkstar to darkstar. Not sure how that fits into
the above analysis -- perhaps a race condition is also involved?

Fixed by not using `race`; now the hGetLine will fail with an exception
if the remote git-annex-shell exits without any output.
2016-04-18 14:04:50 -04:00
Joey Hess
cf06dac2b8
hard links on windows
* annex.thin and annex.hardlink are now supported on Windows.
* unannex --fast now makes hard links on Windows.
2016-04-08 15:25:32 -04:00
Joey Hess
737e45156e
remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Joey Hess
ecd0684bfc
avoid hard linking object from other repository when annex.thin is set
This is simpler and less expensive than checking if the src file has a
link count >= 2, and also is unlocked.
2016-01-13 14:19:31 -04:00
Joey Hess
2513c1dfd0
remove reundant isDirect check
Already checked in wantHardLink
2016-01-13 14:13:37 -04:00
Joey Hess
d0da52f1b1
typo 2015-12-26 15:11:32 -04:00
Joey Hess
1b55af4c3c
deal with unlocked files when calling rsyncParamsRemote
In copyFromRemote, it used to check isDirect, but that was not needed;
the remote is sending the file, so it doesn't matter if the local,
receiving repository is in direct mode or not. And, since the content is not
present, yet, it's certianly not unlocked. Note that, the remote may indeed
be sending an unlocked file, but sendkey uses sendAnnex, which will detect
if the file is modified before or during transfer, and will exit nonzero,
aborting the upload. So, the receiver doesn't need any checks.

In copyToRemote, it forces recvkey to verify content whenever it's being
sent from a v6 repository. recvkey is almost always going to verify content
anyway, unless annex.verify is not set. So, this doesn't make it any more
expensive, except for in that unusual configuration. The alternative would
be to change the recvkey interface, so that the sender checks afterwards if
what it was sending changed, and the receiver then throws out the bad
transfer. That would be less expensive for the reciever, as it would not
need to do a checksum verification. But, it would mean another network
round trip, and since rsync closes the connection, it would need to open
another ssh connection to do this. Even with connction caching, that would
add latency to uploads. It would also complicate the interface, especially
because an older git-annex-shell would not have the new interface
available. For these reasons, I prefer punting on that at this time, and
instead someone might set annex.verify=false and be unhappy that it still
verifies..

(One other gotcha not dealt with is that a v5 repo could be upgraded to v6
while an upload is in progress, and a file unlocked and modified.)

(Also, I double-checked Remote.GCrypt's calls to rsyncParamsRemote, and
they're fine. When a file is being uploaded to gcrypt, or any other special
repository, it is mediated by sendAnnex, so changes will be detected at
that level and the special remote implementation doesn't need to worry
about them.)
2015-12-26 14:16:27 -04:00
Joey Hess
2b8f6b8b2f
check inode cache in prepSendAnnex
This does mean one query of the database every time an object is sent.
May impact performance.
2015-12-10 14:50:52 -04:00
Joey Hess
e97fce35a6
Display progress meter in -J mode when downloading from the web.
Including in addurl, and get --from web, but also in S3 and External
special remotes when a web url is known for content in those remotes.
2015-11-16 21:00:54 -04:00
Joey Hess
1244eb3770
refactor 2015-11-16 20:27:01 -04:00
Joey Hess
7943442dff
Display progress meter in -J mode when copying from a local git repo, to a local git repo, and from a remote git repo.
Had everything available, just didn't combine the progress meter with the
other places progress is sent to update it. (And to a remote repo already
did show progress.)

Most special remotes should already display progress meters with -J,
same as without it. One exception to this is the web, since it relies on
wget/curl progress display without -J. Still todo..
2015-11-16 19:32:30 -04:00
Joey Hess
4fd03ccd7b
concurrent-output, first pass
Output without -Jn should be unchanged from before. With -Jn,
concurrent-output is used for messages, but regions are not used yet, so
it's a mess.
2015-11-04 13:45:34 -04:00
Joey Hess
806819be57
Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set.
Instead, only display transport error if the configlist output doesn't
include an annex.uuid line, even an empty one.

A recent change made git-annex init try to get all the remote uuids, and so
the transport error would be displayed by it. It was also displayed when
eg, copying files to a remote that had no uuid yet.
2015-10-15 15:36:54 -04:00
Joey Hess
b0e5c09408
fix various build warnings, mostly on Windows
And some when S3 is disabled
2015-10-13 13:24:44 -04:00
Joey Hess
2154b7a38f
add inAnnex check to local lockKey 2015-10-09 18:00:37 -04:00
Joey Hess
6145f905e0
improve display when lockcontent fails
/dev/null stderr; ssh is still able to display a password prompt
despite this

Show some messages so the user knows it's locking a remote, and
knows if that locking failed.
2015-10-09 17:31:02 -04:00
Joey Hess
3b89d5a20c
implement lockContent for ssh remotes 2015-10-09 16:55:41 -04:00
Joey Hess
6a72045707
fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal 2015-10-09 15:48:02 -04:00
Joey Hess
865dd11dbf
fix lockKey to run callback in original Annex monad, not local remote's 2015-10-09 13:35:28 -04:00
Joey Hess
4c6095b6f5
content locking during drop working for local git remotes
Only ssh remotes lack locking now
2015-10-09 13:12:58 -04:00
Joey Hess
b1abe59193
add removeKey action to Remote
Not implemented for any remotes yet; probably the git remote is the only
one that will ever implement it.
2015-10-08 15:01:38 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
2def1d0a23 other 80% of avoding verification when hard linking to objects in shared repo
In c6632ee5c8, it actually only handled
uploading objects to a shared repository. To avoid verification when
downloading objects from a shared repository, was a lot harder.

On the plus side, if the process of downloading a file from a remote
is able to verify its content on the side, the remote can indicate this
now, and avoid the extra post-download verification.

As of yet, I don't have any remotes (except Git) using this ability.
Some more work would be needed to support it in special remotes.

It would make sense for tahoe to implicitly verify things downloaded from it;
as long as you trust your tahoe server (which typically runs locally),
there's cryptographic integrity. OTOH, despite bup being based on shas,
a bup repo under an attacker's control could have the git ref used for an
object changed, and so a bup repo shouldn't implicitly verify. Indeed,
tahoe seems unique in being trustworthy enough to implicitly verify.
2015-10-02 14:35:12 -04:00
Joey Hess
c6632ee5c8 avoid verification when hard linking to objects in shared repository
Such a repository is implicitly trusted, so there's no point.
2015-10-02 12:36:03 -04:00
Joey Hess
2fb3722ce9 Do verification of checksums of annex objects downloaded from remotes.
* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
2015-10-01 15:56:39 -04:00
Joey Hess
807ba6a903 refactor 2015-10-01 14:07:06 -04:00
Joey Hess
ffa8221517 annex.hardlink extended to also try to use hard links when copying from the repository to a remote.
Also, it used to only check that one of the repos was not in direct mode;
now when either repo is direct mode, annex.hardlink won't have an effect.
2015-09-14 12:13:38 -04:00
Joey Hess
127c3db162 add some debugs to get timings
Note that I had one in Annex.Action.startup too, but it resulted in a weird
message printed by ssh, "channel 2: bad ext data". I don't know why, but
it only happened when transferinfo was run, so I wonder
if 983a95f021 introduced a fragility somehow.
2015-08-13 16:13:16 -04:00
Joey Hess
983a95f021 Sped up downloads of files from ssh remotes, reducing the non-data-transfer overhead 6x. 2015-08-13 14:20:28 -04:00
Joey Hess
f99ae3d713 remove debug print 2015-08-13 13:18:47 -04:00
Joey Hess
c5b8484c2e Simplify setup process for a ssh remote.
Now it suffices to run git remote add, followed by git-annex sync. Now the
remote is automatically initialized for use by git-annex, where before the
git-annex branch had to manually be pushed before using git-annex sync.
Note that this involved changes to git-annex-shell, so if the remote is
using an old version, the manual push is still needed.

Implementation required git-annex-shell be changed, so configlist can
autoinit a repository even when no git-annex branch has been pushed yet.
Unfortunate because we'll have to wait for it to get deployed to servers
before being able to rely on this change in the documentation.

Did consider making git-annex sync push the git-annex branch to repos that
didn't have a uuid, but this seemed difficult to do without complicating it
in messy ways.

It would be cleaner to split a command out from configlist to handle
the initialization. But this is difficult without sacrificing backwards
compatability, for users of old git-annex versions which would not use the
new command.
2015-08-05 13:49:58 -04:00
Joey Hess
61ccf95004 Avoid accumulating transfer failure log files unless the assistant is being used.
Only the assistant uses these, and only the assistant cleans them up, so
make only git annex transferkeys write them,

There is one behavior change from this. If glacier is being used, and a
manual git annex get --from glacier fails because the file isn't available
yet, the assistant will no longer later see that failed transfer file and
retry the get. Hope no-one depended on that old behavior.
2015-05-12 15:53:38 -04:00
Joey Hess
e27b97d364 Merge branch 'master' into concurrentprogress
Conflicts:
	Command/Fsck.hs
	Messages.hs
	Remote/Directory.hs
	Remote/Git.hs
	Remote/Helper/Special.hs
	Types/Remote.hs
	debian/changelog
	git-annex.cabal
2015-05-12 13:23:22 -04:00
Joey Hess
addc82dab7 removed all uses of undefined from code base
It's a code smell, can lead to hard to diagnose error messages.
2015-04-19 00:38:29 -04:00
Joey Hess
0def1f0b53 Fix fsck --from a git remote in a local directory, and from a directory special remote. This was a reversion caused by the relative path changes in 5.20150113.
The directory special remote was not affected in its normal configuration,
since annex-directory is an absolute path normally. But it could fail
when a relative path was used.

The git remote was affected even when an absolute path to it was used in
.git/config, since git-annex now converts all such paths to relative.
2015-04-18 13:36:12 -04:00
Joey Hess
a2902cdaaf add filename to progress bar, and display ok/failed at end
This needed plumbing an AssociatedFile through retrieveKeyFileCheap.
2015-04-14 16:35:10 -04:00
Joey Hess
dc4de7faf7 add missing progress bar 2015-04-14 16:00:20 -04:00
Joey Hess
75b6b5cbc7 only display built-in meters in parallel mode 2015-04-10 15:20:23 -04:00
Joey Hess
f8e700ed06 use built-in progress meters for git when in parallel mode 2015-04-10 15:15:21 -04:00
Joey Hess
e87f3b40eb propigate outer output state into inner state when running onLocal
Otherwise, progress displays would not be suppressed here when running with
--quiet. Interesting wrinkle!
2015-04-03 20:08:38 -04:00
Joey Hess
450ee53ab6 When re-execing git-annex, use current program location, rather than ~/.config/git-annex/program, when possible.
Most of the time, there will be no discreprancy between programPath and
readProgramFile.

But, the programFile might have been written by an old version of git-annex
that is still installed, while a newer one is currently running. In this
case, we want to run the same one that's currently running.

This is especially important for things like the GIT_SSH=git-annex used for
ssh connection caching.

The only code that still uses readProgramFile directly is the upgrade code,
which needs to know where the standalone git-annex was installed, in order to
upgrade it.
2015-02-28 17:23:13 -04:00
Joey Hess
a22eaaae27 comment 2015-02-09 14:16:42 -04:00
Joey Hess
009bd050c1 implement annex.tune.objecthashlower
Split out Annex.DirHashes which never really belonged in Locations.
2015-01-28 16:52:08 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
4f657aa14e add getFileSize, which can get the real size of a large file on Windows
Avoid using fileSize which maxes out at just 2 gb on Windows.
Instead, use hFileSize, which doesn't have a bounded size.
Fixes support for files > 2 gb on Windows.

Note that the InodeCache code only needs to compare a file size,
so it doesn't matter it the file size wraps. So it has been
left as-is. This was necessary both to avoid invalidating existing inode
caches, and because the code passed FileStatus around and would have become
more expensive if it called getFileSize.

This commit was sponsored by Christian Dietrich.
2015-01-20 17:09:24 -04:00
Joey Hess
534c29deae implemented old Richih wishlist about remote/uuid info
* info: Can now display info about a given uuid.
  * Added to remote/uuid info: Count of the number of keys present
    on the remote, and their size. This is rather expensive to calculate,
    so comes last and --fast will disable it.
  * Git remote info now includes the date of the last sync with the remote.
2015-01-13 18:13:14 -04:00
Joey Hess
3bab5dfb1d revert parentDir change
Reverts 965e106f24

Unfortunately, this caused breakage on Windows, and possibly elsewhere,
because parentDir and takeDirectory do not behave the same when there is a
trailing directory separator.
2015-01-09 13:11:56 -04:00
Joey Hess
965e106f24 made parentDir return a Maybe FilePath; removed most uses of it
parentDir is less safe than takeDirectory, especially when working
with relative FilePaths. It's really only useful in loops that
want to terminate at /

This commit was sponsored by Audric SCHILTKNECHT.
2015-01-06 18:55:56 -04:00