Commit graph

792 commits

Author SHA1 Message Date
Joey Hess
ecd0684bfc
avoid hard linking object from other repository when annex.thin is set
This is simpler and less expensive than checking if the src file has a
link count >= 2, and also is unlocked.
2016-01-13 14:19:31 -04:00
Joey Hess
2513c1dfd0
remove reundant isDirect check
Already checked in wantHardLink
2016-01-13 14:13:37 -04:00
Joey Hess
d0da52f1b1
typo 2015-12-26 15:11:32 -04:00
Joey Hess
1b55af4c3c
deal with unlocked files when calling rsyncParamsRemote
In copyFromRemote, it used to check isDirect, but that was not needed;
the remote is sending the file, so it doesn't matter if the local,
receiving repository is in direct mode or not. And, since the content is not
present, yet, it's certianly not unlocked. Note that, the remote may indeed
be sending an unlocked file, but sendkey uses sendAnnex, which will detect
if the file is modified before or during transfer, and will exit nonzero,
aborting the upload. So, the receiver doesn't need any checks.

In copyToRemote, it forces recvkey to verify content whenever it's being
sent from a v6 repository. recvkey is almost always going to verify content
anyway, unless annex.verify is not set. So, this doesn't make it any more
expensive, except for in that unusual configuration. The alternative would
be to change the recvkey interface, so that the sender checks afterwards if
what it was sending changed, and the receiver then throws out the bad
transfer. That would be less expensive for the reciever, as it would not
need to do a checksum verification. But, it would mean another network
round trip, and since rsync closes the connection, it would need to open
another ssh connection to do this. Even with connction caching, that would
add latency to uploads. It would also complicate the interface, especially
because an older git-annex-shell would not have the new interface
available. For these reasons, I prefer punting on that at this time, and
instead someone might set annex.verify=false and be unhappy that it still
verifies..

(One other gotcha not dealt with is that a v5 repo could be upgraded to v6
while an upload is in progress, and a file unlocked and modified.)

(Also, I double-checked Remote.GCrypt's calls to rsyncParamsRemote, and
they're fine. When a file is being uploaded to gcrypt, or any other special
repository, it is mediated by sendAnnex, so changes will be detected at
that level and the special remote implementation doesn't need to worry
about them.)
2015-12-26 14:16:27 -04:00
Joey Hess
f776ac0a11
add unlocked flag for git-annex-shell recvkey
The direct flag is also set when sending unlocked content, to support old
versions of git-annex-shell. At some point, the direct flag will be
removed, and only the unlocked flag will be used.
2015-12-26 13:59:27 -04:00
Joey Hess
c608a752a5
Merge branch 'master' into smudge 2015-12-11 13:50:31 -04:00
Joey Hess
0f126440ca
webdav: When testing the WebDAV server, send a file with content. The empty file it was sending tickled bugs in some php WebDAV server. 2015-12-11 12:13:20 -04:00
Joey Hess
2b8f6b8b2f
check inode cache in prepSendAnnex
This does mean one query of the database every time an object is sent.
May impact performance.
2015-12-10 14:50:52 -04:00
Joey Hess
f7d63a0117
tahoe: Include tahoe capabilities in whereis display. 2015-11-30 15:35:53 -04:00
Joey Hess
a9a10ee0a9
improve error message when special remote program cannot be run 2015-11-18 12:30:01 -04:00
Joey Hess
e97fce35a6
Display progress meter in -J mode when downloading from the web.
Including in addurl, and get --from web, but also in S3 and External
special remotes when a web url is known for content in those remotes.
2015-11-16 21:00:54 -04:00
Joey Hess
1244eb3770
refactor 2015-11-16 20:27:01 -04:00
Joey Hess
7943442dff
Display progress meter in -J mode when copying from a local git repo, to a local git repo, and from a remote git repo.
Had everything available, just didn't combine the progress meter with the
other places progress is sent to update it. (And to a remote repo already
did show progress.)

Most special remotes should already display progress meters with -J,
same as without it. One exception to this is the web, since it relies on
wget/curl progress display without -J. Still todo..
2015-11-16 19:32:30 -04:00
Joey Hess
aaf1ef268d
convert from Utility.LockPool to Annex.LockPool everywhere 2015-11-12 18:13:37 -04:00
Joey Hess
4fd03ccd7b
concurrent-output, first pass
Output without -Jn should be unchanged from before. With -Jn,
concurrent-output is used for messages, but regions are not used yet, so
it's a mess.
2015-11-04 13:45:34 -04:00
Joey Hess
4153507864
Fix failure to build with aws-0.13.0 and finish nearline support.
* Fix failure to build with aws-0.13.0.
* When built with aws-0.13.0, the S3 special remote can be used to create
  google nearline buckets, by setting storageclass=NEARLINE.
2015-11-02 11:14:03 -04:00
Joey Hess
806819be57
Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set.
Instead, only display transport error if the configlist output doesn't
include an annex.uuid line, even an empty one.

A recent change made git-annex init try to get all the remote uuids, and so
the transport error would be displayed by it. It was also displayed when
eg, copying files to a remote that had no uuid yet.
2015-10-15 15:36:54 -04:00
Joey Hess
c32a2429ed
S3: Fix support for using https.
Was using the http-only Manager before, not the tls-capable one.
2015-10-15 10:37:06 -04:00
Joey Hess
b0e5c09408
fix various build warnings, mostly on Windows
And some when S3 is disabled
2015-10-13 13:24:44 -04:00
Joey Hess
fa9333e99f
use action, not sideAction
sideAction is for things not generally related to the current action being
performed. And, it adds a newline after the side action. This was not the
right thing to use for stuff like "checksum", where doing a checksum is
part of the git annex get process, and indeed we want it to display
"(checksum...) ok"
2015-10-11 13:29:44 -04:00
Joey Hess
2154b7a38f
add inAnnex check to local lockKey 2015-10-09 18:00:37 -04:00
Joey Hess
6145f905e0
improve display when lockcontent fails
/dev/null stderr; ssh is still able to display a password prompt
despite this

Show some messages so the user knows it's locking a remote, and
knows if that locking failed.
2015-10-09 17:31:02 -04:00
Joey Hess
3b89d5a20c
implement lockContent for ssh remotes 2015-10-09 16:55:41 -04:00
Joey Hess
6a72045707
fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal 2015-10-09 15:48:02 -04:00
Joey Hess
865dd11dbf
fix lockKey to run callback in original Annex monad, not local remote's 2015-10-09 13:35:28 -04:00
Joey Hess
4c6095b6f5
content locking during drop working for local git remotes
Only ssh remotes lack locking now
2015-10-09 13:12:58 -04:00
Joey Hess
b1abe59193
add removeKey action to Remote
Not implemented for any remotes yet; probably the git remote is the only
one that will ever implement it.
2015-10-08 15:01:38 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
2def1d0a23 other 80% of avoding verification when hard linking to objects in shared repo
In c6632ee5c8, it actually only handled
uploading objects to a shared repository. To avoid verification when
downloading objects from a shared repository, was a lot harder.

On the plus side, if the process of downloading a file from a remote
is able to verify its content on the side, the remote can indicate this
now, and avoid the extra post-download verification.

As of yet, I don't have any remotes (except Git) using this ability.
Some more work would be needed to support it in special remotes.

It would make sense for tahoe to implicitly verify things downloaded from it;
as long as you trust your tahoe server (which typically runs locally),
there's cryptographic integrity. OTOH, despite bup being based on shas,
a bup repo under an attacker's control could have the git ref used for an
object changed, and so a bup repo shouldn't implicitly verify. Indeed,
tahoe seems unique in being trustworthy enough to implicitly verify.
2015-10-02 14:35:12 -04:00
Joey Hess
c6632ee5c8 avoid verification when hard linking to objects in shared repository
Such a repository is implicitly trusted, so there's no point.
2015-10-02 12:36:03 -04:00
Joey Hess
2fb3722ce9 Do verification of checksums of annex objects downloaded from remotes.
* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
2015-10-01 15:56:39 -04:00
Joey Hess
807ba6a903 refactor 2015-10-01 14:07:06 -04:00
Joey Hess
9e3ac97608 avoid deprecation warnings when built with http-client >= 0.4.18
Since I want git-annex to keep building on debian stable, I need to still
support the old http-client, which required explicit calls to
closeManager, or use of withManager to get Managers to close at appropriate
times. This is not needed in the new version, and so they added a
deprecation warning. IMHO much too early, because look at the mess I had to
go through to avoid that deprecation warning while supporting both
versions..
2015-10-01 13:48:56 -04:00
Joey Hess
20205b6073 avoid hard dependency on new version of aws 2015-09-22 11:04:26 -04:00
Joey Hess
26d6566307 S3 storage classes expansion
Added support for storageclass=STANDARD_IA to use Amazon's
new Infrequently Accessed storage.

Also allows using storageclass=NEARLINE to use Google's NearLine storage.

The necessary changes to aws to support this are in
https://github.com/aristidb/aws/pull/176
2015-09-17 17:20:01 -04:00
Joey Hess
ffa8221517 annex.hardlink extended to also try to use hard links when copying from the repository to a remote.
Also, it used to only check that one of the repos was not in direct mode;
now when either repo is direct mode, annex.hardlink won't have an effect.
2015-09-14 12:13:38 -04:00
Joey Hess
0390efae8c support gpg.program
When gpg.program is configured, it's used to get the command to run for
gpg. Useful on systems that have only a gpg2 command or want to use it
instead of the gpg command.
2015-09-09 18:06:49 -04:00
Joey Hess
6dad09a823 disable whereisKey for encrypted or chunked remotes
This only makes sense for public repos, that are not chunked, so
that there's a 1:1 from Key in the git-annex repo to file on the remote.
Rather than making every remote implementation deal with that, just disable
whereisKey when it doesn't make sense.
2015-08-19 14:16:01 -04:00
Joey Hess
858104078a make whereis show urls when web remote does not have content
This is needed when external special remotes register an url for a key.
2015-08-17 11:35:34 -04:00
Joey Hess
3a5b7dbaf0 External special remotes can now be built that can be used in readonly mode, where git-annex downloads content from the remote using regular http.
Note that, if an url is added to the web log for such a remote, it's not
distinguishable from another url that might be added for the web remote.
(Because the web log doesn't distinguish which remote owns a plain url.
Urls with a downloader set are distinguishable, but we're not using them
here.)

This seems ok-ish.. In such a case, both remotes will try to use both
urls, and both remotes should be able to.

The only issue I see is that dropping a file from the web remote will
remove both urls in this case. This is not often done, and could even
be considered a feature, I suppose.
2015-08-17 11:22:22 -04:00
Joey Hess
99b9a3f277 export some always failing methods for readonly remotes 2015-08-17 11:21:38 -04:00
Joey Hess
fb9d851258 refactor 2015-08-17 11:21:13 -04:00
Joey Hess
1cd3b7ddf0 refactor 2015-08-17 10:42:14 -04:00
Joey Hess
6bc46e384e Added WHEREIS to external special remote protocol. 2015-08-13 17:27:50 -04:00
Joey Hess
127c3db162 add some debugs to get timings
Note that I had one in Annex.Action.startup too, but it resulted in a weird
message printed by ssh, "channel 2: bad ext data". I don't know why, but
it only happened when transferinfo was run, so I wonder
if 983a95f021 introduced a fragility somehow.
2015-08-13 16:13:16 -04:00
Joey Hess
43aa881b47 --debug is passed along to git-annex-shell when git-annex is in debug mode. 2015-08-13 15:05:39 -04:00
Joey Hess
983a95f021 Sped up downloads of files from ssh remotes, reducing the non-data-transfer overhead 6x. 2015-08-13 14:20:28 -04:00
Joey Hess
f99ae3d713 remove debug print 2015-08-13 13:18:47 -04:00
Joey Hess
0ec9bc2200 Added support for SHA3 hashed keys (in 8 varieties), when git-annex is built using the cryptonite library.
While cryptohash has SHA3 support, it has not been updated for the final
version of the spec. Note that cryptonite has not been ported to all arches
that cryptohash builds on yet.
2015-08-06 15:02:25 -04:00
Joey Hess
c5b8484c2e Simplify setup process for a ssh remote.
Now it suffices to run git remote add, followed by git-annex sync. Now the
remote is automatically initialized for use by git-annex, where before the
git-annex branch had to manually be pushed before using git-annex sync.
Note that this involved changes to git-annex-shell, so if the remote is
using an old version, the manual push is still needed.

Implementation required git-annex-shell be changed, so configlist can
autoinit a repository even when no git-annex branch has been pushed yet.
Unfortunate because we'll have to wait for it to get deployed to servers
before being able to rely on this change in the documentation.

Did consider making git-annex sync push the git-annex branch to repos that
didn't have a uuid, but this seemed difficult to do without complicating it
in messy ways.

It would be cleaner to split a command out from configlist to handle
the initialization. But this is difficult without sacrificing backwards
compatability, for users of old git-annex versions which would not use the
new command.
2015-08-05 13:49:58 -04:00