Commit graph

2052 commits

Author SHA1 Message Date
Joey Hess
c34152777b
Use http-conduit for url downloads by default, annex.web-options enables curl
* For url downloads, git-annex now defaults to using a http library,
  rather than wget or curl. But, if annex.web-options is set, it will
  use curl. To use the .netrc file, run:
    git config annex.web-options --netrc
* git-annex no longer uses wget (and wget is no longer shipped with
  git-annex builds).

Note that curl is always run in silent mode, since the new API for
download has a MeterUpdate and doesn't make way for curl progress
output. It might be worth writing a parser for curl's progress output
to update the meter when using it, but I didn't bother with this edge
case for now.

This commit was supported by the NSF-funded DataLad project.
2018-04-06 17:36:20 -04:00
Joey Hess
0f6775f1ff
refactor sinkResponseFile and add downloadC
Remote.S3 and Remote.Helper.Http both had similar code to sink a
http-conduit Response to a file; refactor out sinkResponseFile.

downloadC downloads an url to a file using http-conduit, and supports
resuming. Falls back to curl to handle urls that http-conduit does not
support. This is not used yet, but the goal is to replace download with
it.

git-annex.cabal: conduit-extra was not actually used for a long time,
remove the dep. conduit moves into the main dependency list, but since
http-conduit was already in there, and it depends on conduit, that's not
really adding a new build dep.

This commit was supported by the NSF-funded DataLad project.
2018-04-06 16:07:08 -04:00
Joey Hess
9b98d3f630
better HTTP connection reuse
Enable HTTP connection reuse across multiple files, when git-annex
uses http-conduit. Before, a new Manager was created each time
Utility.Url used it. Now, a single Manager gets created the first time,
so connections are reused.

Doesn't help when external programs are used for url download,
but does speed up addurl --fast, fsck --from web, etc.

Testing fsck --fast --from web with 3 files, over high-latency
satellite internet, it sped up from 19.37s to 14.96s.

This commit was supported by the NSF-funded DataLad project.
2018-04-04 15:39:40 -04:00
Joey Hess
0783352fae
todo 2018-04-04 14:32:32 -04:00
Joey Hess
ae75eb06bc
exporttree support for adb special remote
This commit was sponsored by Michael Magin.
2018-03-27 16:28:41 -04:00
Joey Hess
2927618d35
Added adb special remote which allows exporting files to Android devices.
git annex testremote passes.

exportree not implemented yet, although the documentation talks about it,
since it will be the main way this remote will be used.

The adb push/pull progress is displayed for now; it would be better
to consume it and use it to update the git-annex progress bar.

This commit was sponsored by andrea rota.
2018-03-27 14:54:41 -04:00
Joey Hess
abffea5fcb
update 2018-03-21 09:19:06 -04:00
Joey Hess
1cf705d416
idea 2018-03-21 03:19:47 -04:00
Joey Hess
0ae2662f71
ideas 2018-03-21 02:25:53 -04:00
vrs+annex@ea5fa24dbb279be61a8e50adb638bf8366300717
ae462950a9 Added a comment: related bugs 2018-03-17 19:06:46 +00:00
Joey Hess
050ada746f
Added backends for the BLAKE2 family of hashes.
There are a lot of different variants and sizes, I suppose we might as well
export all the common ones.

Bump dep to cryptonite to 0.16, earlier versions lacked BLAKE2 support.
Even android has 0.16 or newer.

On Debian, Blake2bp_512 is buggy, so I have omitted it for now.
http://bugs.debian.org/892855

This commit was sponsored by andrea rota.
2018-03-13 16:23:42 -04:00
Joey Hess
0ceb6da0fa
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-13 15:08:28 -04:00
Joey Hess
4015c5679a
force verification when resuming download
When resuming a download and not using a rolling checksummer like rsync,
the partial file we start with might contain garbage, in the case where a
file changed as it was being downloaded. So, disabling verification on
resumes risked a bad object being put into the annex.

Even downloads with rsync are currently affected. It didn't seem worth the
added complexity to special case those to prevent verification, especially
since git-annex is using rsync less often now.

This commit was sponsored by Brock Spratlen on Patreon.
2018-03-13 14:50:49 -04:00
Joey Hess
31e1adc005
deal with unlocked files
P2P protocol version 1 adds VALID|INVALID after DATA; INVALID means the
file was detected to change content while it was being sent and so we
may not have received the valid content of the file.

Added new MustVerify constructor for Verification, which forces
verification even when annex.verify=false etc. This is used when INVALID
and in protocol version 0.

As well as changing git-annex-shell p2psdio, this makes git-annex tor
remotes always force verification, since they don't yet use protocol
version 1. Previously, annex.verify=false could skip verification when
using tor remotes, and let bad data into the repository.

This commit was sponsored by Jack Hill on Patreon.
2018-03-13 14:27:14 -04:00
https://openid.stackexchange.com/user/3ee5cf54-f022-4a71-8666-3c2b5ee231dd
7ea75e1132 Added a comment: How expensive is verification anyway? 2018-03-13 17:58:19 +00:00
Joey Hess
9930b1f140
a plan 2018-03-13 12:17:24 -04:00
Joey Hess
abe8346dca
response 2018-03-12 17:49:06 -04:00
Joey Hess
ad13b56c86
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-12 17:33:37 -04:00
Joey Hess
59e7f3cbb2
done for the day 2018-03-12 17:32:57 -04:00
Joey Hess
24f35f6acc
remove todo about assistant and network change
When the assistant detects a network change, it
stops using old git-annex transferkeys processes.
So, no problem that old git-annex-shell p2pstdio
connections are cached; they won't be reused after
network change.
2018-03-12 17:24:40 -04:00
Joey Hess
b654597c2f
urk, this is hard 2018-03-12 17:24:18 -04:00
Joey Hess
c3df5d1f10
avoid double-connect to unreachable ssh remote
When git-annex-shell p2pstdio fails with 255, it's because the ssh
server is not reachable. Avoid running the fallback action in this case,
since it would just try a second time to connect, and presumably fail.

Note that the closed P2PSshConnection will not be stored in the pool,
so the next request tries again to connect. This is just the right
behavior; when the remote becomes reachable again, the same git-annex
process will start using it.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-03-12 16:50:21 -04:00
Joey Hess
e36ceb7448
open todo for p2p protocol flag days 2018-03-12 16:28:54 -04:00
davicastro
56bab023b9 Added a comment: What are the drawbacks of repo v6? 2018-03-12 19:00:36 +00:00
Joey Hess
c81768d425
version the P2P protocol
Unfortunately ReceiveMessage didn't handle unknown messages the way it
was documented to; client sending VERSION would cause the server to
return an ERROR and hang up. Fixed that, but old releases of git-annex
use the P2P protocol for tor and will still have that behavior.

So, version is not negotiated for Remote.P2P connections, only for
Remote.Git connections, which will support VERSION from their first
release. There will need to be a later flag day to change Remote.P2P;
left a commented out line that is the only thing that will need to be
changed then.

Version 1 of the P2P protocol is not implemented yet, but updated
the docs for the DATA change that will be allowed by that version.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2018-03-12 14:36:35 -04:00
Joey Hess
f74c970c7e
updates 2018-03-09 13:55:25 -04:00
Joey Hess
08814327ff
use P2P protocol for checkpresent, retrieve, and store
Note that, due to not using rsync to transfer files to ssh remotes
any longer, permissions and other file metadata of annexed files
will no longer be preserved when copying them to ssh remotes.
Other remotes never supported preserving that information, so
this is not considered a regression. Added NEWS item about this.

Another significant side effect of this is that, even when rsync is run to
retrieve a file, its progress display will no longer be shown, and
instead the native git-annex progress display will appear. It would be
possible to use the rsync process display when rsync is used (old
git-annex-shell and also retrieval from a local repository), but it
would have complicated the code unncessarily, and been inconsistent
behavior.

(I'd been thinking for a while about eliminating the rsync progress
display, since it's got some annoying verbosities, including display of
the key and the "(xfr#1, to-chk=0/1)" bit and was already somewhat
inconsistent.)

retrieveKeyFileCheap still uses rsync, since that ensures that it gets
the actual file content from the remote. Using the P2P protocol would
use the local content, as long as the local and remote size are the
same.

This commit was sponsored by John Pellman on Patreon.
2018-03-09 13:25:16 -04:00
Joey Hess
26febb4e58
benchmarks 2018-03-09 12:57:23 -04:00
Joey Hess
6a59bc4845
use P2P protocol for drop
Not yet used for everything else, but this is enough to
verify that it works, and do some benchmarking.

Some bugfixes included, which got it working. Also fallback to old
actions has been verified to work correctly.

Benchmarked dropping one thousand files from a ssh remote on localhost.
Using the old git-annex	40.867 seconds.
With the P2P protocol	9.905 seconds!

This commit was sponsored by Jochen Bartl on Patreon.
2018-03-08 16:56:17 -04:00
Joey Hess
0a294900b5
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-07 15:41:33 -04:00
Joey Hess
6ddfa9807b
implemented git-annex-shell p2pstdio
Not yet used by git-annex, but this will allow faster transfers etc than
using individual ssh connections and rsync.

Not called git-annex-shell p2p, because git-annex p2p does something
else and I don't want two subcommands with the same name between the two
for sanity reasons.

This commit was sponsored by Øyvind Andersen Holm.
2018-03-07 15:38:01 -04:00
yarikoptic
6023d82a43 Added a comment 2018-03-06 20:33:44 +00:00
Joey Hess
9b8f6c9cb9
followup 2018-03-06 14:51:27 -04:00
Joey Hess
f42baedd4c
designing new git-annex-shell multi
This commit was supported by the NSF-funded DataLad project.
2018-03-06 14:48:44 -04:00
Joey Hess
bf98afb91a
comment 2018-03-06 13:38:55 -04:00
Joey Hess
a4bd508643
todo 2018-02-28 16:52:13 -04:00
Joey Hess
bed6773346
Support exporttree=yes for rsync special remotes.
Renaming is not supported; it might be possible to use --fuzzy to get rsync
to notice the file is being renamed, but that is a bit ..fuzzy.

On the other hand, interrupted transfers of an exported file are resumed,
since rsync is great at that. Had to adjust the exporttree docs, which
said interrupted transfers would restart.

Note that remove no longer makes the empty directory dummy, instead
sending the top-level empty directory. This works just as well and I
noticed the dummy was unncessary when refactoring it into removeGeneric.
Verified that behavior of remove is not changed, and git annex
testremote does pass.

This commit was sponsored by Brock Spratlen on Patreon.
2018-02-28 13:36:20 -04:00
roger.herikstad@ca3b99b0263344ccfd8ec134a12261be25ef3504
7a4fe7e80c Added a comment: Not working with rsync? 2018-02-28 02:41:38 +00:00
Joey Hess
8104f2f758
done 2018-02-20 13:00:51 -04:00
Joey Hess
572754c925
update 2018-02-19 15:54:56 -04:00
yarikoptic
fb2c7d180a Added a comment 2018-02-16 18:42:28 +00:00
Joey Hess
b0633931d6
comment 2018-02-16 14:05:26 -04:00
Joey Hess
56711c310c
update 2018-02-16 13:56:05 -04:00
Joey Hess
846208b265
comment 2018-02-16 13:47:10 -04:00
Joey Hess
1758982760
comment 2018-02-16 13:28:13 -04:00
yarikoptic
36375ca060 Added a comment 2018-02-08 18:06:55 +00:00
Joey Hess
5109d406c0
comment 2018-02-08 13:31:36 -04:00
Joey Hess
68321be178
comment 2018-02-08 13:12:39 -04:00
yarikoptic
8cd0f8b28e 2018-02-06 22:35:41 +00:00
yarikoptic
638032f3aa Added a comment 2018-02-06 20:15:35 +00:00