Commit graph

26 commits

Author SHA1 Message Date
Joey Hess
4c32499e82
Parse youtube-dl progress output
Which lets progress be displayed when doing concurrent downloads.
Amoung other things, like --json-progress etc.

The youtube-dl output is no longer displayed, except for any errors.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-29 17:53:48 -04:00
Joey Hess
438dbe3b66
convert to withCreateProcess for async exception safety
This handles all sites where checkSuccessProcess/ignoreFailureProcess
is used, except for one: Git.Command.pipeReadLazy
That one will be significantly more work to convert to bracketing.

(Also skipped Command.Assistant.autoStart, but it does not need to
shut down the processes it started on exception because they are
git-annex assistant daemons..)

forceSuccessProcess is done, except for createProcessSuccess.
All call sites of createProcessSuccess will need to be converted
to bracketing.

(process pools still todo also)
2020-06-04 12:44:09 -04:00
Joey Hess
890330f0fe
make --json-error-messages capture url download errors
Convert Utility.Url to return Either String so the error message can be
displated in the annex monad and so captured.

(When curl is used, its errors are still not caught.)
2019-11-12 13:52:38 -04:00
Joey Hess
1871295765
rename annex.security.allowed-http-addresses
Renamed annex.security.allowed-http-addresses to
annex.security.allowed-ip-addresses because it is not really specific to
the http protocol, also limiting eg, git-annex's use of ftp and via
youtube-dl, several other protocols.

The old name for the config will still work.

If both old and new name are set, the new name will win.
2019-05-30 12:43:40 -04:00
Joey Hess
40ecf58d4b
update licenses from GPL to AGPL
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.

Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.

(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
2019-03-13 15:48:14 -04:00
Joey Hess
84e71dae2e
comment typo 2018-12-30 15:51:20 -04:00
Joey Hess
ecdba3ed3f
When running youtube-dl to get a filename, pass --no-playlist
Seems that youtube-dl --get-filename on a playlist lists all the filenames
for the playlist, which can take quite some time. The code already only
took the first name, so --no-playlist can speed it up a lot.

This commit was sponsored by Brett Eisenberg on Patreon.
2018-11-28 17:14:47 -04:00
Joey Hess
813ee2357c
improve message 2018-09-02 16:08:00 -04:00
Joey Hess
a63bbd868b
make addurl of media url fail when youtube-dl is disabled
addurl: When security configuration prevents downloads with youtube-dl,
still check if the url is one that it supports, and fail downloading it,
instead of downloading the raw web page.
2018-06-28 13:01:18 -04:00
Joey Hess
e62c4543c3
default to not using youtube-dl, for security
Pity, but same reasoning as curl applies to it.

This commit was sponsored by Peter on Patreon.
2018-06-17 14:51:02 -04:00
Joey Hess
28720c795f
limit url downloads to whitelisted schemes
Security fix! Allowing any schemes, particularly file: and
possibly others like scp: allowed file exfiltration by anyone who had
write access to the git repository, since they could add an annexed file
using such an url, or using an url that redirected to such an url,
and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults
  to only allowing http and https URLs. Note especially that file:/
  is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow
  supporting annex.security.allowed-url-schemes across redirects.
  If you used this setting, you may want to instead use annex.web-options
  to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in
git-annex are made via Utility.Url via http-client or curl. http-client
only supports http and https, so no problem there.
(Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.

Note that this will cause git annex fsck --from web to mark files using
a disallowed url scheme as not being present in the web. That seems
acceptable; fsck --from web also does that when a web server is not available.

youtube-dl already disabled file: itself (probably for similar
reasons). The scheme check was also added to youtube-dl urls for
completeness, although that check won't catch any redirects it might
follow. But youtube-dl goes off and does its own thing with other
protocols anyway, so that's fine.

Special remotes that support other domain-specific url schemes are not
affected by this change. In the bittorrent remote, aria2c can still
download magnet: links. The download of the .torrent file is
otherwise now limited by annex.security.allowed-url-schemes.

This does not address any external special remotes that might download
an url themselves. Current thinking is all external special remotes will
need to be audited for this problem, although many of them will use
http libraries that only support http and not curl's menagarie.

The related problem of accessing private localhost and LAN urls is not
addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.
2018-06-16 11:57:50 -04:00
Joey Hess
2ec07bc29f
Avoid running annex.http-headers-command more than once. 2018-04-04 15:15:08 -04:00
Joey Hess
25703e1413
finally really add back custom-setup stanza
Fourth or fifth try at this and finally found a way to make it work.

Absurd amount of busy-work forced on me by change in cabal's behavior.
Split up Utility modules that need posix stuff out of ones used by
Setup. Various other hacks around inability for Setup to use anything
that ifdefs a use of unix.

Probably lost a full day of my life to this.
This is how build systems make their users hate them. Just saying.
2017-12-31 16:36:39 -04:00
Joey Hess
2bfdd690e2
addurl: Fix encoding of filename queried from youtube-dl when in --fast mode.
And also now in non-fast mode, since it was just changed to query for the
filename separately.

And avoid processTranscript which mixed up stdout and stderr and could have
led to weirdness if there were warnings that didn't get suppressed.
2017-12-31 15:19:01 -04:00
Joey Hess
fcdd9ce788
repeated addurl behavior reversion fix
addurl: When the file youtube-dl will download is already an annexed file,
don't download it again and fail to overwrite it, instead just do nothing,
like it used to when quvi was used.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-12-31 14:55:51 -04:00
Joey Hess
cc6f5d6e49
avoid trying youtube-dl for ftp and file url schemes
This commit was sponsored by John Peloquin on Patreon.
2017-12-11 12:46:34 -04:00
Joey Hess
8990afaef0
fix regression in addurl --fast caused by youtube-dl support
Similar to c6e4bc0a22 but another code
path. As well as using youtube-dl unecessarily, it used the filename it
comes up with, which while nice for youtube videos, is not right for
other files.

This means more work is done for urls that youtube-dl does support,
but is probably more efficient for other urls, since it only downloads
the first chunk of content, while youtube-dl probably downloads more.

This commit was supported by the NSF-funded DataLad project.
2017-12-08 14:51:17 -04:00
Joey Hess
c6e4bc0a22
fix regression in addurl --file caused by youtube-dl support
Now youtubeDlCheck downloads the beginning of the url's content and
checks if it's html, only when it is does it pass it off the youtube-dl
to check if it supports it.

This means more work is done for urls that youtube-dl does support,
but is probably more efficient for other urls, since it only downloads
the first chunk of content, while youtube-dl probably downloads more.

As well as the reported bug, this also fixes behavior when an url
was added with youtube-dl, but the url content has now changed from
a html page to something else. Remote.Web.checkKey used to wrongly
succeed in that situation, since youtube-dl said sure it can download
that something else.

This commit was supported by the NSF-funded DataLad project.
2017-12-06 13:22:31 -04:00
Joey Hess
fc845e6530
more lambda-case conversion 2017-12-05 15:00:50 -04:00
Joey Hess
1228fe8c86
honor annex.diskreserve when running youtube-dl
This commit was sponsored by André Pereira on Patreon.
2017-11-30 16:14:36 -04:00
Joey Hess
bbedc1c265
check youtube-dl for --fast and --relaxed when adding new file
The filename comes from youtube-dl also.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-11-30 14:57:20 -04:00
Joey Hess
2528e3ddb0
rethought --relaxed change
Better to make it not be surprising and slow, than surprising and fast.
--raw can be used when it needs to be really fast.

Implemented adding a youtube-dl supported url to an existing file.

This commit was sponsored by andrea rota.
2017-11-30 14:13:20 -04:00
Joey Hess
8a0038ec23
avoid warning when youtube-dl is not installed
If a user does not have it installed, don't warn on every imported item
about it.
2017-11-30 13:43:55 -04:00
Joey Hess
22a9389bc7
fix build 2017-11-30 13:21:19 -04:00
Joey Hess
31b4d7c6d0
pass git config options to youtube-dl --simulate
Decided not to --ignore-config by default. It the user has something in
their youtube-dl config files that breaks git-annex they can configure
it to use that option.
2017-11-29 20:07:03 -04:00
Joey Hess
99bebdface
youtube-dl working
Including resuming and cleanup of incomplete downloads.

Still todo: --fast, --relaxed, importfeed, disk reserve checking,
quvi code cleanup.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-11-29 16:40:32 -04:00