git-annex/doc/git-annex-addurl.mdwn
Joey Hess a325524454
--json-exceptions
Added a --json-exceptions option, which makes some exceptions be output in json.

The distinction is that --json-error-messages is for messages relating
to a particular ActionItem, while --json-exceptions is for messages that
are not, eg ones for a file that does not exist.

It's unfortunate that we need two switches with such a fine distinction
between them, but I'm worried about maintaining backwards compatability
in the json output, to avoid breaking anything that parses it, and this was
the way to make sure I didn't.

toplevelWarning is generally used for the latter kind of message. And
the other calls to toplevelWarning could be converted to showException. The
only possible gotcha is that if toplevelWarning is ever called after
starting acting on a file, it will add to the --json-error-messages of the
json displayed for that file and converting to showException would be a
behavior change. That seems unlikely, but I didn't convery everything to
avoid needing to satisfy myself it was not a concern.

Sponsored-by: Dartmouth College's Datalad project
2023-04-25 17:05:33 -04:00

172 lines
4.9 KiB
Markdown

# NAME
git-annex addurl - add urls to annex
# SYNOPSIS
git annex addurl `[url ...]`
# DESCRIPTION
Downloads each url to its own file, which is added to the annex.
When `youtube-dl` is installed, it can be used to check for a video
embedded in a web page at the url, and that is added to the annex instead.
(However, this is disabled by default as it can be a security risk.
See the documentation of annex.security.allowed-ip-addresses
in [[git-annex]](1) for details.)
Special remotes can add other special handling of particular urls. For
example, the bittorrent special remotes makes urls to torrent files
(including magnet links) download the content of the torrent,
using `aria2c`.
Normally the filename is based on the full url, so will look like
"www.example.com_dir_subdir_bigfile". In some cases, addurl is able to
come up with a better filename based on other information. Options can also
be used to get better filenames.
# OPTIONS
* `--fast`
Avoid immediately downloading the url. The url is still checked
(via HEAD) to verify that it exists, and to get its size if possible.
* `--relaxed`
Don't immediately download the url, and avoid storing the size of the
url's content. This makes git-annex accept whatever content is there
at a future point.
This is the fastest option, but it still has to access the network
to check if the url contains embedded media. When adding large numbers
of urls, using `--relaxed --raw` is much faster.
* `--raw`
Prevent special handling of urls by youtube-dl, bittorrent, and other
special remotes. This will for example, make addurl
download the .torrent file and not the contents it points to.
* `--no-raw`
Require content pointed to by the url to be downloaded using youtube-dl
or a special remote, rather than the raw content of the url. if that
cannot be done, the add will fail.
* `--file=name`
Use with a filename that does not yet exist to add a new file
with the specified name and the content downloaded from the url.
If the file already exists, addurl will record that it can be downloaded
from the specified url(s).
* `--preserve-filename`
When the web server (or torrent, etc) provides a filename, use it as-is,
avoiding sanitizing unusual characters, or truncating it to length, or any
other modifications.
git-annex will still check the filename for safety, and if the filename
has a security problem such as path traversal or a control character,
it will refuse to add it.
* `--pathdepth=N`
Rather than basing the filename on the whole url, this causes a path to
be constructed, starting at the specified depth within the path of the
url.
For example, adding the url http://www.example.com/dir/subdir/bigfile
with `--pathdepth=1` will use "dir/subdir/bigfile",
while `--pathdepth=3` will use "bigfile".
It can also be negative; `--pathdepth=-2` will use the last
two parts of the url.
* `--prefix=foo` `--suffix=bar`
Use to adjust the filenames that are created by addurl. For example,
`--suffix=.mp3` can be used to add an extension to the file.
* `--no-check-gitignore`
By default, gitignores are honored and it will refuse to download an
url to a file that would be ignored. This makes such files be added
despite any ignores.
* `--jobs=N` `-JN`
Enables parallel downloads when multiple urls are being added.
For example: `-J4`
Setting this to "cpus" will run one job per CPU core.
* `--batch`
Enables batch mode, in which lines containing urls to add are read from
stdin.
* `-z`
Makes the `--batch` input be delimited by nulls instead of the usual
newlines.
* `--with-files`
When batch mode is enabled, makes it parse lines of the form: "$url $file"
That adds the specified url to the specified file, downloading its
content if the file does not yet exist; the same as
`git annex addurl $url --file $file`
* `--json`
Enable JSON output. This is intended to be parsed by programs that use
git-annex. Each line of output is a JSON object corresponding to an url
being processed.
* `--json-progress`
Include progress objects in JSON output.
* `--json-error-messages`
Adds an "error-messages" field to the JSON that contains messages that
would normally be output to the standard error when processing an url.
* `--json-exceptions`
Output additional JSON objects for some exceptions that are not
associated with a particular url.
* `--backend`
Specifies which key-value backend to use.
* Also the [[git-annex-common-options]](1) can be used.
# CAVEATS
If annex.largefiles is configured, and does not match a file, `git annex
addurl` will add the non-large file directly to the git repository,
instead of to the annex. However, this is not done when --fast or --relaxed
is used.
# SEE ALSO
[[git-annex]](1)
[[git-annex-rmurl]](1)
[[git-annex-registerurl]](1)
[[git-annex-importfeed]](1)
# AUTHOR
Joey Hess <id@joeyh.name>
Warning: Automatically converted into a man page by mdwn2man. Edit with care.