respect urlinclude/urlexclude of other web special remotes

When a web special remote does not have urlinclude/urlexclude
configured, make it respect the configuration of other web special
remotes and avoid using urls that match the config of another.

Note that the other web special remote does not have to be enabled.
That seems ok; it would have been extra work to check only the ones
that are enabled.

The implementation does mean that the web special remote re-parses
its own config once at startup, as well as re-parsing the configs of any
other web special remotes. This should be a very small slowdown
unless there are lots of web special remotes.

Sponsored-by: Dartmouth College's DANDI project
Joey Hess 2023-01-10 14:58:53 -04:00
parent 0fc476f16e
commit 8a305e5fa3
5 changed files with 134 additions and 31 deletions

@@ -20,8 +20,23 @@ These parameters can be passed to `git annex initremote` or
 * `urlinclude` - Only use urls that match the specified glob.
   For example, `urlinclude="https://s3.amazonaws.com/*"`
-  Note: Globs are matched case-insensitively.
 
 * `urlexclude` - Don't use urls that match the specified glob.
   For example, to prohibit http urls, but allow https,
   use `urlexclude="http:*"`
-  Note: Globs are matched case-insensitively.
+
+Globs are matched case-insensitively.
+
+When there are multiple special remotes of type web, and some are not
+configured with `urlinclude` and/or `urlexclude`, those will avoid using
+urls that are matched by the configuration of other web remotes.
+
+For example, this creates a second web special remote named "slowweb" that
+is only used for urls on one host, and that has a higher cost than the
+"web" special remote. With this configuration, `git-annex get` will first
+try to get the file from the "web" special remote, which will avoid
+using any urls that match slowweb's urlinclude. Only if the content
+can't be downloaded from "web" (or some other remote) will it fall back
+to downloading from slowweb.
+
+	git-annex initremote --sameas=web slowweb type=web urlinclude='*//slowhost.com/*'
+	git config remote.slowweb.cost 300
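
The url selection described above can be sketched roughly as follows. This is an illustrative Python model only, not git-annex's actual Haskell implementation. The function names, the dict-based remote representation, and the reading of "matches the config of another remote" (passes that remote's `urlinclude` and is not hit by its `urlexclude`) are all assumptions made for the sketch. Python's `fnmatch` globs may also differ in detail from git-annex's glob syntax.

```python
from fnmatch import fnmatchcase


def glob_matches(glob, url):
    # Case-insensitive glob match, mirroring the documented behaviour.
    return fnmatchcase(url.lower(), glob.lower())


def is_configured(remote):
    # A remote counts as configured if it sets urlinclude and/or urlexclude.
    return bool(remote.get("urlinclude") or remote.get("urlexclude"))


def config_matches(remote, url):
    # Assumption: a url matches a remote's config when it passes the
    # remote's urlinclude (if set) and is not hit by its urlexclude (if set).
    include = remote.get("urlinclude")
    exclude = remote.get("urlexclude")
    if include and not glob_matches(include, url):
        return False
    if exclude and glob_matches(exclude, url):
        return False
    return True


def usable_urls(remote, web_remotes, urls):
    # A configured remote only consults its own urlinclude/urlexclude.
    if is_configured(remote):
        return [u for u in urls if config_matches(remote, u)]
    # An unconfigured remote avoids urls claimed by any other configured
    # web remote (enabled or not, per the commit message).
    others = [r for r in web_remotes if r is not remote and is_configured(r)]
    return [u for u in urls
            if not any(config_matches(o, u) for o in others)]
```

With a hypothetical `web = {"name": "web"}` and `slowweb = {"name": "slowweb", "urlinclude": "*//slowhost.com/*"}`, calling `usable_urls(web, [web, slowweb], urls)` drops the slowhost.com urls, while `usable_urls(slowweb, [web, slowweb], urls)` keeps only those. Which remote is tried first is decided separately by the remote costs.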