Fix bug that made enableremote of S3 and webdav remotes, that have
embedcreds=yes, fail to set up the embedded creds, so accessing the remotes
failed.
(Regression introduced in version 7.20200202.7 in when reworking all the
remote configs to be parsed.)
Root problem is that parseEncryptionConfig excludes all other config keys
except encryption ones, so it is then unable to find the
credPairRemoteField. And since that field is not required to be
present, it proceeds as if it's not, rather than failing in any visible
way.
This causes it to not find any creds, and so it does not cache
them. When when the S3 remote tries to make a S3 connection, it finds no
creds, so assumes it's being used in no-creds mode, and tries to find a
public url. With no public url available, it fails, but the failure doesn't
say a lack of creds is the problem.
Fix is to provide setRemoteCredPair with a ParsedRemoteConfig, so the full
set of configs of the remote can be parsed. A bit annoying to need to
parse the remote config before the full config (as returned by
setRemoteCredPair) is available, but this avoids the problem.
I assume webdav also had the problem by inspection, but didn't try to
reproduce it with it.
Also, getRemoteCredPair used getRemoteConfigValue to get a ProposedAccepted
String, but that does not seem right. Now that it runs that code, it
crashed saying it had just a String.
Remotes that have already been enableremoted, and so lack the cached creds
file will work after this fix, because getRemoteCredPair will extract
the creds from the remote config, writing the missing file.
This commit was sponsored by Ilya Shlyakhter on Patreon.
Finishes the transition to make remote methods throw exceptions, rather
than silently hide them.
A bit on the fence about this one, because when renameExport fails,
it falls back to deleting instead, and so does the user care why it failed?
However, it did let me clean up several places in the code.
This commit was sponsored by Ethan Aubin.
Part of ongoing transition to make remote methods
throw exceptions, rather than silently hide them.
This commit was sponsored by Graham Spencer on Patreon.
retrieveExport is part of ongoing transition to make remote methods
throw exceptions, rather than silently hide them.
getKey very rarely fails, and when it does it's always for the same reason
(user configured annex.backend to url for some reason). So, this will
avoid dealing with Nothing everywhere it's used.
This commit was sponsored by Ilya Shlyakhter on Patreon.
When storing content on remote fails, always display a reason why.
Since the Storer used by special remotes already did, this mostly affects
git remotes, but not entirely. For example, if git-lfs failed to connect to
the endpoint, it used to silently return False.
That had almost no benefit at all, and complicated things quite a lot.
What I proably wanted this to be was something like ResourceT, but it
was not. The few remotes that actually need some preparation done only
once and reused used a MVar and not Preparer.
To use S3 Signature Version 4. Some S3 services seem to require v4, while
others may only support v2, which remains the default.
I'm also not sure if v4 works correctly in all cases, there is this
upstream bug report: https://github.com/aristidb/aws/issues/262
I've only tested it against the default S3 endpoint.
Fix serious regression in gcrypt and encrypted git-lfs remotes.
Since version 7.20200202.7, git-annex incorrectly stored content
on those remotes without encrypting it.
Problem was, Remote.Git enumerates all git remotes, including git-lfs
and gcrypt. It then dispatches to those. So, Remote.List used the
RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt,
and that parser does not know about encryption fields, so did not
include them in the ParsedRemoteConfig. (Also didn't include other
fields specific to those remotes, perhaps chunking etc also didn't
get through.)
To fix, had to move RemoteConfig parsing down into the generate methods
of each remote, rather than doing it in Remote.List.
And a consequence of that was that ParsedRemoteConfig had to change to
include the RemoteConfig that got parsed, so that testremote can
generate a new remote based on an existing remote.
(I would have rather fixed this just inside Remote.Git, but that was not
practical, at least not w/o re-doing work that Remote.List already did.
Big ugly mostly mechanical patch seemed preferable to making git-annex
slower.)
using git credential to get the password
One thing this doesn't do is wrap the password prompting inside the prompt
action. So with -J, the output can be a bit garbled.
Special remote programs that use GETCONFIG/SETCONFIG are recommended
to implement it.
The description is not yet used, but will be useful later when adding a way
to make initremote list all accepted configs.
configParser now takes a RemoteConfig parameter. Normally, that's not
needed, because configParser returns a parter, it does not parse it
itself. But, it's needed to look at externaltype and work out what
external remote program to run for LISTCONFIGS.
Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported
used to use NoUUID. The code that now checks for Nothing used to behave
in some undefined way if the external program made requests that
triggered it.
Also, note that in externalSetup, once it generates external,
it parses the RemoteConfig strictly. That generates a
ParsedRemoteConfig, which is thrown away. The reason it's ok to throw
that away, is that, if the strict parse succeeded, the result must be
the same as the earlier, lenient parse.
initremote of an external special remote now runs the program three
times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again
LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least
one of those, and it should be possible to only run the program once.
Using field functions consistently avoids possibility of typos and also
helps ensure that all fields are added to RemoteConfigParsers (as long
as I have remembered to add them when writing the functions).
Avoids parse error when the fields are added to RemoteConfig at setup
time and it then gets parsed, also at setup time. After setup time, such
internally added fields are not a problem, because they're Accepted. So
it may not be necessary in all cases to list such internally added
fields, but I think it's a good idea to always do so.
This is a first step toward that goal, using the ProposedAccepted type
in RemoteConfig lets initremote/enableremote reject bad parameters that
were passed in a remote's configuration, while avoiding enableremote
rejecting bad parameters that have already been stored in remote.log
This does not eliminate every place where a remote config is parsed and a
default value is used if the parse false. But, I did fix several
things that expected foo=yes/no and so confusingly accepted foo=true but
treated it like foo=no. There are still some fields that are parsed with
yesNo but not not checked when initializing a remote, and there are other
fields that are parsed in other ways and not checked when initializing a
remote.
This also lays groundwork for rejecting unknown/typoed config keys.
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.
Previously attempted in 4536c93bb2
and reverted in 96aba8eff7.
The problems mentioned in the latter commit are addressed now:
Read/Show of KeyData is backwards-compatible with Read/Show of Key from before
this change, so Types.Distribution will keep working.
The Eq instance is fixed.
Also, Key has smart constructors, avoiding needing to remember to update
the cached serialization.
Used git-annex benchmark:
find is 7% faster
whereis is 3% faster
get when all files are already present is 5% faster
Generally, the benchmarks are running 0.1 seconds faster per 2000 files,
on a ram disk in my laptop.
Convert Utility.Url to return Either String so the error message can be
displated in the annex monad and so captured.
(When curl is used, its errors are still not caught.)
This solves the problem of sameas remotes trampling over per-remote
state. Used for:
* per-remote state, of course
* per-remote metadata, also of course
* per-remote content identifiers, because two remote implementations
could in theory generate the same content identifier for two different
peices of content
While chunk logs are per-remote data, they don't use this, because the
number and size of chunks stored is a common property across sameas
remotes.
External special remote had a complication, where it was theoretically
possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or
EXPORTSUPPORTED. Since the uuid of the remote is typically generate in
Remote.setup, it would only be possible to pass a Maybe
RemoteStateHandle into it, and it would otherwise have to construct its
own. Rather than go that route, I decided to send an ERROR in this case.
It seems unlikely that any existing external special remote will be
affected. They would have to make up a git-annex key, and set state for
some reason during INITREMOTE. I can imagine such a hack, but it doesn't
seem worth complicating the code in such an ugly way to support it.
Unfortunately, both TestRemote and Annex.Import needed the Remote
to have a new field added that holds its RemoteStateHandle.
I found a way to avoid inheritance complicating anything outside of
Logs.Remote. It seems fine to require all inherited values to be
inherited and not set in the sameas remote's config. Since inherited
values will be used for stuff like encryption and perhaps chunking, which
control the actual content stored on the remote, it seems likely that
there will not be any reason to need them to vary between two remotes
that access the same underlying data store.
The newer version of containers is free; the minimum ghc version is
bundled with a newer version than that.
Prompted by the test suite on windows failing to with "export foo failed"
and no information about what went wrong.
Note that only storeExportWithContentIdentifier has been converted.
storeExport still returns a Bool and so exceptions may be hidden.
However, storeExportWithContentIdentifier has many more failure modes,
since it needs to avoid overwriting modified files. So it's more
important it have better error display.
Drop support for building with ghc older than 8.4.4, and with older
versions of serveral haskell libraries than will be included in Debian 10.
The only remaining version ifdefs in the entire code base are now a couple
for aws!
This commit should only be merged after the Debian 10 release.
And perhaps it will need to wait longer than that; it would make
backporting new versions of git-annex to Debian 9 (stretch) which
has been actively happening as recently as this year.
This commit was sponsored by Ilya Shlyakhter.
* Added mimeencoding= term to annex.largefiles expressions.
This is probably mostly useful to match non-text files with eg
"mimeencoding=binary"
* git-annex matchexpression: Added --mimeencoding option.
Unfortunately, "port" has to be set by default, or the old git-annex
will crash when trying to enable the S3 remote.
So, when protocol=https is specified, it needs to override port=80,
since it may be a default setting.
protocol=https implies port=443 and
port=443 implies protocol=https
-- this was necessary because the existing configs set port=443, but
with a protocol setting, users will naturally want to use it, and then
there's no need for them to supply the default https port. So we keep
back-compat, add a nicer way to enable https, and also add support for
non-standard https ports.
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.
Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.
(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
Avoid a warning message when renameExport is not supported, and just
fallback to deleting with a subsequent re-upload. Especially needed for
importtree remotes, where renameExport needs to be disabled.
This changes the external special remote protocol, but in a
backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older
version of git-annex will cause it to make renameExport return False.