Commit graph

1265 commits

Author SHA1 Message Date
Joey Hess
2a45b5ae9a
avoid failure to lock content of removed file causing drop etc to fail
This was already prevented in other ways, but as seen in commit
c30fd24d91, those were a bit fragile.
And I'm not sure races were avoided in every case before. At least a
race between two separate git-annex processes, dropping the same
content, seemed possible.

This way, if locking fails, and the content is not present, it will
always do the right thing. Also, it avoids the overhead of an unncessary
inAnnex check for every file.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-07-25 11:59:33 -04:00
Joey Hess
57cceac569
simplify interface by removing size
Add size to the returned key after the fact, unless the remote happened
to add it itself.
2020-07-03 14:22:22 -04:00
Joey Hess
85cd79ea01
no importKey for android yet
adb shell has sha256sum sha1sum and some others, so they could be used.
They're provided by toybox, so seem about as likely to keep
working as find and stat, which it already depends on.

Or to not add a dep, could use stat the same as getExportContentIdentifier
to get a mtime, and make a WORM key. But do I really want this to
default to WORM?

Unsure what's the best path, so punting for now.
2020-07-03 14:02:50 -04:00
Joey Hess
ddcab38e4a
no importKey for S3 for now
The Etag is sometimes a md5, but not if eg, there was a multipart
upload.

May revisit later if there's demand.
2020-07-03 13:53:14 -04:00
Joey Hess
85506a7015
import: Added --no-content option, which avoids downloading files from a special remote
Only supported by some special remotes: directory
I need to check the rest and they're currently missing methods until I do.

git-annex sync --no-content does not yet use this to do imports
2020-07-03 13:41:57 -04:00
Joey Hess
0f26782a73
fix windows build more 2020-07-02 12:01:09 -04:00
Joey Hess
00497fd38e
fix windows build 2020-07-02 11:46:26 -04:00
Joey Hess
8ad433d5f0
fix windows build 2020-07-02 11:35:05 -04:00
Joey Hess
8b22e0bf37
lockContent for tahoe
Trivial since git-annex cannot remove, but do an active checkKey verification
anyway, in case the data was lost somehow.

This commit was sponsored by Ryan Newton on Patreon.
2020-06-26 14:23:21 -04:00
Joey Hess
3175015d1b
lockContent for S3 (with versioning=yes) and git-lfs
Made several special remotes support locking content on them while
dropping, which allows dropping from another special remote when the
content will only remain on a special remote of these types.

In both cases, verify the content is present actively, because it's
certianly possible for things other than git-annex to have removed it.

Worth thinking about what to do if at some later point, git-lfs gains
support for dropping content, and a content locking operation.
That would probably need a transition; first would need to make lockContent
use the locking operation. Then, once enough time had passed that we can
assume any git-annex operating on the git-lfs remote had that change,
git-annex could finally allow dropping from git-lfs.

Or, it could be that git-lfs gains support for dropping content, but not
locking it. In that case, it seems this commit would need to be reverted,
and then wait long enough for that git-annex to be everywhere, and only
then can git-annex safely support dropping from git-lfs.

So, the assumption made in this commit could lead to bother later.. But I
think it's actually highly unlikely git-lfs does ever support dropping;
it's outside their centralized model. Probably. :) Worth keeping in mind as
the same assumption is made about other special remotes though.

This commit was sponsored by Ethan Aubin.
2020-06-26 13:46:42 -04:00
Joey Hess
01eb863a14
Build with the git-lfs library when available
Otherwise use the vendored copy as before.

The library is in Debian testing but not stable. Once it reaches
stable, the vendored copy can be removed.

Did not add it to debian/control because IIRC that's used to build
git-annex on stable too, possibly. However, the Debian maintainer will
probably want to make the package depend on libghc-git-lfs-dev.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-06-22 11:21:25 -04:00
Joey Hess
aa1ad0b7ca
remove redundant imports
Clean build under ghc 8.8.3, which seems to do better at finding cases
where two imports both provide the same symbol, and warns about one of
them.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-06-22 11:05:34 -04:00
Joey Hess
5c0dc7dc0b
fix a further bug 2020-06-16 18:14:36 -04:00
Joey Hess
ad81feb053
fix implicit embedcreds regression
Fix bug that made creds not be stored in git when a special remote was
initialized with gpg encryption, but without an explicit embedcreds=yes.

(Yet nother regression introduced in version 7.20200202.7. 5th so far.)
2020-06-16 18:00:19 -04:00
Joey Hess
a1d4c8e4ec
external: SETCREDS include creds in externalConfigChanges
This makes the creds get saved, since only things recorded there will be
saved.

IIRC, unparsedRemoteConfig was not originally available when I
implemented this; now that it is things get a bit simpler.

More could probably be simplified, is externalConfigChanges needed at
all?

This does not entirely fix the bugs though, because creds are only
embedded when embedcreds=yes, but not when encryption=pubkey is used
without embedcreds=yes.
2020-06-16 17:24:24 -04:00
Joey Hess
4773713cc9
analysis of regression and fix related less serious regression 2020-06-16 15:16:36 -04:00
Joey Hess
a76b1ba3d6
local git remote autoinit improvements
* Improve display of problems auto-initializing or upgrading local git
  remotes.
* When a local git remote cannot be initialized because it has no
  git-annex branch or a .noannex file, avoid displaying a message about it.
2020-06-16 13:24:00 -04:00
Joey Hess
41952204ce
S3: The REDUCED_REDUNDANCY storage class is no longer cheaper
So stop documenting it, and stop offering it as a choice in the assistant.

Removed the code that parses it into S3.ReducedRedundancy, because
S3.OtherStorageClass with the value will work just the same and avoids a
special case for a deprecated this.
2020-06-16 12:04:29 -04:00
Joey Hess
24ff5e2b29
use uninterruptibleMask
Some recent changes to use mask missed that async exceptions can still
be thrown inside it. The goal is to make sure a block of cleanup code
runs entirely, w/o being interrupted by an async exception, so use
uninterruptibleMask.

Also, converted a few to bracket, which is nicer.
2020-06-09 15:02:56 -04:00
Joey Hess
a49d300545
async exception safety for external special remote processes
Since an external process can be in the middle of some operation when an
async exception is received, it has to be shut down then. Using
cleanupProcess will close its IO handles and send it a SIGTERM.

If a special remote choses to catch SIGTERM, it's fine for it to do some
cleanup then, but until it finishes, git-annex will be blocked waiting
for it. If a special remote blocked SIGTERM, it would cause a hang.
Mentioned in docs.

Also, in passing, fixed a FD leak, it was not closing the error handle
when shutting down the external. In practice that didn't matter before because
it was only run when git-annex was itself shutting down, but now that it
can run on exception, it would have been a problem.
2020-06-09 12:22:14 -04:00
Joey Hess
3ed797be0f
fix reversion
From back in 4be94c67c7.
Caused the test suite to fail, when bup is installed, but was not
noticed since the autobuilds don't have bup.
2020-06-05 19:06:09 -04:00
Joey Hess
e41f8c83f3
close stdin handles before waiting on commands
Fixes reversion in recent conversions, the old code relied on the GC
apparently, but the new code explicitly waits on the process, so must
close stdin handle first or the command will never exit.
2020-06-05 17:27:49 -04:00
Joey Hess
ef0024444b
fix reversion
It was not the wrong handle. The handle was not being closed, so bup
kept running.

Before 2670890b17, the code was:

withHandle StdinHandle createProcessSuccess cmd feeder

The stdin handle was not closed by the feeder.

Testing this:

	withHandle StdinHandle createProcessSuccess (proc "cat" []) (\h -> hPutStrLn h "hi")

There's a rather long pause, a couple seconds, before it completes, but
it does complete. With hClose h, it immediately completes. This must be
the GC noticing that h is out of scope and closing it.

It seems likely that the old code worked only by that accident.

So, other similar changes made in that and nearby commits may also
have this problem, and need to explicitly close handles that were
somehow implicitly closed before.
2020-06-05 17:10:52 -04:00
Joey Hess
291774779f
use right handle 2020-06-05 16:45:12 -04:00
Joey Hess
1dd770b1af
fix file descriptor leak
when importing from a directory special remote that is configured with
exporttree=yes
2020-06-05 15:34:43 -04:00
Joey Hess
319f2a4afc
audit all uses of SomeException to avoid catching async exceptions
Except for the assistant, which I think may use them between threads?

Most of the uses of SomeException were already catching only async exceptions.
But I did find a few places that were accidentially catching them.
2020-06-05 15:16:57 -04:00
Joey Hess
dca19099a9
async exception safety
Masking ensures that EndStderrHandler gets written, so the helper
threads shut down.

However, nothing currently guarantees that calls to closeP2PSshConnection
are async exception safe, so made a note about it.

At this point, I've audited all calls to async, and made them all async
exception safe, except for ones in the assistant, and a few in leaf
commands (remotedaemon, enable-tor, multicast, p2p) which don't need to
be.
2020-06-05 14:56:41 -04:00
Joey Hess
2670890b17
convert to withCreateProcess for async exception safety
This handles all createProcessSuccess callers, and aside from process
pools, the complete conversion of all process running to async exception
safety should be complete now.

Also, was able to remove from Utility.Process the old API that I now
know was not a good idea. And proof it was bad: The code size went *down*,
despite there being a fair bit of boilerplate for some future API to
reduce.
2020-06-04 15:45:52 -04:00
Joey Hess
438dbe3b66
convert to withCreateProcess for async exception safety
This handles all sites where checkSuccessProcess/ignoreFailureProcess
is used, except for one: Git.Command.pipeReadLazy
That one will be significantly more work to convert to bracketing.

(Also skipped Command.Assistant.autoStart, but it does not need to
shut down the processes it started on exception because they are
git-annex assistant daemons..)

forceSuccessProcess is done, except for createProcessSuccess.
All call sites of createProcessSuccess will need to be converted
to bracketing.

(process pools still todo also)
2020-06-04 12:44:09 -04:00
Joey Hess
c429bbf2bd
remove workaround for old versions of process
ghc 8.4.4 has process 1.6.3, which was the first version to include
getPid.
2020-06-03 16:03:08 -04:00
Joey Hess
1ee5919d1e
make createProcess calls async exception safe
Using cleanupProcess because withCreateProcess cannot run an Annex
action, but the effect is the same as using it.
2020-06-03 15:30:30 -04:00
Joey Hess
484a74f073
auto-init autoenable=yes
Try to enable special remotes configured with autoenable=yes when git-annex
auto-initialization happens in a new clone of an existing repo. Previously,
git-annex init had to be explicitly run to enable them. That was a bit of a
wart of a special case for users to need to keep in mind.

Special remotes cannot display anything when autoenabled this way, to avoid
interfering with the output of git-annex query commands.

Any error messages will be hidden, and if it fails, nothing is displayed.
The user will realize the remote isn't enable when they try to use it,
and can run git-annex init manually then to try the autoenable again and
see what failed.

That seems like a reasonable approach, and it's less complicated than
communicating something across a pipe in order to display it as a side
message. Other reason not to do that is that, if the first command the
user runs is one like git-annex find that has machine readable output,
any message about autoenable failing would need to not be displayed anyway.
So better to not display a failure message ever, for consistency.

(Had to split out Remote.List.Util to avoid an import cycle.)
2020-05-27 12:40:35 -04:00
Joey Hess
c108fa16f1
improve error message when download from non-chunked remote fails
Avoid "chunk retrieval failed" in this case, when it tries to fall back
to using chunks.

This commit was sponsored by Ethan Aubin.
2020-05-21 14:44:40 -04:00
Joey Hess
e63dcbf36c
fix embedcreds=yes reversion
Fix bug that made enableremote of S3 and webdav remotes, that have
embedcreds=yes, fail to set up the embedded creds, so accessing the remotes
failed.

(Regression introduced in version 7.20200202.7 in when reworking all the
remote configs to be parsed.)

Root problem is that parseEncryptionConfig excludes all other config keys
except encryption ones, so it is then unable to find the
credPairRemoteField. And since that field is not required to be
present, it proceeds as if it's not, rather than failing in any visible
way.

This causes it to not find any creds, and so it does not cache
them. When when the S3 remote tries to make a S3 connection, it finds no
creds, so assumes it's being used in no-creds mode, and tries to find a
public url. With no public url available, it fails, but the failure doesn't
say a lack of creds is the problem.

Fix is to provide setRemoteCredPair with a ParsedRemoteConfig, so the full
set of configs of the remote can be parsed. A bit annoying to need to
parse the remote config before the full config (as returned by
setRemoteCredPair) is available, but this avoids the problem.

I assume webdav also had the problem by inspection, but didn't try to
reproduce it with it.

Also, getRemoteCredPair used getRemoteConfigValue to get a ProposedAccepted
String, but that does not seem right. Now that it runs that code, it
crashed saying it had just a String.

Remotes that have already been enableremoted, and so lack the cached creds
file will work after this fix, because getRemoteCredPair will extract
the creds from the remote config, writing the missing file.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-05-21 14:35:30 -04:00
Joey Hess
6361074174
convert renameExport to throw exception
Finishes the transition to make remote methods throw exceptions, rather
than silently hide them.

A bit on the fence about this one, because when renameExport fails,
it falls back to deleting instead, and so does the user care why it failed?

However, it did let me clean up several places in the code.

This commit was sponsored by Ethan Aubin.
2020-05-15 15:08:09 -04:00
Joey Hess
037440ef36
convert removeExportDirectory to throw exception
Part of ongoing transition to make remote methods
throw exceptions, rather than silently hide them.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-05-15 14:43:18 -04:00
Joey Hess
cdbfaae706
change removeExport to throw exception
Part of ongoing transition to make remote methods
throw exceptions, rather than silently hide them.

This commit was sponsored by Graham Spencer on Patreon.
2020-05-15 14:15:14 -04:00
Joey Hess
3334d3831b
change retrieveExport and getKey to throw exception
retrieveExport is part of ongoing transition to make remote methods
throw exceptions, rather than silently hide them.

getKey very rarely fails, and when it does it's always for the same reason
(user configured annex.backend to url for some reason). So, this will
avoid dealing with Nothing everywhere it's used.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-05-15 13:45:53 -04:00
Joey Hess
4814b444dd
make storeExport throw exceptions 2020-05-15 12:20:02 -04:00
Joey Hess
4be94c67c7
make removeKey throw exceptions 2020-05-14 14:11:05 -04:00
Joey Hess
d9c7f81ba4
make retrieveKeyFile and retrieveKeyFileCheap throw exceptions
Converted retrieveKeyFileCheap to a Maybe, to avoid needing to throw a
exception when a remote doesn't support it.
2020-05-13 17:07:07 -04:00
Joey Hess
c1cd402081
make storeKey throw exceptions
When storing content on remote fails, always display a reason why.

Since the Storer used by special remotes already did, this mostly affects
git remotes, but not entirely. For example, if git-lfs failed to connect to
the endpoint, it used to silently return False.
2020-05-13 14:03:00 -04:00
Joey Hess
b50ee9cd0c
remove Preparer abstraction
That had almost no benefit at all, and complicated things quite a lot.

What I proably wanted this to be was something like ResourceT, but it
was not. The few remotes that actually need some preparation done only
once and reused used a MVar and not Preparer.
2020-05-13 11:56:21 -04:00
Joey Hess
be5caeaf51
catch more exceptions
Just in case a non-IO exception might somehow be thrown.
2020-05-12 13:05:06 -04:00
Joey Hess
5f5170b22b
remove SafeFilePath
Move sanitizeFilePath call to where fromSafeFilePath had been.
2020-05-11 14:04:56 -04:00
Joey Hess
69e2e4763e
only check --force at init time, not enable time
git-lfs repos that encrypt the annexed content but not the git repo only
need --force passed to initremote, allow enableremote and autoenable of
such remotes without forcing again.

Needing --force again particularly made autoenable of such a repo not work.
And once such a repo has been set up, it seems a second --force when
enabling it elsewhere has little added value. It does tell the user about
the possibly insecure configuration, but if the git repo has already been
pushed to that remote in the clear, data has already been exposed. The goal
of that --force was not to prevent every situation where such an exposure
can happen -- anyone who sets up a public git repo and pushes to it will
expose things similarly and git-annex is not involved. Instead, the purpose
of the --force is to point out to the user that they're asking for a
configuration where encryption is inconsistently applied.
2020-05-07 15:59:29 -04:00
Joey Hess
1532d67c3e
S3: Support signature=v4
To use S3 Signature Version 4. Some S3 services seem to require v4, while
others may only support v2, which remains the default.

I'm also not sure if v4 works correctly in all cases, there is this
upstream bug report: https://github.com/aristidb/aws/issues/262
I've only tested it against the default S3 endpoint.
2020-05-07 13:18:11 -04:00
Joey Hess
f9ed30de3b
avoid beware of the leopard situation
* Display a warning message when a remote uses a protocol, such as
  git://, that git-annex does not support. Silently skipping such a
  remote was confusing behavior.

  It sets annex-ignore, so the warning is only displayed once.

* Also display a warning message when a remote, without a known uuid,
  is located in a directory that does not currently exist, to avoid
  silently skipping such a remote.

  This is a bit more debatable, since git-annex get will say,
  try making repository available. And since it does not set annex-ignore,
  the warning will be displayed repeatedly. It's also an extreme edge case,
  I don't think I've ever seen it happen in real life.
2020-05-04 13:01:11 -04:00
Joey Hess
2aeb79249b
external: stop storing readonly=true in remote.log
readonly=true is used to make an external special remote that does not
need the external program to be installed. It was stored in the
remote.log by default, and so every time it was specified in an
enableremote or initremote, whatever value was used became the new
default for subsequent enableremotes of that remote.

That was surprising, and I consider it to be a bug.

It does not make much sense to pass it to initremote because then how
would you populate that remote with anything? You would have to
enableremote elsewhere, and store content there. I'm assuming nobody
used it that way.

Someone might rely on passing it to enableremote once, and then that
being inherited in other clones. But that is not how it's documented to
be used. It is barely documented in git-annex at all, only in the
external special remote protocol, and the documentation there says to
"Document that this external special remote can be used in readonly
mode." (by the user of it passing readonly=true to enableremote). The
one external special remote that I know of that does document that is
<https://github.com/bgilbert/gcsannex> (the one that motivated adding
it). That one's docs do say to pass it to enableremote.

So, it seemed safe to make this behavior change. If someone was in fact
relying on one of those behaviors, all their current repos will still
work as they configured them (although they will need to deal
with the related change in 9f3c2dfeda).
In new clones, they will find enableremote fails, complaining the
external program is not in path. An easy enough problem to recover from.
2020-04-23 15:21:26 -04:00
Joey Hess
9f3c2dfeda
stop using remote.name.annex-readonly for two distinct things 2020-04-23 14:56:03 -04:00