Commit graph

1305 commits

Author SHA1 Message Date
Joey Hess
87f91ce563
more RawFilePath conversion
451/645
2020-10-30 15:55:59 -04:00
Joey Hess
ca80c3154c
more RawFilePath conversion
removeFile changed to removeLink, because AFAICS it should be fine to
remove non-file things here. In particular, it's fine to remove a
symlink, since we're about to write a symlink. (removeLink does not
remove directories, so file, symlink, and unix socket are the only
possibilities.)
2020-10-30 13:07:41 -04:00
Joey Hess
8f452416f7
more RawFilePath conversion
Better to use Git.repoPath to get a filepath, Git.repoLocation does not
always return one.

This commit was sponsored by Ethan Aubin.
2020-10-30 13:00:12 -04:00
Joey Hess
19694fb280
more RawFilePath conversion
At this point I'll be done by new year's.

This commit was sponsored by Ethan Aubin.
2020-10-30 12:51:34 -04:00
Joey Hess
681b44236a
more RawFilePath conversion
at 377/645

This commit was sponsored by Svenne Krap on Patreon.
2020-10-29 14:20:57 -04:00
Joey Hess
b05015f772
fix name of lock file
It was the stringification of a UUID, so "UUID \"foo\""
2020-10-29 10:53:01 -04:00
Joey Hess
e505c03bcc
more RawFilePath conversion
nukeFile replaced with removeWhenExistsWith removeLink, which allows
using RawFilePath. Utility.Directory cannot use RawFilePath since setup
does not depend on posix.

This commit was sponsored by Graham Spencer on Patreon.
2020-10-29 10:50:29 -04:00
Joey Hess
8d66f7ba0f
more RawFilePath conversion
Added a RawFilePath createDirectory and kept making stuff build.

Up to 296/645

This commit was sponsored by Mark Reidenbach on Patreon.
2020-10-28 17:25:59 -04:00
Joey Hess
b2bf099aa3
use removeDirGeneric here too for consistency
And because it might be more robust on windows.
2020-10-23 16:12:47 -04:00
Joey Hess
4d063f12c6
turns out this was fixed in 2014 2020-10-22 19:54:26 -04:00
Joey Hess
b62e004c2c
update chunk log after speculated chunks are verified to be present
Only done in checkPresentChunks, although retrieveChunks could also do
it. Does not seem necessary though, because git-annex never retrives
content without first checking if it's present AFAICR. And really this will
only be needed when using fsck. Puttting it here, rather than in fsck
avoids breaking an abstraction boundary, and is nice and inexpensive.
2020-10-22 13:37:09 -04:00
Joey Hess
dad4be97c2
speculatively use remote's configured chunk size as a fallback
When a special remote has chunking enabled, but no chunk sizes are
recorded (or the recorded ones are not found), speculatively try chunks
using the configured chunk size.

This makes eg, git-annex fsck --from remote be able to fix up the
location log of a file that the git-annex branch does not indicate is
stored on the remote.

Note that fsck does *not* fix up the chunk log to indicate the chunk
size. So, changing the chunk config of the remote after that will still
prevent accessing the chunks stored on it. Maybe fsck should, but I
wanted to start with this and see if it's needed.
2020-10-22 13:11:06 -04:00
Joey Hess
2dd38b6403
switch to Haskell2010
When I put in Haskell98 this spring, I was under the mistaken
apprehension that ghc defaulted to that. But it actually its default
is a third mode, which is closer to Haskell2010 but with some differences.
The manual says "By default, GHC mainly aims to behave (mostly) like a
Haskell 2010 compiler"

Fixed two cases where the Haskell98 do indentation flexability let
wrongly indented code build. That is one of the places where
ghc does not behave like Haskell2010 by default.

The other place that I think I was concerned about, is GHC manual
section 19.1.1.3. Expressions and patterns. But that only seems to
affect code using bottoms, so would only affect pure functions throwing
an error, which I don't think git-annex does in many places as it's
pretty horrid style. And it would only affect rare cases like shown in
that section. If it did happen, it would mean that the error was not
thrown before specifying Haskell98, and then was. Haskell2010 behaves
the same as Haskell98.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-10-19 11:26:16 -04:00
Joey Hess
4c32499e82
Parse youtube-dl progress output
Which lets progress be displayed when doing concurrent downloads.
Amoung other things, like --json-progress etc.

The youtube-dl output is no longer displayed, except for any errors.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-29 17:53:48 -04:00
Joey Hess
084b502c7a
httpalso: Support being used with special remotes that do not have encryption= in their config. 2020-09-29 13:56:27 -04:00
Joey Hess
5cfcf1f05f
cache remote.log
Unlikely to speed up any of the existing uses much, but I want to use it
in a message that might be displayed many times.
2020-09-22 13:52:26 -04:00
Joey Hess
5844a54869
aws-0.22 improved its support for setting etags, which improves support for versioned S3 buckets.
Remove placeholder version number I used when implementing the feature in
aws.

This commit was sponsored by Ethan Aubin.
2020-09-14 18:37:49 -04:00
Joey Hess
ddf963d019
deepseq all things returned from ResourceT http
Potentially fixes https://git-annex.branchable.com/bugs/concurrent_git-annex-copy_to_s3_special_remote_fails/
although I don't know if it does.

My thinking is, ResourceT may allocate a resource and then free it,
and a unforced thunk to that resource could result in reading memory
that has since been overwritten by something else, or in a SEGV,
depending. While that seems kind of like a bug in ResourceT to me, if it
is what's happening, this will avoid it. If it's not, this doesn't
really hurt much since the values are all smallish.

This commit was sponsored by Graham Spencer on Patreon.
2020-09-14 18:30:06 -04:00
Joey Hess
6ea511beb4
Removed the S3 and WebDAV build flags
So these special remotes are always supported.

IIRC these build flags were added because the dep chains were a bit too
long, or perhaps because the libraries were not available in Debian stable,
or something like that. That was long ago, those reasons no longer apply,
and users get confused when builtin special remotes are not available, so
it seems best to remove the build flags now.

If this does cause a problem it can be reverted of course..

This commit was sponsored by Jochen Bartl on Patreon.
2020-09-08 12:42:59 -04:00
Joey Hess
eed20fe3b7
fix some file modes in calls to withTmpFileIn to honor umask
Also audited for other calls to openTempFile, and all are ok,
except for viaTmp which will need further work.

Remote.Directory fixed to set umask mode when writing to an export,
although it has another one using viaTmp that's not fixed.
Will make exports that are published via a http server running as
another user work, for example.

Remote.BitTorrent fixed to set umask mode when downloading the torrent
file. Normally this does not matter as that file does not hang around
after the download, but if a bittorrent download were started by one user,
got interrupted and then another user ran it, this will let them access
the torrent file created by the first user.
2020-09-02 14:36:08 -04:00
Joey Hess
26724fb331
display actual download errors
Eg, when config prohibits accessing localhost, need to show that
message, not a generic "download failed".
2020-09-02 12:21:10 -04:00
Joey Hess
31e5785bf7
avoid multiple download failed messages when learning
Also only display one progress meter for all download attempts, to avoid
a bunch of blank lines.
2020-09-02 12:01:50 -04:00
Joey Hess
854cd2ad47
httpalso: support exporttree=yes
Also tested what happens if the other special remote has importtree=yes
and exporttree=yes, and in that case, download via httpalso works too,
without needing to implement any importtree methods here.

It might be possible to make it automatically set exporttree=yes if the
--sameas does. Didn't try, will probably be layering issues.

Or perhaps it should be inherited by sameas like some
other configs? But then, wouldn't it also make sense to inherit
importree=yes? But as shown here, it's not needed by this kind of
remote.
2020-09-02 11:26:00 -04:00
Joey Hess
8656afd3e1
rename http special remote to httpalso
"http" was too generic and easy to confuse with web. The new name makes
clear it's used in addition to some other remote. And other protocols
can use the same naming scheme.
2020-09-02 10:41:53 -04:00
Joey Hess
571ec900ac
Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https.
With automatic layout learning!
2020-09-01 15:16:35 -04:00
Joey Hess
d00ce82418
fix hang if external program is not available
startExternal' throws an exception, which left the externalAsync TMVar
empty, so the next try to use it would hang.
2020-08-19 12:20:07 -04:00
Joey Hess
ad64079b44
fix some warnings 2020-08-15 14:33:18 -04:00
Joey Hess
f241a3cd3d
Display warning when external special remote does not start up properly, or is not usable
I'm sure this used to work, but somewhere along the line something or
things (getCost and getAvailability I think, probably others)
started catching the exception and not displaying it. So, show warnings.
2020-08-14 15:38:31 -04:00
Joey Hess
198b709561
switch to TMVars for thread safety when using the async extension
TVars were not updated atomically, which was ok when each thread got its
own External that was the only thing using these TVars. But, with the
async extension, several External instances can share the same var, so
it needs to be a TMVar to avoid read/write conflicts.

In particular, this makes PREPARE only be sent once.
2020-08-14 14:50:09 -04:00
Joey Hess
7da2d4dd2d
one jobid per thread
And, relay ERROR on to all listening threads.
2020-08-14 14:24:46 -04:00
Joey Hess
72561563d9
rethought the async protocol some more
Moving jobid generation to the git-annex side lets it be simplified a
lot.

Note that it will also be possible to generate one jobid per connection,
rather than a new job per request. That will make overflow not an issue,
and will avoid some work, and will simplify some of the code.
2020-08-13 20:18:06 -04:00
Joey Hess
59cbb42ee2
async proto fully tested and working
Including with a concurrent capable remote program.

However, this is not quite ready to merge, there's a TODO in the code.
2020-08-13 16:22:11 -04:00
Joey Hess
7546e686a2
async proto basically working
Simplified the protocol by removing END-ASYNC.

There's a STM crash when a non-async protocol message is sent, which
needs to be fixed.
2020-08-13 15:52:12 -04:00
Joey Hess
c9e8cafb98
further work on external async relay 2020-08-12 16:25:53 -04:00
Joey Hess
15706e6991
relayer receive loop is done
Receive loop looks right. Still need the send loop.

And, a complication is that some messages git-annex
sends need to be wrapped in REPLY_ASYNC, while others
do not. So will probably need to split externalSend
into two.
2020-08-12 15:56:58 -04:00
Joey Hess
06a4ab39fa
wip external remote async protocol extension 2020-08-12 15:17:53 -04:00
Joey Hess
3f8c808bd7
generalized ExternalState to not be limited to a ExternalAddonProcess
Idea is for ASYNC extension, it will instead contain methods that communicate
with the thread that handles all communication with the external process.
2020-08-12 12:30:45 -04:00
Joey Hess
5f4228dc2b
types for async protocol extension
renamed AsyncMessage to ExceptionalMessage to make way for this new
extension.
2020-08-12 12:04:12 -04:00
Joey Hess
f75be32166
external backends wip
It's able to start them up, the only thing not implemented is generating
and verifying keys. And, the key translation for HasExt.
2020-07-29 15:23:18 -04:00
Joey Hess
555fe669e1
refactoring in preparation for external backends 2020-07-29 12:00:27 -04:00
Joey Hess
2a45b5ae9a
avoid failure to lock content of removed file causing drop etc to fail
This was already prevented in other ways, but as seen in commit
c30fd24d91, those were a bit fragile.
And I'm not sure races were avoided in every case before. At least a
race between two separate git-annex processes, dropping the same
content, seemed possible.

This way, if locking fails, and the content is not present, it will
always do the right thing. Also, it avoids the overhead of an unncessary
inAnnex check for every file.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-07-25 11:59:33 -04:00
Joey Hess
57cceac569
simplify interface by removing size
Add size to the returned key after the fact, unless the remote happened
to add it itself.
2020-07-03 14:22:22 -04:00
Joey Hess
85cd79ea01
no importKey for android yet
adb shell has sha256sum sha1sum and some others, so they could be used.
They're provided by toybox, so seem about as likely to keep
working as find and stat, which it already depends on.

Or to not add a dep, could use stat the same as getExportContentIdentifier
to get a mtime, and make a WORM key. But do I really want this to
default to WORM?

Unsure what's the best path, so punting for now.
2020-07-03 14:02:50 -04:00
Joey Hess
ddcab38e4a
no importKey for S3 for now
The Etag is sometimes a md5, but not if eg, there was a multipart
upload.

May revisit later if there's demand.
2020-07-03 13:53:14 -04:00
Joey Hess
85506a7015
import: Added --no-content option, which avoids downloading files from a special remote
Only supported by some special remotes: directory
I need to check the rest and they're currently missing methods until I do.

git-annex sync --no-content does not yet use this to do imports
2020-07-03 13:41:57 -04:00
Joey Hess
0f26782a73
fix windows build more 2020-07-02 12:01:09 -04:00
Joey Hess
00497fd38e
fix windows build 2020-07-02 11:46:26 -04:00
Joey Hess
8ad433d5f0
fix windows build 2020-07-02 11:35:05 -04:00
Joey Hess
8b22e0bf37
lockContent for tahoe
Trivial since git-annex cannot remove, but do an active checkKey verification
anyway, in case the data was lost somehow.

This commit was sponsored by Ryan Newton on Patreon.
2020-06-26 14:23:21 -04:00
Joey Hess
3175015d1b
lockContent for S3 (with versioning=yes) and git-lfs
Made several special remotes support locking content on them while
dropping, which allows dropping from another special remote when the
content will only remain on a special remote of these types.

In both cases, verify the content is present actively, because it's
certianly possible for things other than git-annex to have removed it.

Worth thinking about what to do if at some later point, git-lfs gains
support for dropping content, and a content locking operation.
That would probably need a transition; first would need to make lockContent
use the locking operation. Then, once enough time had passed that we can
assume any git-annex operating on the git-lfs remote had that change,
git-annex could finally allow dropping from git-lfs.

Or, it could be that git-lfs gains support for dropping content, but not
locking it. In that case, it seems this commit would need to be reverted,
and then wait long enough for that git-annex to be everywhere, and only
then can git-annex safely support dropping from git-lfs.

So, the assumption made in this commit could lead to bother later.. But I
think it's actually highly unlikely git-lfs does ever support dropping;
it's outside their centralized model. Probably. :) Worth keeping in mind as
the same assumption is made about other special remotes though.

This commit was sponsored by Ethan Aubin.
2020-06-26 13:46:42 -04:00