Commit graph

1142 commits

Author SHA1 Message Date
Joey Hess
2dc20e3fa4
update design doc with final design choices 2019-04-09 13:05:22 -04:00
Joey Hess
06cbaa4233
fix back-compat with old git-annex
Unfortunately, "port" has to be set by default, or the old git-annex
will crash when trying to enable the S3 remote.

So, when protocol=https is specified, it needs to override port=80,
since it may be a default setting.
2019-03-22 12:27:41 -04:00
Joey Hess
2a99d7ffc0
improve error message 2019-03-22 12:23:59 -04:00
Joey Hess
7d37011a11
S3: Added protocol= initremote setting, to allow https to be used on a non-standard port
protocol=https implies port=443 and
port=443 implies protocol=https
-- this was necessary because the existing configs set port=443, but
with a protocol setting, users will naturally want to use it, and then
there's no need for them to supply the default https port. So we keep
back-compat, add a nicer way to enable https, and also add support for
non-standard https ports.
2019-03-22 12:17:05 -04:00
Joey Hess
40ecf58d4b
update licenses from GPL to AGPL
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.

Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.

(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
2019-03-13 15:48:14 -04:00
Joey Hess
2912429640
better indicate when special remotes do not support renameExport
Avoid a warning message when renameExport is not supported, and just
fallback to deleting with a subsequent re-upload. Especially needed for
importtree remotes, where renameExport needs to be disabled.

This changes the external special remote protocol, but in a
backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older
version of git-annex will cause it to make renameExport return False.
2019-03-11 12:53:24 -04:00
Joey Hess
e412129523
concurrency and status messages when downloading from import 2019-03-08 12:33:44 -04:00
Joey Hess
ee5f1422df
remove debug print 2019-03-07 16:08:58 -04:00
Joey Hess
9a72785307
fixes to export db lookup when accessing importtree=yes
Now in a fresh clone with a importtree=yes remote enabled,
git annex fsck --from the remote works.
2019-03-07 14:10:56 -04:00
Joey Hess
93025dd59f
add missing locking of ContentIdentifier database when writing
This is not super efficient; it would be better to lock the database
once and build up a queue of changes and flush once.

But, storeExportWithContentIdentifier is likely going to be the really
expensive part, so let's do the simple thing and only optimise later if
needed.
2019-03-07 13:32:33 -04:00
Joey Hess
3f449f845e
update 2019-03-07 13:28:18 -04:00
Joey Hess
b3d30e7d70
remove unncessary locking of ContentIdentifier db
Remote.Helper.ExportImport only reads from it, and locking is only
needed when writing.
2019-03-06 14:36:57 -04:00
Joey Hess
b23c301820
fix false positive from checkPresentExportWithContentIdentifierM when file does not exist 2019-03-05 17:04:00 -04:00
Joey Hess
dc278c059c
fix STM crash
git-annex: thread blocked indefinitely in an STM transaction
failed

git-annex: sqlite query crashed
CallStack (from HasCallStack):
  error, called at ./Database/Handle.hs:98:42 in main:Database.Handle
failed

This needs further investigation.
2019-03-05 16:37:40 -04:00
Joey Hess
46d33e804a
added checkPresentExportWithContentIdentifier
Ugh, don't like needing to add this, but I can't see a way around it.
2019-03-05 16:03:03 -04:00
Joey Hess
354aafce1a
refactor database handle code
Use same, simpler method to make only one thread open the export db as
is used for the ContentIdentifier db.

And, always update the export db once before using.
2019-03-05 15:42:39 -04:00
Joey Hess
fd2a1aaa17
avoid using renameExport on import remotes 2019-03-05 14:57:48 -04:00
Joey Hess
bec66258a8
minor 2019-03-05 14:50:39 -04:00
Joey Hess
8c54604e67
import+export from directory special remote fully working
Had to add two more API calls to override export APIs that are not safe
for use in combination with import.

It's unfortunate that removeExportDirectory is documented to be allowed
to remove non-empty directories. I'm not entirely sure why it's that
way, my best guess is it was intended to make it easy to implement with
just rm -rf.
2019-03-05 14:20:14 -04:00
Joey Hess
554b7b7f3e
fix todo 2019-03-04 18:20:12 -04:00
Joey Hess
bc509143e5
avoid opening export db until needed
Before, it was opened when constructing the export Remote, even if it
never got used.
2019-03-04 18:11:32 -04:00
Joey Hess
cd3a2b023a
initial try at using storeExportWithContentIdentifier
Untested, and I'm not sure about the locking of the ContentIdentifier db.
2019-03-04 17:50:41 -04:00
Joey Hess
aaacf431d8
handle importtree=yes config
For now, it's only allowed when exporttree=yes is also set.
That simplified the implementation, but could later be changed if
there's a remote that makes sense to be an import but not an export.
However, it may work just as well to make a remote be readonly to
prevent export to it while still allowing import.
2019-03-04 16:07:35 -04:00
Joey Hess
88ccfaa78c
storeExportWithContentIdentifierM for directory special remote
Not sure if my reasoning about the races really holds.

It would certianly be possible to better guard against races by using
Linux-specific renameat2 with RENAME_EXCHANGE or RENAME_NOREPLACE.

Or by using link and relying on it not overwriting existing files -- but
that would need a filesystem that supports hard links and directory can
be used in filesystems that don't.
2019-03-04 14:46:25 -04:00
Joey Hess
3cd19fb4d0
use InodeCache to avoid races in import from directory special remote
This does not avoid all possible races, but it does avoid all likely
ones, and is demonstratably better than git's own handling of races
where files get modified at the same time as it's updating the working
tree.

The main thing this won't detect are not unlikely races where part
of a file gets changed while it's being copied and then the file is
restored to its original condition before the modification check.
No, it's more likely that the limitations of checking inode, size,
and mtime won't detect certian modifications, involving eg mmapped
files.
2019-03-04 13:57:23 -04:00
Joey Hess
e2e57f8556
initial export support for directory special remote
This does not guard against race condition yet, it's only for testing
purposes.
2019-02-27 13:42:34 -04:00
Joey Hess
45aacd888b
import downloader complete (untested)
Made some api changes.

listImportableContents needs to provide the size
of the data, so the downloader can check disk free space.

retrieveExportWithContentIdentifier is passed the filepath to write to

Use temporary "CID" key during download of a ContentIdentifier from a
remote, so withTmp can be used and then move the content to the real key
once it's known.
2019-02-27 13:15:02 -04:00
Joey Hess
760f26ebc6
Merge branch 'master' into importtree 2019-02-26 11:36:36 -04:00
Joey Hess
19f833b0b1
aws-0.21.1
* S3: Support enabling bucket versioning when built with aws-0.21.1.
* stack.yaml: Build with aws-0.21.1
2019-02-24 12:45:09 -04:00
Joey Hess
fd304dce60
split out Types.Import and some changes to the types in it 2019-02-21 13:39:09 -04:00
Joey Hess
ccc0684d21
no remotes support import yet 2019-02-20 16:59:04 -04:00
Joey Hess
e49f3139b5
fix windows build some more 2019-02-18 17:46:21 -04:00
Joey Hess
4c3178aadf
fix windows build 2019-02-18 17:38:21 -04:00
Joey Hess
9f6b7d6258
On Windows, avoid using rsync for file-to-file copies, since rsync is not always available there.
Installing git-annex with stack rsync won't be available.
Also, using the git-annex installer with 64 bit git installs a non-working
rsync binary because it's linked with libraries provided by 32 bit git.
2019-02-18 17:27:34 -04:00
Joey Hess
60c1b5c994
deal with attempt to export filename with # or ? to webdav
xporting files with '#' or '?' in their name won't work because urls get
truncated on those. Fail in a better way in this case, and avoid failing
when removing such files from the export, so after the user has renamed the
problem files the export will succeed.
2019-02-07 13:47:57 -04:00
Joey Hess
7b9701675e
Display progress bar when getting files from export remotes
And moved the progress bar display into storeExport as well.

This commit was sponsored by John Pellman on Patreon.
2019-01-31 13:34:12 -04:00
Joey Hess
ab689cf0cd
Improved speed of S3 remote by only loading S3 creds once
This gets back any speed lost in commit
9cebfd7002, and speeds up all uses of S3
remotes that operate on them more than once.

This commit was sponsored by Brett Eisenberg on Patreon.
2019-01-30 16:20:14 -04:00
Joey Hess
8eb66a5c40
avoid potentually unsafe use of runResourceT
Pushed the ResourceT out into larger code blocks, and made sure that
the the http result from a sendS3Handle is processed inside the same
ResourceT block.

I don't think this fixes any bugs, but it allows getting rid of a scary
comment.

This commit was sponsored by Eric Drechsel on Patreon.
2019-01-30 15:40:13 -04:00
Joey Hess
9cebfd7002
purify exportActions
Purifying exportActions will allow introspecting and modifying it,
which is needed to add progress bar display to it.

Only S3 and WebDAV ran an Annex action while constructing ExportActions.
There was a small performance gain from them doing that, since a
resource was able to be prepared and reused for multiple actions by
Command.Export.

As seen in commit 809cfbbd8a and
5d394023eb S3 and WebDAV actually create a
new handle for each access in normal, non-export use. It doesn't seem
worth making export use of them marginally more efficient than normal
use. It would be better to do that work upfront when constructing the
remote. Or perhaps use a MVar to cache a handle.

This commit was sponsored by Nick Piper on Patreon.
2019-01-30 15:11:40 -04:00
Joey Hess
5d394023eb
remove incorrect comment
resourcePrepare does not cause the resource to only be prepared once.

The http manager should be reused, which does avoid http connection
overhead, but not because of the use of resourcePrepare.
2019-01-30 14:38:35 -04:00
Joey Hess
809cfbbd8a
prepareS3Handle didn't give any benefits, so remove
I seem to have thought that a Preparer was only run once when a remote
is accessed multiple times, but that is not in fact the case. prepareS3Handle
is run once per access. So, there is no point to it.

That there is some duplicate work done on each access is now apparent.
Luckily, the http manager is reused, so only one http connection is
made. But the S3 creds are loaded repeatedly. Room for improvement here.

This commit was sponsored by Jack Hill on Patreon.
2019-01-30 14:23:39 -04:00
Joey Hess
720e5fda5c
export retrieval fallback to handle S3 remote with partially missing version IDs
When key-based retrieval from a S3 remote with exporttree=yes
appendonly=yes fails, fall back to trying to retrieve from the exported
tree. This allows downloads of files that were exported to such a remote
before versioning was enabled on it.

This is useful at least for a transition for users who got into that
situation, so they can download content from their S3 remote. May want to
remove this in the future though, since normally trying to download the
second time is only extra work.

This commit was sponsored by Brock Spratlen on Patreon.
2019-01-30 13:23:03 -04:00
Joey Hess
ad1d422dd7
fix false positive in export conflict detection
Like the earlier fixed one in Command.Export, it occurred when the same
tree was exported by multiple clones. Previous fix was incomplete since
several other places looked at the list of exported trees to detect when
there was an export conflict. Added a single unified function to avoid
missing any places it needed to be fixed.

This commit was sponsored by mo on Patreon.
2019-01-30 12:36:30 -04:00
Joey Hess
8fc6c11cf1
couple fixes 2019-01-29 15:20:22 -04:00
Joey Hess
a8f1add4d1
S3: Detect when version=yes but an exported file lacks versioning, and refuse to delete it, to avoid data loss.
This commit was sponsored by Denis Dzyubenko on Patreon.
2019-01-29 15:07:27 -04:00
Joey Hess
bb9817ceae
enableremote S3: Do not let versioning=yes be set on existing remote
Because when git-annex lacks S3 version IDs for files stored in the bucket,
deleting them would cause data loss.

Also because git-annex is not able to download unversioned objects from a bucket
when versioning=yes.

This also prevents setting versioning=no. While that would perhaps be
possible to do safely, it would add complexity, and would mean that if
the user accidentially did enableremote versioning=no, they would not be
able to undo it.

This commit was sponsored by Trenton Cronholm on Patreon.
2019-01-29 14:09:50 -04:00
Joey Hess
ee011b3cbb
initremote S3: Automatically enable versioning in S3 buckets when configured with versioning=yes.
Needs not yet released version 0.22 of aws library; with older versions
asks the user to configure the bucket versioning themselves.

Note that S3 endpoints that don't support versioning will cause putBucketVersioning
to throw an exception, so initremote will fail.

This commit was sponsored by Jake Vosloo on Patreon.
2019-01-29 13:46:04 -04:00
Joey Hess
c4977ec1ff
refactoring 2019-01-29 13:42:32 -04:00
Joey Hess
669b305de2
S3: Send a Content-Type header when storing objects in S3
So exports to public buckets can be linked to from web pages.

(When git-annex is built with MagicMime support.)

Thanks to Jared Cosulich for the idea.
2019-01-23 13:08:47 -04:00
Joey Hess
d5f2463702
misctmp cleanup
* Switch to using .git/annex/othertmp for tmp files other than partial
  downloads, and make stale files left in that directory when git-annex
  is interrupted be cleaned up promptly by subsequent git-annex processes.
* The .git/annex/misctmp directory is no longer used and git-annex will
  delete anything lingering in there after it's 1 week old.

Also, in Annex.Ingest, made the filename it uses in the tmp dir be
prefixed with "ingest-" to avoid potentially using a filename used by
some other code.
2019-01-17 16:02:22 -04:00
Joey Hess
96aba8eff7
Revert "cache the serialization of a Key"
This reverts commit 4536c93bb2.

That broke Read/Show of a Key, and unfortunately Key is read in at least
one place; the GitAnnexDistribution data type.

It would be worth bringing this optimisation back, but it would need
either a custom Read/Show instance that preserves back-compat, or
wrapping Key in a data type that contains the serialization, or changing
how GitAnnexDistribution is serialized.

Also, the Eq instance would need to compare keys with and without a
cached seralization the same.
2019-01-16 16:21:59 -04:00
Joey Hess
4536c93bb2
cache the serialization of a Key
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.

It means that every place a Key has any of its fields changed, the cache
has to be dropped. I've grepped and found them all. But, it would be
better to avoid that gotcha somehow..
2019-01-14 16:37:28 -04:00
Joey Hess
d3ab5e626b
rename key2file and file2key
What these generate is not really suitable to be used as a filename,
which is why keyFile and fileKey further escape it. These are just
serializing Keys.

Also removed a quickcheck test that was very unlikely to test anything
useful, since it relied on random chance creating something that looks
like a serialized key. The other test is sufficient for testing what
that was intended to test anyway.
2019-01-14 13:03:35 -04:00
Joey Hess
727767e1e2
make everything build again after ByteString Key changes 2019-01-11 16:39:46 -04:00
Joey Hess
cb375977a6
follow-on changes from MetaData type changes
Including writing and parsing the metadata log files with
bytestring-builder and attoparsec.
2019-01-07 15:51:05 -04:00
Joey Hess
7d51b0c109
import Utility.FileSystemEncoding in Common 2019-01-03 11:37:02 -04:00
Joey Hess
b3c69eaaf8
strict bytestring encoders and decoders
Only had lazy ones before.

Already sped up a few parts of the code.
2019-01-01 14:55:15 -04:00
Joey Hess
9cc6d5549b
convert UUID from String to ByteString
This should make == comparison of UUIDs somewhat faster, and perhaps a
few other operations around maps of UUIDs etc.

FromUUID/ToUUID are used to convert String, which is still used for all
IO of UUIDs. Eventually the hope is those instances can be removed,
and all git-annex branch log files etc use ByteString throughout, for a
real speed improvement.

Note the use of fromRawFilePath / toRawFilePath -- while a UUID usually
contains only alphanumerics and so could be treated as ascii, it's
conceivable that some git-annex repository has been initialized using
a UUID that is not only not a canonical UUID, but contains high unicode
or invalid unicode. Using the filesystem encoding avoids any problems
with such a thing. However, a NUL in a UUID seems extremely unlikely,
so I didn't use encodeBS / decodeBS to avoid their extra overhead in
handling NULs.

The Read/Show instance for UUID luckily serializes the same way for
ByteString as it did for String.
2019-01-01 14:45:33 -04:00
Joey Hess
2e069eb9f6
use putBucket to future-proof
New fields can be added to PutBucket in the future.
2018-12-31 13:09:20 -04:00
Joey Hess
3fdc6fdfa9
remove unused import 2018-12-30 15:18:49 -04:00
Joey Hess
a26514d67e
Fix doubled progress display when downloading an url when -J is used.
downloadUrl uses meteredFile, which sets up one progress meter,
and Remote.Web also uses metered, so two progress meters are displayed for
the same download.

Reversion introduced with the http-conduit switch in
c34152777b -- I don't know why the extra
call to metered was added there.

When -J is not used, the extra progress meter didn't display,
but an extra blank line did get output, which is also fixed.

This commit was sponsored by John Pellman on Patreon.
2018-12-30 12:29:49 -04:00
Joey Hess
3f587d447a
fix webdav reversion
webdav: When initializing, avoid trying to make a directory at the top of
the webdav server, which could never accomplish anything and failed on
nextcloud servers. (Reversion introduced in version 6.20170925.)

This commit was sponsored by mo on patreon.
2018-12-10 12:49:51 -04:00
Joey Hess
4579dd6201
S3: Improve diagnostics when a remote is configured with exporttree and versioning, but no S3 version id has been recorded for a key.
When public access is used for the remote, it complained that the user
needed to set creds to use it, which was just wrong.

When creds were being used, it fell back from trying to use the version ID
to just accessing the key in the bucket, which was ok for non-export
remotes, but wrong for buckets.

In both cases, display a hopefully useful warning.

This should only come up when an existing S3 remote has been exported
to, and then later versioning was enabled.

Note that it would perhaps be possible to fall back from trying to use
retrieveKeyFile when it fails and instead use retrieveKeyFileFromExport,
which may work when S3 version ID is missing. But there are problems
with that approach; how to tell when retrieveKeyFile has failed due to this
rather than a network problem etc? Anyway, that approach would only work
until the file in the export got overwritten, and then it would no
longer be accessible. And with versioning enabled, the user wants old
versions of objects to remain accessible, so it seems better to warn
about the problem as soon as possible, so they can go back and add S3
version IDs.

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
2018-12-06 13:44:37 -04:00
Joey Hess
1308a76bf1
deMaybe credPairRemoteKey
It's always Just
2018-12-04 13:37:43 -04:00
Joey Hess
a25fef36ad
fix json for exportedtrees in conflict
Repeating the same json field with multiple values tends to not be
supported well by json parsers, so list the trees separated by spaces.
2018-12-03 14:43:59 -04:00
Joey Hess
b8f9dea27d
add exportedtree to info
info: When used with an exporttree remote, includes an "exportedtree" info,
which is the tree last exported to the remote. During an export conflict,
multiple values will be listed.

This commit was sponsored by John Pellman on Patreon.
2018-12-03 14:36:00 -04:00
Robert Schütz
32017f8082
replace TORRENT by WITH_TORRENTPARSER 2018-11-27 12:29:25 -04:00
Joey Hess
370757087d
catch lockContentForRemoval exception
removeKey should not throw exceptions, so catch exception there

In Assistant.Unused, keep trying to drop other keys if one drop fails
2018-11-15 15:39:57 -04:00
Joey Hess
d65df7ab21
improve messages around export conflicts
When an export conflict prevents accessing a special remote, be clearer
about what the problem is and how to resolve it.

This commit was sponsored by Trenton Cronholm on Patreon.
2018-11-13 15:50:06 -04:00
Joey Hess
c472c268c4
webapp: Fixed a crash when adding a git remote.
Reversion introduced in 2b66492d6e which added a new cache that needs to be
cleared.
2018-10-29 16:01:08 -04:00
Joey Hess
a622488758
remove CHECKURL-MULTI single url response special case
Removed undocumented special case in handling of a CHECKURL-MULTI response
with only a single file listed. Rather than ignoring the url that was in
the response, use it. This allows external special remotes that want to
provide some better url to do so, although I don't entirely agree with
using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an
undocumented special case that throws data away is just not a good idea.

This could in theory break some external special remote program that relied
on the current behavior, but its seems unlikely that it would because such
a program must already handle the multiple url case, unless it only ever
provides a single url response to CHECKURL-MULTI.

Make addurl --file work with a single item CHECKURL-MULTI response.
It already did for external special remotes due to the special case,
but now it also will for builtin ones like the BitTorrent special remote.

This commit was sponsored by Ilya Shlyakhter on Patron.
2018-10-29 14:52:12 -04:00
Joey Hess
fcca7adaff
instrument P2P --debug with connection and thread info
For debugging http://git-annex.branchable.com/bugs/annex_get_-J_16_via_ssh_stalls_/

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
2018-10-22 15:52:11 -04:00
Joey Hess
c24e255de1
Fix concurrency bug that occurred on the first download from an exporttree remote
Block other threads while the export database is being constructed (or
updated) by the first thread to try to access it.

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
2018-10-22 12:59:10 -04:00
Joey Hess
a9dd087074
centralized "yes"/"no" parsing
This commit was sponsored by Jack Hill on Patreon.
2018-10-10 11:14:27 -04:00
Joey Hess
6f0d8870df
Fix crash when exporttree is set to a bad value.
Made it impossible to recover from setting a bad value since enableremote
to change it would crash.

This commit was sponsored by Henrik Riomar on Patreon.
2018-10-10 10:44:54 -04:00
Joey Hess
451171b7c1
clean up url removal presence update
* rmurl: Fix a case where removing the last url left git-annex thinking
  content was still present in the web special remote.
* SETURLPRESENT, SETURIPRESENT, SETURLMISSING, and SETURIMISSING
  used to update the presence information of the external special remote
  that called them; this was not documented behavior and is no longer done.

Done by making setUrlPresent and setUrlMissing only update presence info
for the web, and only when the url is a web url. See the comment for
reasoning about why that's the right thing to do.

In AddUrl, had to make it update location tracking, to handle the
non-web-url case.

This commit was sponsored by Ewen McNeill on Patreon.
2018-10-04 17:35:49 -04:00
Joey Hess
303d10cee6
Improve display when git config download from a http remote fails.
The error message displayed used to only come from curl/wget and perhaps
was clearer than the one displayed now that http-client is used. In any
case, it does make sense to hide it because git-annex prints its own
warning message.

This commit was sponsored by Jake Vosloo on Patreon.
2018-10-03 12:31:09 -04:00
Joey Hess
6134431254
clean P2P protocol shutdown on EOF try 2
Same goal as b18fb1e343 but without
breaking backwards compatability. Just return IO exceptions when running
the P2P protocol, so that git-annex-shell can detect eof and avoid the
ugly message.

This commit was sponsored by Ethan Aubin.
2018-09-25 16:49:59 -04:00
Joey Hess
bc31b93c77
remote.name.annex-security-allow-unverified-downloads
Added remote.name.annex-security-allow-unverified-downloads, a per-remote
setting for annex.security.allow-unverified-downloads.

This commit was sponsored by Brock Spratlen on Patreon.
2018-09-25 15:34:47 -04:00
Joey Hess
773084c49b
S3: Fix url construction bug
When the publicurl has been set to an url that does not end with a slash,
we need to add one in between it and the rest of the url.

As far as I can see, git-annex does not default to such publicurls; it's
careful to end them with slashes. But this was observed in the wild, and
there may be documentation that doesn't include the slash. And it's an easy
mistake to make in any case.

This commit was sponsored by Eric Drechsel on Patreon.
2018-09-14 12:25:23 -04:00
Joey Hess
677038199c
fix build with older aws
S3: Multipart uploads are now only supported when git-annex is built
with aws-0.16.0 or later, as earlier versions of the library don't
support versioning with multipart uploads.

This will affect the android build, and debian stable also has a too old
aws to support both features at the same time.

This commit was sponsored by Nick Piper on Patreon.
2018-09-13 09:58:39 -04:00
Joey Hess
445ea66732
simplify 2018-09-06 16:07:16 -04:00
Joey Hess
b7daf2685f
support public versioned S3 access
Makes git annex whereis display the versionId urls.

And, when a s3 remote is enabled without creds, git-annex will use the
versionId urls to access its contents.

This commit was sponsored by Fernando Jimenez on Patreon.
2018-09-06 14:31:41 -04:00
Joey Hess
7407a80c27
S3: Support AWS_SESSION_TOKEN
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2018-09-05 15:53:57 -04:00
Joey Hess
53d839d543
more efficient encoding 2018-08-31 13:49:08 -04:00
Joey Hess
b3d42283ad
use per-remote metadata storage for S3 version ID
Since the same key can be stored in a versioned S3 bucket multiple times
with different version IDs, this allows tracking them all. Not currently
needed, but if we ever want to drop from a versioned S3 bucket, we'll
need to know them all.

This commit was supported by the NSF-funded DataLad project.
2018-08-31 13:27:29 -04:00
Joey Hess
6b75b9c448
turn on appendonly when versioning is enabled 2018-08-31 10:53:07 -04:00
Joey Hess
19dcff2b71
use S3 version ID for retrieval
Have to store the S3 object along with the version ID, so retrieval can
use the same object.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 15:37:08 -04:00
Joey Hess
794e9a7a44
store S3 version IDs
Only done when versioning=yes is configured. It could always do it when
S3 sends back a version id, but there may be buckets that have
versioning enabled by accident, so it seemed better to honor the
configuration.

S3's docs say version IDs are "randomly generated", so presumably
storing the same content twice gets two different ones not the same one.
So I considered storing a list of version IDs for a key. That would
allow removing the key completely. But.. The way Logs.RemoteState works,
when there are multiple writers, the last writer wins. So storing a list
would need a different log format that merges, which seemed overkill to support
removing a key from an append-only remote.

Note that Logs.RemoteState for S3 is now dedicated to version IDs.
If something else needs to be stored, a new log will be needed to do it.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 14:30:56 -04:00
Joey Hess
0ff5a41311
S3 versioning=yes config
Not yet used.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 13:45:28 -04:00
Joey Hess
358178fbfb
don't untrust appendonly exports
Make exporttree=yes remotes that are appendonly not be untrusted, and not force
verification of content, since the usual concerns about losing data when an
export is updated by someone else don't apply.

Note that all the remote operations on keys are left as usual for
appendonly export remotes, except for storing content.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 11:48:04 -04:00
Joey Hess
8b39db20b5
export appendonly support
Make `git annex export` check appendonly when removing a file from an
export, and not update the location log, since the remote still contains
the content.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 11:18:20 -04:00
Joey Hess
02630b39ee
add Remote.readonly
Does nothing yet.

Considered making bup readonly, but while the content can't be removed,
it is able to delete a branch, so didn't.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 11:12:18 -04:00
Joey Hess
ae11394efa
added annex.commitmessage
Added annex.commitmessage config that can specify a commit message for the
git-annex branch instead of the usual "update".

This commit was supported by the NSF-funded DataLad project.
2018-08-02 14:06:06 -04:00
Joey Hess
2884637cab
S3: Support credential-less download from remotes configured with public=yes exporttree=yes.
This commit was supported by the NSF-funded DataLad project.
2018-07-31 16:32:43 -04:00
Joey Hess
9f3a346f25
fix nested exception bug
Fix reversion introduced in version 6.20180316 that caused git-annex to
stop processing files when unable to contact a ssh remote.

The bug was not in any of the changed lines, but this one in inAnnex:

P2PHelper.checkpresent (Ssh.runProto rmt connpool (cantCheck rmt) fallback) key

cantCheck throws an exception, but that parameter to runProto expects a
value, which it returns. So, inAnnex is returning a Bool containing an
exception. This defeats the usual checks for checkPresent throwing an
exception, crashing git-annex.

Fixed by making runProto take an `Annex a` instead of an `a`, so
passing cantCheck to it doesn't nest exceptions.

This commit was sponsored by andrea rota.
2018-07-03 13:10:43 -04:00
Joey Hess
dc6cb6aa5f
Merge branch 'later' 2018-06-25 21:59:20 -04:00
Joey Hess
a5228ac765
Support configuring remote.web.annex-cost and remote.bittorrent.annex-cost
Seems that has never worked before due to oversight.
2018-06-24 17:31:22 -04:00
Joey Hess
05ecee0db4
set ddar to RetrievalAllKeysSecure
Based on information from Robie Basak.
2018-06-21 16:38:47 -04:00
Joey Hess
f1b29dbeb4
don't assume boto will remain secure
On second thought, best to default to being secure even if boto changes
http libraries to one that happens to follow redirects.
2018-06-21 14:14:56 -04:00