Commit graph

40772 commits

Author SHA1 Message Date
Austin
d11ca305e3 2021-08-18 06:17:46 +00:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
ead94daca4 commit f0754a61f doesn't compile on Windows without a small patch 2021-08-17 21:15:04 +00:00
Joey Hess
f0754a61f5
plumb VerifyConfig into retrieveKeyFile
This fixes the recent reversion that annex.verify is not honored,
because retrieveChunks was passed RemoteVerify baser, but baser
did not have export/import set up.

Sponsored-by: Dartmouth College's DANDI project
2021-08-17 12:43:13 -04:00
Joey Hess
4bbc6a25fa
comment 2021-08-17 10:28:18 -04:00
yarikoptic
b80dd29d6a initial report about needing newer gpg to get tests pass 2021-08-16 21:58:07 +00:00
yarikoptic
08ca0d8961 Added a comment: note on sec= on the mount 2021-08-16 21:38:13 +00:00
Joey Hess
ffa1f6ed30
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-16 17:30:04 -04:00
Joey Hess
f8463ad52f
status update 2021-08-16 17:29:39 -04:00
Joey Hess
8613770b06
incremental verify for webdav special remote
Sponsored-by: Dartmouth College's DANDI project
2021-08-16 17:29:32 -04:00
Joey Hess
b1622eb932
incremental verify for directory special remote
Added fileRetriever', which will let the remaining special remotes
eventually also support incremental verify.

Sponsored-by: Dartmouth College's DANDI project
2021-08-16 16:51:33 -04:00
Joey Hess
a644f729ce
refactor fileCopier
Sponsored-by: Dartmouth College's DANDI project
2021-08-16 15:56:24 -04:00
Joey Hess
d889ae0c01
move comment 2021-08-16 15:25:06 -04:00
Joey Hess
ec82299730
status update
I was wrong about S3 supporting tailVerify.
2021-08-16 15:15:32 -04:00
Joey Hess
aac0654ff4
handle AlreadyInUseError
As happens when using the directory special remote, gitlfs, webdav, and
S3. But not external, adb, gcrypt, hook, or rsync.

Sponsored-by: Dartmouth College's DANDI project
2021-08-16 15:03:48 -04:00
Joey Hess
c4aba8e032
better handling of finishing up incomplete incremental verify
Now it's run in VerifyStage.

I thought about keeping the file handle open, and resuming reading where
tailVerify left off. But that risks leaking open file handles, until the
GC closes them, if the deferred verification does not get resumed. Since
that could perhaps happen if there's an exception somewhere, I decided
that was too unsafe.

Instead, re-open the file, seek, and resume.

Sponsored-by: Dartmouth College's DANDI project
2021-08-16 14:52:59 -04:00
Joey Hess
e0b7f391bd
improve tailVerify
Wait for the file to get modified, not only opened. This way, if a
remote does not support resuming, and opens a new file over top of the
existing file, it will wait until that remote starts writing, and open
the file it's writing to, not the old file.

Sponsored-by: Dartmouth College's DANDI project
2021-08-16 14:47:37 -04:00
yarikoptic
b66308a214 update description that it is isilon under 2021-08-16 17:44:01 +00:00
matthias.risze@9f2c8f7faed4cac1905d1bf1ee4524d708c13688
03ca13f5c5 Added a comment: type=git special remote cannot be enabled, no uuid is generated 2021-08-16 13:02:04 +00:00
https://christian.amsuess.com/chrysn
fa69431266 migrate script: Get full list of remotes that have a file; doc updates; progress output; corner case fixes 2021-08-15 19:01:37 +00:00
https://christian.amsuess.com/chrysn
905fef31b3 Added a comment: Another example 2021-08-15 17:42:54 +00:00
https://christian.amsuess.com/chrysn
29a1274f99 migrate script: Do whereis work before to speed up processing 2021-08-15 11:48:58 +00:00
https://christian.amsuess.com/chrysn
69fc2a22b3 Added a comment: annex-to-annex 2021-08-15 11:47:31 +00:00
https://christian.amsuess.com/chrysn
a5f620a1d9 migrate script: Fix accidentally commented-out fsck run 2021-08-15 11:04:37 +00:00
https://christian.amsuess.com/chrysn
088d6c4cd0 migrate script: Don't try on all remotes, look where it is and drop from there (faster) 2021-08-15 10:37:53 +00:00
spwhitton
9e801dc906 Added a comment: annex-to-annex 2021-08-13 22:08:57 +00:00
https://christian.amsuess.com/chrysn
a59637412c Convert previously missing attachment 2021-08-13 20:58:57 +00:00
https://christian.amsuess.com/chrysn
cb36451f26 Tool to drop migrated files for good 2021-08-13 20:53:48 +00:00
Joey Hess
e46a7dff6f
fix windows build 2021-08-13 16:36:33 -04:00
Joey Hess
037bf68269
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-13 16:35:20 -04:00
Joey Hess
751242b55e
status update 2021-08-13 16:34:18 -04:00
Joey Hess
16dd3dd4ca
catch more exceptions
I saw this:

  .git/annex/tmp/SHA256E-s1234376--5ba8e06e0163b217663907482bbed57684d7188024155ddc81da0710dfd2687d: openBinaryFile: resource busy (file is locked)

 guess catching IO exceptions did not catch that one.
2021-08-13 16:16:46 -04:00
Joey Hess
dadbb510f6
incremental hashing for fileRetriever
It uses tailVerify to hash the file while it's being written.

This is able to sometimes avoid a separate checksum step. Although
if the file gets written quickly enough, tailVerify may not see it
get created before the write finishes, and the checksum still happens.

Testing with the directory special remote, incremental checksumming did
not happen. But then I disabled the copy CoW probing, and it did work.
What's going on with that is the CoW probe creates an empty file on
failure, then deletes it, and then the file is created again. tailVerify
will open the first, empty file, and so fails to read the content that
gets written to the file that replaces it.

The directory special remote really ought to be able to avoid needing to
use tailVerify, and while other special remotes could do things that
cause similar problems, they probably don't. And if they do, it just
means the checksum doesn't get done incrementally.

Sponsored-by: Dartmouth College's DANDI project
2021-08-13 15:43:29 -04:00
Joey Hess
ff2dc5eb18
INotify.removeWatch can crash
Unsure why, possibly if the file has been replaced by another file.
2021-08-13 15:35:18 -04:00
Joey Hess
7503b8448b
inotify reports paths relative to directory being watched
Sponsored-by: Dartmouth College's DANDI project
2021-08-13 14:51:15 -04:00
Joey Hess
e07625df8a
convert tailVerify to not finalize the verification
Added failIncremental so it can force failure to verify.

Sponsored-by: Dartmouth College's DANDI project
2021-08-13 13:39:02 -04:00
Joey Hess
9d533b347f
tailVerify: return deferred action when it gets behind
Sponsored-by: Dartmouth College's DANDI project
2021-08-13 12:32:01 -04:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
41ef5da4e0 the fact that I needed a modification/patch to build mentioned 2021-08-13 03:42:10 +00:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
3dc6c7a9a0 prop_view_roundtrips fails (occasionally) 2021-08-13 03:31:45 +00:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
57884e5442 windows build fails as of 7550ef9a2 2021-08-13 02:17:50 +00:00
Joey Hess
7550ef9a2c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-12 14:50:12 -04:00
Joey Hess
51d59fb260
comment 2021-08-12 14:49:48 -04:00
Joey Hess
b6efba8139
add tailVerify
Not yet used, but this will let all remotes verify incrementally if it's
acceptable to pay the performance price. See comment for details of when
it will perform badly. I anticipate using this for all special remotes
that use fileRetriever. Except perhaps for a few like GitLFS that could
feed the incremental verifier themselves despite using that.

Sponsored-by: Dartmouth College's DANDI project
2021-08-12 14:38:02 -04:00
yarikoptic
6318c0f27f a report on the flood of failing tests on discovery 2021-08-11 20:25:51 +00:00
Joey Hess
2e54564061
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-11 14:51:05 -04:00
jasonb@ab4484d9961a46440958fa1a528e0fc435599057
285026eb91 Added a comment: I have this behavior consistently on the 2 repos I use 2021-08-11 18:49:41 +00:00
Joey Hess
7eb3742e4b
incremental verify for chunked remotes
Simply feed each chunk in turn to the incremental verifier.

When resuming an interrupted retrieve, it does not do incremental
verification. That would need to read the file, up to the resume point,
and feed it to the incremental verifier. That seems easy to get wrong.
Also it would mean extra work done before the transfer can start. Which
would complicate displaying progress, and would perhaps not appear to the
user as if it was resuming from where it left off. Instead, in that
situation, return UnVerified, and let the verification be done in a
separate pass.

Granted, Annex.CopyFile does manage all that, but it's not complicated
by dealing with chunks too.

Sponsored-by: Dartmouth College's DANDI project
2021-08-11 14:42:49 -04:00
Lukey
e134f411d4 Added a comment 2021-08-11 18:25:51 +00:00
Joey Hess
c20358b671
incremental verify for byteRetriever special remotes
Several special remotes verify content while it is being retrieved,
avoiding a separate checksum pass. They are: S3, bup, ddar, and
gcrypt (with a local repository).

Not done when using chunking, yet.

Complicated by Retriever needing to change to be polymorphic. Which in turn
meant RankNTypes is needed, and also needed some code changes. The
change in Remote.External does not change behavior at all but avoids
the type checking failing because of a "rigid, skolem type" which
"would escape its scope". So I refactored slightly to make the type
checker's job easier there.

Unfortunately, directory uses fileRetriever (except when chunked),
so it is not amoung the improved ones. Fixing that would need a way for
FileRetriever to return a Verification. But, since the file retrieved
may be encrypted or chunked, it would be extra work to always
incrementally checksum the file while retrieving it. Hm.

Some other special remotes use fileRetriever, and so don't get incremental
verification, but could be converted to byteRetriever later. One is
GitLFS, which uses downloadConduit, which writes to the file, so could
verify as it goes. Other special remotes like web could too, but don't
use Remote.Helper.Special and so will need to be addressed separately.

Sponsored-by: Dartmouth College's DANDI project
2021-08-11 14:20:38 -04:00
gabrielhidasy@c3d26e2c0b3e669d012f06736616088b42ad0dbe
b9a9273a87 2021-08-11 16:29:37 +00:00
Joey Hess
9518aca2f5
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-11 12:16:49 -04:00