44109 commits

Author SHA1 Message Date
Joey Hess
76e11e4458
Merge branch 'master' into distributedmigration 2023-12-08 14:18:23 -04:00
Joey Hess
257f01729c
distributed migration for pull and sync --content
pull, sync: When operating on content, automatically hard link objects
that have been migrated.

Added annex.syncmigrations config that can be set to false to prevent
pull and sync from migrating object content.

I think that true is a good default for this config, because it avoids
users having to re-download migrated content or learning about migration.
But, some users will surely not like it, whether because it does take some
time (especially for the first git-annex branch scan when there is a long
history), or because they want to deal with it manually, or because their
filesystem doesn't support hard links and they don't want it to copy
objects.

Sponsored-by: k0ld on Patreon
2023-12-08 14:18:18 -04:00
Joey Hess
4ed71b34de
migrate --apply
And avoid migrate --update/--apply migrating when the new key was already
present in the repository, and got dropped. Luckily, the location log
allows distinguishing from the new key never having been present!

That is mostly useful for --apply because otherwise dropped files would
keep coming back until the old objects were reaped as unused. But it
seemed to make sense to also do it for --update, for consistency in edge
cases if nothing else. One case where --update can use it is when one
branch got migrated earlier, and we dropped the file, and now another
branch has migrated the same file.

Sponsored-by: Jack Hill on Patreon
2023-12-08 13:23:46 -04:00
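
The distinction drawn in the migrate --apply message above (a key that was present and then dropped, versus one that was never present) can be sketched roughly as follows. This is only an illustration, not git-annex's code: it assumes location log lines of the form "timestamp status uuid" with status 1 for present, treats any other status as not present, and the function name key_state is hypothetical.

```python
from typing import Iterable, Optional, Tuple


def key_state(log_lines: Iterable[str], uuid: str) -> str:
    """Classify a key for one repository as 'present', 'dropped', or 'never'.

    Assumed line format (an assumption of this sketch): "<timestamp> <status> <uuid>".
    """
    latest: Optional[Tuple[float, str]] = None  # newest (timestamp, status) for uuid
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3 or fields[2] != uuid:
            continue
        # Timestamps may carry a trailing "s"; strip it before parsing.
        timestamp = float(fields[0].rstrip("s"))
        if latest is None or timestamp >= latest[0]:
            latest = (timestamp, fields[1])
    if latest is None:
        # No line at all for this repository: the key was never here.
        return "never"
    return "present" if latest[1] == "1" else "dropped"
```

Under these assumptions, a migration would only be applied when the new key's state is "never", so that dropped content does not keep coming back.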
Joey Hess
51b974d9f0
skip distributed migration to insecure key when annex.securehashesonly is set
This only avoids extra work and a warning message. It seems likely that
in such a situation, the user does not want migrations to insecure
hashes, and so best to ignore them as much as possible. If
the user merges a branch that switches annexed files to an insecure
hash, they will notice that the file contents are unavailable,
and git-annex get will tell them the problem then. So it does not seem
useful to have migrate --update also complain about it.
2023-12-08 12:41:50 -04:00
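
A rough sketch of the check described in the annex.securehashesonly message above: skip a distributed migration whose new key uses an insecure backend. The key layout "BACKEND-s<size>--<name>", the list of secure backend prefixes, and the function names are assumptions made for illustration, not taken from the git-annex source.

```python
# Assumed prefixes of backends considered cryptographically secure.
# This list is an assumption of the sketch, not git-annex's definition.
SECURE_PREFIXES = ("SHA2", "SHA3", "SKEIN", "BLAKE2")


def backend_of(key: str) -> str:
    """Return the backend name, assumed to be the text before the first '-'."""
    return key.split("-", 1)[0]


def should_apply_migration(new_key: str, secure_hashes_only: bool) -> bool:
    """Decide whether a recorded migration to new_key should be acted on."""
    if not secure_hashes_only:
        return True
    # Silently skip migrations to insecure backends instead of warning.
    return backend_of(new_key).startswith(SECURE_PREFIXES)
```

With secure_hashes_only set, a migration to an MD5 key would be skipped quietly, while one to a SHA256E key would proceed.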
Joey Hess
b65379a107
fix missing space in warning message 2023-12-08 12:36:33 -04:00
Joey Hess
30c2728d65
always verify content in distributed migration
doc/todo/distributed_migration.mdwn discusses security of distributed
migration, and this was identified as necessary to do.
2023-12-07 20:05:42 -04:00
Joey Hess
62ce56c4ea
display filenames in migrate --update
Have to go to a lot of bother to find them, but I think it's worth it
for usability.

Sponsored-by: Luke T. Shumaker on Patreon
2023-12-07 18:00:09 -04:00
Joey Hess
abea01d9e0
migrate --update fully working
Could use some more testing.

When the old key is not present, Command.ReKey.linkKey' will return
False, so this handles that case ok.

But, I do wonder if distributed migration may need to deal with the old
key getting copied into the repository later. In that situation,
re-running migrate --update won't link it to the new key. It may be that
some users will need that. They can delete .git/annex/migrate.log and
run it again, but that is not a good user interface. Maybe either have
a way to re-run all distributed migrations, or record migrations
in a database and scan the db to find migrations to do in a future run?

Sponsored-by: Kevin Mueller on Patreon
2023-12-07 17:27:51 -04:00
Joey Hess
7c7c9912c1
migrate --update gets keys
The git log is outputting the diff, but this only looks at the new
files. When we have a new file, we can get the old filename by just
replacing "new" with "old". Then using branchFileRef to refer to it
allows catting the old key.

While this does have to skip past the old files in the diff, it's still
faster than calling git diff separately.

Sponsored-by: Nicholas Golder-Manning on Patreon
2023-12-07 17:25:56 -04:00
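
The "replace new with old" step from the message above might look something like this. The sketch assumes the paths seen in the diff live under migrate.tree/new/ and migrate.tree/old/ (directory names taken from the neighbouring commit message); the function name old_path_for is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def old_path_for(new_path: str) -> Optional[str]:
    """Map migrate.tree/new/<name> to migrate.tree/old/<name>.

    Returns None for any other path, which lets the caller skip past
    the old files that also show up in the diff.
    """
    parts = PurePosixPath(new_path).parts
    if len(parts) < 3 or parts[0] != "migrate.tree" or parts[1] != "new":
        return None
    # Swap only the directory component, so a file name that happens
    # to contain "new" is left untouched.
    return str(PurePosixPath("migrate.tree", "old", *parts[2:]))
```

Swapping the path component rather than doing a plain string replacement is a choice made for this sketch; the commit message does not say how the replacement is implemented.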
Joey Hess
f1ce15036f
started migrate --update
This is most of the way there, but not quite working.

The layout of migrate.tree/ needs to be changed to follow this approach.
git log will list all the files in tree order, so the new layout needs
to alternate old and new keys. Can that be done? git may not document
tree order, or may not preserve it here.

Alternatively, change to using git log --format=raw and extract
the tree header from that, then use
git diff --raw $tree:migrate.tree/old $tree:migrate.tree/new
That will be a little more expensive, but only when there are lots of
migrations.

Sponsored-by: Joshua Antonishen on Patreon
2023-12-07 15:50:52 -04:00
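
The alternative floated in the message above (take the tree header from git log --format=raw output, then diff the old and new subtrees) could be sketched as below. Only the git diff --raw command line comes from the commit message; the function names and the sample input are hypothetical, and the commands are built but not run.

```python
from typing import List, Optional


def tree_of_raw_commit(raw: str) -> Optional[str]:
    """Extract the tree id from one commit printed by git log --format=raw."""
    for line in raw.splitlines():
        if line == "":
            break  # headers end at the first blank line; the message follows
        if line.startswith("tree "):
            return line[len("tree "):].strip()
    return None


def migration_diff_command(tree: str) -> List[str]:
    """Build the git diff invocation named in the commit message."""
    return [
        "git", "diff", "--raw",
        f"{tree}:migrate.tree/old",
        f"{tree}:migrate.tree/new",
    ]


# Hypothetical sample of raw commit output, with made-up ids.
sample = (
    "commit 1111\n"
    "tree 2222\n"
    "parent 3333\n"
    "author A <a@example.com> 0 +0000\n"
    "committer A <a@example.com> 0 +0000\n"
    "\n"
    "    example message\n"
)
tree = tree_of_raw_commit(sample)
command = migration_diff_command(tree) if tree else []
```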
Joey Hess
d06aee7ce0
make commitMigration interruption safe
Fixed inversion of control issue, so the tree is recorded
in streamLogFile finalizer.

Sponsored-by: Leon Schuermann on Patreon
2023-12-06 16:29:58 -04:00
Joey Hess
adc95a871d
comment 2023-12-06 15:42:40 -04:00
Joey Hess
0bd8b17b59
log migration trees to git-annex branch
This will allow distributed migration: Start a migration in one clone of
a repo, and then update other clones.

commitMigration is a bit of a bear.. There is some inversion of control
that needs some TMVars. Also streamLogFile's finalizer does not handle
recording the trees, so an interrupt at just the wrong time can cause
migration.log to be emptied but the git-annex branch not updated.

Sponsored-by: Graham Spencer on Patreon
2023-12-06 15:40:03 -04:00
Joey Hess
b55efc179a
add startAction parameter for KeySha
I have a use planned for this in Command.Migrate.

Sponsored-by: unqueued on Patreon
2023-12-06 13:28:02 -04:00
Joey Hess
1f811c340d
kinda a bug 2023-12-05 16:43:14 -04:00
Joey Hess
b4cd985a3e
remove xmpp from special remotes list
It's documentation for something that was removed, so avoid it getting
copied into eg, nice talks about git-annex. ;-)
2023-12-05 16:30:47 -04:00
Joey Hess
1a586f80e6
remove debug print 2023-12-05 15:56:58 -04:00
Joey Hess
10964f91bc
further thoughts 2023-12-05 15:00:22 -04:00
Joey Hess
ede36eeb86
Merge branch 'master' of ssh://git-annex.branchable.com 2023-12-05 13:38:01 -04:00
Joey Hess
68ea9d5a25
comment 2023-12-05 13:37:34 -04:00
nobodyinperson
2efef85bd0 Add link to English re-recording of Yann's git-annex workshop kickoff talk @Tübix2023 2023-12-05 17:18:50 +00:00
Joey Hess
63f940f591
Revert "update"
This reverts commit 6f4e3cc881.
2023-12-05 12:39:33 -04:00
Joey Hess
6f4e3cc881
update 2023-12-05 12:39:17 -04:00
Joey Hess
a6eb7d7339
prevent relatedTemplate from truncating a filename to end in "."
Avoid a problem with temp file names ending in "." on certain filesystems
that have problems with such filenames.

relatedTemplate is quite an ugly hack really; since it doesn't know the max
filename length of the filesystem it can only assume that the filename is
max allowed length. When given the input "lh.aparc.DKTatlas.annot", it
wants to reserve 20 characters for tempfile so it truncates to "lh.". That
ending period is apparently a problem on some filesystems (FAT eats it, but
does not throw EINVAL; ntfs does not seem bothered by it, I don't know what
FUSE filesystem the bug reporter was really using).

Sponsored-by: Brett Eisenberg on Patreon
2023-12-05 12:38:14 -04:00
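
The truncation problem described in the relatedTemplate message above can be reproduced with a small sketch. It assumes lengths are counted in characters and uses the 20-character reserve mentioned in the message; related_template is a hypothetical Python stand-in, not the Haskell function itself.

```python
def related_template(name: str, reserve: int = 20) -> str:
    """Shorten name so that reserve characters remain for a temp file suffix.

    Trailing periods left behind by the cut are dropped, since some
    filesystems have trouble with file names ending in ".".
    """
    if len(name) > reserve:
        truncated = name[: len(name) - reserve]
        return truncated.rstrip(".")
    return name


# The 23-character example from the message: the bare cut gives "lh.",
# and stripping the trailing period leaves "lh".
example = related_template("lh.aparc.DKTatlas.annot")
```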
Joey Hess
9aa53212a9
Merge branch 'master' of ssh://git-annex.branchable.com 2023-12-05 12:10:46 -04:00
cjmarkie
545f3873ca No change. Just subscribing to comments. 2023-12-05 16:04:46 +00:00
cjmarkie
f6f4ba3c6c 2023-12-05 14:54:11 +00:00
https://esgf-node.llnl.gov/esgf-idp/openid/mvhulten
9a9d99efeb 2023-12-05 14:10:07 +00:00
https://esgf-node.llnl.gov/esgf-idp/openid/mvhulten
5d22ccc584 rename forum/name_resolution_of___33__dne__33___fails.mdwn to forum/name_resolution_of_dne.mdwn 2023-12-05 14:08:52 +00:00
https://esgf-node.llnl.gov/esgf-idp/openid/mvhulten
8c55d65987 rename forum/dne.mdwn to forum/name_resolution_of___33__dne__33___fails.mdwn 2023-12-05 14:07:57 +00:00
https://esgf-node.llnl.gov/esgf-idp/openid/mvhulten
8ee111cd8e 2023-12-05 14:05:08 +00:00
brendan.ward@a2e11ad27f6b2fa2c556aea6811496e0d95dd0da
b34d9b1405 Added a comment 2023-12-05 03:24:29 +00:00
kolam
3e6dba097e 2023-12-04 19:32:28 +00:00
Joey Hess
ecebb00a23
Merge branch 'master' of ssh://git-annex.branchable.com 2023-12-04 13:52:59 -04:00
Joey Hess
383c9833a3
comment 2023-12-04 13:52:51 -04:00
nobodyinperson
9906e8fd4c Added a comment: How about a --offline flag? 2023-12-04 17:52:44 +00:00
Joey Hess
0485dd3161
sync: Fix locking problems during merge when annex.pidlock is set
Presumably git merge sometimes needs to verify if a worktree file is
modified, and so will then run git-annex filter-process which would try to
take the pid lock. And for whatever reason, git-annex sync already had the
pidlock held. I have not replicated that, but it does make enough sense to
deploy the workaround.

Like I said back in commit 7bdb0cdc0d,

   Arguably, it would be better to have a way to make any process git-annex
   runs have the env var set. But then it would need to take the pid lock
   when running any and all processes, and that would be a problem when
   git-annex runs two processes concurrently. So, I'm left doing it ad-hoc
   in places where git-annex really does run a child process, directly
   or indirectly via a particular git command.

Sponsored-by: KDM on Patreon
2023-12-04 13:40:28 -04:00
Joey Hess
37ff9b6401
comment 2023-12-04 13:03:16 -04:00
Joey Hess
3549984cac
comment 2023-12-04 12:49:25 -04:00
Joey Hess
458b3d8e52
Merge branch 'master' of ssh://git-annex.branchable.com 2023-12-04 11:15:25 -04:00
Joey Hess
fd0b510573
improve message about 1 copy
"Could only verify the existence of 0 out of 1 necessary copy"
does not sound right, but neither does it with "copies".

Kept the "1" rather than "only" or such since numcopies is mentioned.

Sponsored-by: Brock Spratlen on Patreon
2023-12-04 11:12:54 -04:00
kdm9
39fed07289 Added a comment 2023-12-04 10:09:16 +00:00
brendan.ward@a2e11ad27f6b2fa2c556aea6811496e0d95dd0da
49374fd9c6 2023-12-04 06:43:03 +00:00
brendan.ward@a2e11ad27f6b2fa2c556aea6811496e0d95dd0da
4e7f4441bc 2023-12-04 06:41:28 +00:00
Atemu
a0540498b4 Added a comment 2023-12-03 21:11:19 +00:00
branch
9eee11d7a8 Added a comment 2023-12-03 11:57:56 +00:00
kdm9
92f37d0d49 new pidlock bug 2023-12-03 10:16:43 +00:00
kolam@976e5fa601b60de70b53dad291714218fd749169
98a0623ab6 rename forum/Can__39__t_access_file_from_secondary_client.mdwn to forum/client_repositories_setup_problem.mdwn 2023-12-02 19:06:00 +00:00
kolam@976e5fa601b60de70b53dad291714218fd749169
a055cb76ca 2023-12-02 18:16:00 +00:00
Joey Hess
edf31a2ebc
update 2023-12-01 15:01:45 -04:00