Commit graph

45471 commits

Author SHA1 Message Date
Joey Hess
55adbb6694
avoid trying to export tree to proxied exporttree=yes remotes
This avoids a lot of ugly messages when syncing with such a remote.
The export tree happens on the proxy side.
2024-08-07 13:00:19 -04:00
Joey Hess
6d96734128
updateproxy, updatecluster check annexobjects=yes
updateproxy, updatecluster: Prevent using an exporttree=yes special remote
that does not have annexobjects=yes, since it will not work.
2024-08-07 12:27:24 -04:00
Joey Hess
8864a9e353
update 2024-08-07 11:49:53 -04:00
Joey Hess
1e0f13ad7f
comment 2024-08-07 11:39:29 -04:00
Joey Hess
b8f8c38e88
Merge branch 'master' into exportreeplus 2024-08-07 11:28:21 -04:00
Joey Hess
509b23fa00
catch ClientError from withClientM
When getting from a P2P HTTP remote, prompt for credentials when required,
instead of failing.

This feels like it might be a bug in servant-client. withClientM's type
suggests it would not throw a ClientError. But it does in this case.
2024-08-07 11:24:34 -04:00
Joey Hess
43e1f590c9
comment 2024-08-07 10:47:47 -04:00
Joey Hess
1038567881
proxy stores received keys to known export locations
This handles the workflow where the branch is first pushed to the proxy,
and then files in the exported tree are later are copied to the proxied remote.

Turns out that the way the export log is structured, nothing needs
to be done to finalize the export once the last key is sent to it. Which
is great because that would have been a lot of complication. On
receiving the push, Command.Export runs and calls recordExportBeginning,
does as much as it can to update the export with the files currently
on it, and then calls recordExportUnderway. At that point, the
export.log records the export as "complete", but it's not really. And
that's fine. The same happens when using `git-annex export` when some
files are not available to send. Other repositories that have
access to the special remote can already retrieve files from it. As
the missing files get copied to the exported remote, all that needs
to be done is record each in the export db.

At this point, proxying to exporttree=yes annexobjects=yes special remotes
is fully working. Except for in the case where multiple files in the
tree use the same key, and the files are sent to the proxied remote
before pushing the tree.

It seems that even special remotes without annexobjects=yes will work if
used with the workflow where the git-annex branch is pushed before
copying files. But not with the `git-annex push` workflow.
2024-08-07 09:47:34 -04:00
matrss
3ccbcc5662 2024-08-07 12:12:29 +00:00
git-annex@82b5fddc759dffdf749b19add6f0be2a0c78b62c
d3cc84db3b 2024-08-07 12:05:53 +00:00
git-annex@82b5fddc759dffdf749b19add6f0be2a0c78b62c
e8f60e7daa 2024-08-07 12:04:42 +00:00
Yaroslav Halchenko
46339151be
Refresh standlone patch to avoid fuzz and offsets 2024-08-06 16:39:48 -04:00
Joey Hess
bb9b02b723
remove unused imports 2024-08-06 14:49:20 -04:00
Joey Hess
ba1cb517c0
update 2024-08-06 14:46:56 -04:00
Joey Hess
c53f61e93f
Merge branch 'master' into exportreeplus 2024-08-06 14:46:33 -04:00
Joey Hess
f01d872059
fixed 2024-08-06 14:42:46 -04:00
Joey Hess
b1024ff181
Merge remote-tracking branch 'origin/master' 2024-08-06 14:42:08 -04:00
Joey Hess
3cc03b4c96
fix file corruption when proxying an upload to a special remote
The file corruption consists of each chunk of the file being duplicated.
Since chunks are typically a fixed size, it would certianly be possible
to get from a corrupted file back to the original file. But this is still
bad data loss.

Reversion was in commit fcc052bed8.
Luckily that did not make the most recent release.
2024-08-06 14:41:19 -04:00
Joey Hess
3289b1ad02
proxying to exporttree=yes annexobjects=yes basically working
It works when using git-annex sync/push/assist, or when manually sending
all content to the proxied remote before pushing to the proxy remote.
But when the push comes before the content is sent, sending content does
not update the exported tree.
2024-08-06 14:21:23 -04:00
Joey Hess
be5c86c248
refine 2024-08-06 12:15:18 -04:00
Joey Hess
4750ffbd3b
finalized design for proxying to exporttree=yes annexobjects=yes special remotes 2024-08-06 11:45:45 -04:00
Joey Hess
84d27cf34f
update 2024-08-06 11:13:51 -04:00
matrss
6d1592f857 2024-08-06 12:44:18 +00:00
Spencer
66ff2bc833 Added a comment: D: Correct 2024-08-05 22:17:55 +00:00
Joey Hess
a535eaa176
rename from annexobjects location on export
(When possible, of course it may not be there, or it may get renamed from
there for another exported file first. Or the remote may not support
renames.)

This will avoids redundant uploads.

An example case where this is important: Proxying to a exporttree remote,
a file is uploaded to it but is not yet in an exported tree. When the
exported tree is pushed, the remote needs to be updated by exporting to
it. In this case, the proxy doesn't have a copy of the file, so it would
need to download it from annexobjects before uploading it to the final
location. With this optimisation, it can just rename it.

However: If a key is used twice in an exported tree, it seems a proxy
will need to download and reupload anyway. Unless a copy operation is
added to exporttree remotes..
2024-08-04 12:19:10 -04:00
Joey Hess
a3d96474f2
rename to annexobjects location on unexport
This avoids needing to re-upload the file again to get it to the
annexobjects location, which git-annex sync was doing when it was
preferred content.

If the file is not preferred content, sync will drop it from the
annexobjects location.

If the file has been deleted from the tree, it will remain in the
annexobjects location until an unused/dropunused pass is done.
2024-08-04 11:58:07 -04:00
Joey Hess
6b63449133
update
Decided not to use the annexobjects location for exportTempName.

There doesn't seem to be any actual benefit to doing that, because an
export that renames to exportTempName always renames it back from that
to another location.

Also the annexobjects directory won't actually help with the paired
rename issue.
2024-08-04 11:34:00 -04:00
Joey Hess
ee076b68f5
strong verification on retrieval from annexobjects location
The file in the annexobjects location may have been renamed from a
previously exported file that got deleted in a subsequent export.
Or it may be renamed to annexobjects temporarily before being renamed to
another name (to handle eg pairwise renames).

But, an exported file is not guaranteed to contain the content of the
key that the local repository last exported there. Another tree could
have been exported from elsewhere in the meantime.

So, files in annexobjects do not necessarily have the content of their
key. And so have to be strongly verified when retrieving. The same as
is done when retrieving exported files.
2024-08-04 11:24:21 -04:00
Joey Hess
fe01a1e7e1
design work on annexobjects remotes 2024-08-03 19:51:03 -04:00
Joey Hess
a4a06404d4
sync --content with annexobjects=true exporttree remotes 2024-08-03 11:39:23 -04:00
Joey Hess
9497bf7fdb
update 2024-08-02 18:50:57 -04:00
Joey Hess
9da2860812
Merge branch 'master' into exportreeplus 2024-08-02 18:45:44 -04:00
Joey Hess
c4352adf6a
in unexport, check for annexobjects presence before updating location log
The key may still be in the annexobjects location.
2024-08-02 18:43:10 -04:00
Joey Hess
069d90eab5
prevent removeKey from annexobjects=yes remote when the key is in the exported tree
Removing the key from the annexobjects location when it's in the
exported tree would leave it in the exported tree, and so succeeding
would update the location log incorrectly. But this also can't remove it
from the exported tree, because that would cause import tree to see a
file got deleted. So, refuse to remove in this situation.

It would be possible to remove from the annexobjects location and then
fail. Then if a key somehow got stored in both the annexobjects location
and the exported tree location(s), the duplicate would be resolved. Not
doing this because first, I don't know how that situation could happen,
and second, it seems wrong for a failed remove to have a side-effect
like that.
2024-08-02 16:45:52 -04:00
Joey Hess
34c10d082d
status 2024-08-02 14:15:05 -04:00
Joey Hess
c34d1da22a
Remove debug output (to stderr)
Accidentially included in last version. Only happens when running code that
uses remoteUrl.
2024-08-02 14:13:29 -04:00
Joey Hess
83fa76733f
status 2024-08-02 14:10:34 -04:00
Joey Hess
28b29f63dc
initial support for annexobjects=yes
Works but some commands may need changes to support special remotes
configured this way.
2024-08-02 14:07:45 -04:00
Joey Hess
169fd414eb
git-remote-annex: use annexLocationsBare
There was no good reason for it to be using annexLocationsNonBare,
and exporttree=yes annexobjects=yes is going to use annexLocationsBare,
so this should as well for consistency.

Since all returned ExportLocations are tried when retrieving objects,
this won't break backwards compatability.
2024-08-02 13:13:44 -04:00
Spencer
cb192eafed removed 2024-08-02 04:37:11 +00:00
Spencer
bec9df965a Added a comment: Necro 2024-08-02 04:10:32 +00:00
Spencer
38de446248 Added a comment: Necro 2024-08-02 04:10:14 +00:00
dmcardle
94e34ca139 2024-08-01 14:25:18 +00:00
dmcardle
b6810e6fee Added a comment 2024-08-01 14:23:41 +00:00
d@403a635aa8eaa8bfa8613acb6a375d9e06ed7001
9e0044e990 2024-08-01 14:19:25 +00:00
d@403a635aa8eaa8bfa8613acb6a375d9e06ed7001
9728762a2c Added a comment 2024-08-01 13:49:41 +00:00
Spencer
dc1f707875 Added a comment: @joey 2024-07-31 20:10:06 +00:00
Joey Hess
3a1f39fbdf
Avoid loading cluster log at startup
This fixes a problem with datalad's test suite, where loading the cluster
log happened to cause the git-annex branch commits to take a different
shape, with an additional commit.

It's also faster though, since many commands don't need the cluster log.

Just fill Annex.clusters with a thunk.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-07-31 15:54:14 -04:00
Joey Hess
7c6c3e703b
clean up build warnings when built w/o servant 2024-07-31 14:07:30 -04:00
Joey Hess
ffba57c9fc
cleanup comments on removed news post 2024-07-31 14:05:48 -04:00