Commit graph

34189 commits

Author SHA1 Message Date
Joey Hess
c6669990fb
update 2024-05-28 09:19:00 -04:00
nobodyinperson
f6c0f55ad1 Added a comment: Yep, would be nice! 2024-05-28 12:18:59 +00:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92
bab6d3e58f Added a comment: Re: worktree provisioning 2024-05-28 12:06:39 +00:00
Joey Hess
c2483f6e6d
update 2024-05-27 22:44:35 -04:00
derphysiker
f90511ec43 2024-05-27 18:33:44 +00:00
Joey Hess
0975e792ea
git-remote-annex: Fix error display on clone
cleanupInitialization gets run when an exception is thrown, so needs to
avoid throwing exceptions itself, as that would hide the error message
that the user needs to see.
2024-05-27 13:28:05 -04:00
Joey Hess
a766475d14
split out a todo 2024-05-27 12:50:46 -04:00
Joey Hess
e64add7cdf
git-remote-annex: support importrree=yes remotes
When exporttree=yes is also set. Probably it would also be possible to
support ones with only importtree=yes, by enabling exporttree=yes for
the remote only when using git-remote-annex, but let's keep this
simple... I'm not sure what gets recorded in .git/annex/ state
differently in the two cases that might cause a problem when doing that.

Note that the full annex:: urls generated and displayed for such a
remote omit the importree=yes. Which is ok, cloning from such an url
uses an exporttree=remote, but the git-annex branch doesn't get written
by this program, so once the real config is available from the git-annex
branch, it will still function as an importree=yes remote.
2024-05-27 12:35:42 -04:00
Joey Hess
5a48f7b34e
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-27 11:16:25 -04:00
Joey Hess
6f51ba740d
comment 2024-05-27 11:16:21 -04:00
derphysiker
9442937865 Added a comment 2024-05-25 13:00:08 +00:00
Joey Hess
5e0d9c2029
comment 2024-05-24 17:32:50 -04:00
Joey Hess
4c8e57b907
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-24 17:16:39 -04:00
Joey Hess
19418e81ee
git-remote-annex: Display full url when using remote with the shorthand url 2024-05-24 17:15:31 -04:00
derphysiker
2d380f9941 Added a comment 2024-05-24 20:32:15 +00:00
Joey Hess
04a256a0f8
work around git "defense in depth" breakage with git clone checking for hooks
This git bug also broke git-lfs, and I am confident it will be reverted
in the next release.

For now, cloning from an annex:: url wastes some bandwidth on the next
pull by not caching bundles locally.

If git doesn't fix this in the next version, I'd be tempted to rethink
whether bundle objects need to be cached locally. It would be possible to
instead remember which bundles have been seen and their heads, and
respond to the list command with the heads, and avoid unbundling them
agian in fetch. This might even be a useful performance improvement in
the latter case. It would be quite a complication to a currently simple
implementation though.
2024-05-24 15:49:53 -04:00
Joey Hess
6ccd09298b
convert srcref to a sha
This fixes pushing a new ref that is the same as something already
pushed. In findotherprereq, it compares two shas, which didn't work when
one is actually not a sha but a ref.

This is one of those cases where Sha being an alias for Ref makes it
hard to catch mistakes. One of these days those need to be
differentiated at the type level, but not today..
2024-05-24 15:33:35 -04:00
Joey Hess
96c66a7ca9
bug 2024-05-24 15:15:42 -04:00
Joey Hess
58301e40d2
sync with special remotes with an annex:: url
Check explicitly for an annex:: url, not just any url. While no built-in
special remotes set an url, except ones that can be synced with, it
seems possible that some external special remote sets an url for its own
use, but did not expect it to be used by git-annex sync et al.

The assistant also syncs with them.
2024-05-24 14:57:29 -04:00
Joey Hess
22bf23782f
initremote, enableremote: Added --with-url to enable using git-remote-annex
Also sets remote.name.fetch to a typical value, same as git remote add does.
2024-05-24 14:29:36 -04:00
Joey Hess
7d61a99da3
todo 2024-05-24 13:57:33 -04:00
Joey Hess
2670508b97
also broke git-remote-annex 2024-05-24 13:35:45 -04:00
Joey Hess
b792b128a0
verified checkprereq
The case documented in its comment worked in a test push and clone.
2024-05-24 13:06:29 -04:00
Joey Hess
90580a2fad
comment 2024-05-24 12:57:11 -04:00
Joey Hess
54fa0e2f79
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-24 12:53:48 -04:00
Joey Hess
1a3c60cc8e
git-remote-annex: avoid bundle object leakage in push race or interrupted push
Locally record the manifest before uploading it or any bundles,
and read it on the next push. Any bundles from the push that are
not included in the currently being pushed manifest will get added
to the outManifest, and so eventually get deleted.

This deals with an interrupted push that is not resumed and instead
something else is pushed. And it deals with a push race that overwrites
the manifest.

Of course, this can't help if one of those situations is followed by
the local repo being deleted. But that's equivilant to doing a git-annex
copy of a new annexed file to a special remote and then deleting the
special repo w/o pushing. In either case the special remote ends up with
a object in it that git-annex doesn't know about.
2024-05-24 12:47:32 -04:00
derphysiker
fc7655324e 2024-05-23 20:49:06 +00:00
Joey Hess
4a77c77d2e
comment 2024-05-22 06:21:27 -04:00
Joey Hess
264c51b4f4
comment 2024-05-22 06:06:18 -04:00
Joey Hess
19ddbf0d74
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-22 04:26:52 -04:00
Joey Hess
4131e31f5c
PATH_MAX 2024-05-22 04:26:36 -04:00
TTTTAAAx
f332234c84 2024-05-22 06:27:30 +00:00
nobodyinperson
1a2bd28a52 Added a comment 2024-05-22 05:02:49 +00:00
datamanager
4b64964072 Added a comment 2024-05-21 23:32:34 +00:00
datamanager
01a085c27d Added a comment 2024-05-21 23:27:17 +00:00
datamanager
d5ab807b55 Added a comment: sourcehut plays nicely 2024-05-21 22:39:28 +00:00
datamanager
5a5a4452f8 Scrub my identifying information! 2024-05-21 22:36:22 +00:00
Joey Hess
5fb307f1c5
comment 2024-05-21 17:47:55 -04:00
Joey Hess
aff6c12949
muh2 2024-05-21 17:43:09 -04:00
Joey Hess
6fe63e4615
muh 2024-05-21 17:42:16 -04:00
Joey Hess
938e714a11
bleh 2024-05-21 17:32:49 -04:00
Joey Hess
10a60183e1
guard pushEmpty 2024-05-21 12:12:44 -04:00
Joey Hess
14c79373c4
update 2024-05-21 12:05:44 -04:00
Joey Hess
be8de26b68
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-21 11:53:34 -04:00
Joey Hess
b3d7ae51f0
fix edge case where git-annex branch does not have config for enabled special remote
One way this could happen is cloning an empty special remote.
A later fetch would then fail.
2024-05-21 11:27:49 -04:00
Joey Hess
3e7324bbcb
only delete bundles on pushEmpty
This avoids some apparently otherwise unsolveable problems involving
races that resulted in the manifest listing bundles that were deleted.

Removed the annex-max-git-bundles config because it can't actually
result in deleting old bundles. It would still be possible to have a
config that controls how often to do a full push, which would avoid
needing to download too many bundles on clone, as well as needing to
checkpresent too many bundles in verifyManifest. But it would need a
different name and description.
2024-05-21 11:13:27 -04:00
Joey Hess
f544946b09
update 2024-05-21 10:20:30 -04:00
Joey Hess
f191f52343
force pushing also does a full push 2024-05-21 10:10:49 -04:00
Joey Hess
b042dfeb0e
emptying pushes only delete 2024-05-21 09:52:35 -04:00
Joey Hess
5d40759470
formalize problem description 2024-05-21 09:35:46 -04:00
nobodyinperson
1ab1ea0bcc Added a comment 2024-05-21 13:25:54 +00:00
datamanager
2135514bc7 Added a comment: Not git's fault, but probably your forge's 2024-05-21 13:12:23 +00:00
nobodyinperson
04aa259cfa Added a comment: Probably not git annex related, but a new git 'feature' 2024-05-21 10:49:58 +00:00
nobodyinperson
f0923985aa Added a comment: Seeing this for the first time today as well 2024-05-21 10:41:08 +00:00
datamanager
d94ce5319b A correction, and small update 2024-05-21 01:16:50 +00:00
datamanager
6280fac98b Initial thread posting 2024-05-21 01:16:02 +00:00
Joey Hess
644ed44ec1
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-20 15:52:44 -04:00
Joey Hess
3a38520aac
avoid interrupted push leaving remote without a manifest
Added a backup manifest key, which is used if the main manifest key is
not present. When uploading a new Manifest, it makes sure that it never
drops one key except when the other key is present.

It's entirely possible for the two manifest keys to get out of sync, due
to races. The main one wins when it's present, it is possible for the
main one being dropped to expose the backup one, which has a different
push recorded.
2024-05-20 15:41:09 -04:00
Joey Hess
594ca2fd3a
update 2024-05-20 14:52:06 -04:00
Joey Hess
34a6db4f15
improve recovery from interrupted push
On push, first try to drop all outManifest keys listed in the current
manifest file, which resumes from an interrupted push that didn't
get a chance to delete those keys.

The new manifest gets its outManifest populated with the keys that were
in the old manifest, plus any of the keys that were unable to be
dropped.

Note that it would be possible for uploadManifest to skip dropping old
keys at all. The old keys would get dropped on the next push. But it
seems better to delete stuff immediately rather than waiting. And the
extra work is limited to push and typically is small.

A remote where dropKey always fails will result in an outManifest that
grows longer and longer. It would be possible to check if the remote
has appendonly = True and avoid populating the outManifest. Of course,
an appendonly remote will grow with every git push anyway. And currently
only Remote.GitLFS sets that, which can't be used as a git-remote-annex
remote anyway.
2024-05-20 13:49:45 -04:00
btester4
cdcf558170 2024-05-20 07:47:09 +00:00
nobodyinperson
034d1a80ba Added a comment: Importing specific directories from sdcard and internal storage 2024-05-19 08:08:46 +00:00
yarikoptic
cb952e762d Added a comment 2024-05-16 22:22:50 +00:00
yarikoptic
3337236bd9 initial report about multiple UUIDs and names for the same remote 2024-05-16 22:17:49 +00:00
Joey Hess
7c7136b6b9
devblog 2024-05-16 13:23:36 -04:00
Joey Hess
ce60211881
add incremental vs full push race to todo
with plan to deal with it
2024-05-16 09:37:28 -04:00
Joey Hess
468de43d66
Merge branch 'master' into git-remote-annex 2024-05-15 17:49:12 -04:00
Joey Hess
b1b6e35d4c
reorg todo 2024-05-15 17:41:55 -04:00
Joey Hess
adcebbae47
clean up git-remote-annex git-annex branch handling
Implemented alternateJournal, which git-remote-annex
uses to avoid any writes to the git-annex branch while setting up
a special remote from an annex:: url.

That prevents the remote.log from being overwritten with the special
remote configuration from the url, which might not be 100% the same as
the existing special remote configuration.

And it prevents an overwrite deleting of other stuff that was
already in the remote.log.

Also, when the branch was created by git-remote-annex, only delete it
at the end if nothing else has been written to it by another command.
This fixes the race condition described in
797f27ab05, where git-remote-annex
set up the branch and git-annex init and other commands were
run at the same time and their writes to the branch were lost.
2024-05-15 17:33:38 -04:00
Joey Hess
d24d8870c5
todo 2024-05-15 14:33:13 -04:00
Joey Hess
2dfffa0621
bugfix
When pushing branch foo, we don't want to delete other tracking
branches. In particular, a full push needs all the tracking branches.
2024-05-14 16:17:27 -04:00
Joey Hess
169e673ad4
result of some testing 2024-05-14 16:01:24 -04:00
Joey Hess
7dd2a67c41
fix names of new git configs 2024-05-14 15:33:47 -04:00
Joey Hess
0722c504c5
update docs for git-remote-annex 2024-05-14 15:31:16 -04:00
Joey Hess
23c4125ed4
mention other commands shipped with git-annex in SEE ALSO in man page 2024-05-14 15:23:45 -04:00
Joey Hess
24af51e66d
git-annex unused --from remote skips its git-remote-annex keys
This turns out to only be necessary is edge cases. Most of the
time, git-annex unused --from remote doesn't see git-remote-annex keys
at all, because it does not record a location log for them.

On the other hand, git-annex unused does find them, since it does not
rely on the location log. And that's good because they're a local cache
that the user should be able to drop.

If, however, the user ran git-annex unused and then git-annex move
--unused --to remote, the keys would have a location log for that
remote. Then git-annex unused --from remote would see them, and would
consider them unused. Even when they are present on the special remote
they belong to. And that risks losing data if they drop the keys from
the special remote, but didn't expect it would delete git branches they
had pushed to it.

So, make git-annex unused --from skip git-remote-annex keys whose uuid
is the same as the remote.
2024-05-14 15:17:40 -04:00
Joey Hess
0bf72ef103
max-git-bundles config for git-remote-annex 2024-05-14 14:23:40 -04:00
Joey Hess
8ad768fdba
todo 2024-05-14 13:58:35 -04:00
Joey Hess
6f1039900d
prevent using git-remote-annex with unsuitable special remote configs
I hope to support importtree=yes eventually, but it does not currently
work.

Added remote.<name>.allow-encrypted-gitrepo that needs to be set to
allow using it with encrypted git repos.

Note that even encryption=pubkey uses a cipher stored in the git repo
to encrypt the keys stored in the remote. While it would be possible to
not encrypt the GITBUNDLE and GITMANIFEST keys, and then allow using
encryption=pubkey, it doesn't currently work, and that would be a
complication that I doubt is worth it.
2024-05-14 13:52:20 -04:00
Joey Hess
e154c6da92
bug report (copied from email) 2024-05-13 17:11:34 -04:00
Joey Hess
8bf6dab615
update 2024-05-13 14:42:25 -04:00
Joey Hess
ddf05c271b
fix cloning from an annex:: remote with exporttree=yes
Updating the remote list needs the config to be written to the git-annex
branch, which was not done for good reasons. While it would be possible
to instead use Remote.List.remoteGen without writing to the branch, I
already have a plan to discard git-annex branch writes made by
git-remote-annex, so the simplest fix is to write the config to the
branch.

Sponsored-by: k0ld on Patreon
2024-05-13 14:35:17 -04:00
Joey Hess
552b000ef1
update 2024-05-13 14:30:18 -04:00
Joey Hess
34eae54ff9
git-remote-annex support exporttree=yes remotes
Put the annex objects in .git/annex/objects/ inside the export remote.
This way, when importing from the remote, they will be filtered out.

Note that, when importtree=yes, content identifiers are used, and this
means that pushing to a remote updates the git-annex branch. Urk.
Will need to try to prevent that later, but I already had a todo about
that for other reasons.

Untested!

Sponsored-By: Brock Spratlen on Patreon
2024-05-13 11:48:00 -04:00
Joey Hess
3f848564ac
refuse to fetch from a remote that has no manifest
Otherwise, it can be confusing to clone from a wrong url, since it fails
to download a manifest and so appears as if the remote exists but is empty.

Sponsored-by: Jack Hill on Patreon
2024-05-13 09:47:21 -04:00
Joey Hess
97b309b56e
extend manifest with keys to be deleted
This will eventually be used to recover from an interrupted fullPush
and drop the old bundle keys it was unable to delete.

Sponsored-by: Luke T. Shumaker on Patreon
2024-05-13 09:09:33 -04:00
Joey Hess
dfb09ad1ad
preparing to merge git-remote-annex
Update its todo with remaining items.

Add changelog entry.

Simplified internals document to no longer be notes to myself, but
target users who want to understand how the data is stored
and might want to extract these repos manually.

Sponsored-by: Kevin Mueller on Patreon
2024-05-10 15:06:15 -04:00
Joey Hess
ff5193c6ad
Merge branch 'master' into git-remote-annex 2024-05-10 14:20:36 -04:00
Joey Hess
947cf1c345
back to annex:: for git-remote-annex url
Oh, turns out git needs two colons to use a gitremote-helper. Ok.
2024-05-07 14:37:29 -04:00
Joey Hess
c7731cdbd9
add Backend.GitRemoteAnnex
Making GITBUNDLE be in the backend list allows those keys to be
hashed to verify, both when git-remote-annex downloads them, and by other
transfers and by git fsck.

GITMANIFEST is not in the backend list, because those keys will never be
stored in .git/annex/objects and can't be verified in any case.

This does mean that git-annex version will include GITBUNDLE in the list
of backends.

Also documented these in backends.mdwn

Sponsored-by: Kevin Mueller on Patreon
2024-05-07 13:54:08 -04:00
Joey Hess
483887591d
working toward git-remote-annex using a special remote
Not quite there yet.

Also, changed the format of GITBUNDLE keys to use only one '-'
after the UUID. A sha256 does not contain that character, so can just
split at the last one.

Amusingly, the sha256 will probably not actually be verified. A git
bundle contains its own checksums that git uses to verify it. And if
someone wanted to replace the content of a GITBUNDLE object, they
could just edit the manifest to use a new one whose sha256 does verify.

Sponsored-by: Nicholas Golder-Manning
2024-05-06 16:28:04 -04:00
Joey Hess
f4ba6e0c1e
add annex: url parser
Changed the format of the url to use annex: rather than annex::

The reason is that in the future, might want to support an url that
includes an uriAuthority part, eg:

annex://foo@example.com:42/358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/"

To parse that foo@example.com:42 as an uriAuthority it needs to start with
annex: rather than annex::

That would also need something to be done with uriAuthority, and also
the uriPath (the UUID) is prefixed with "/" in that example. So the
current parser won't handle that example currently. But this leaves the
possibility for expansion.

Sponsored-by: Joshua Antonishen on Patreon
2024-05-06 14:50:41 -04:00
Joey Hess
306ea42447
improve git-remote-annex docs
renamed the git config to something shorter too
2024-05-06 13:06:22 -04:00
Joey Hess
0be9f7a2c6
add UUID to GITBUNDLE
The UUID is included in the GITMANIFEST in order to allow a single
key/value store to be used to store several special remotes, without any
namespacing. In that situation though, if the same ref is pushed to two
special remotes, it will result in git bundles with the same content.

Which is ok, until a re-push happens to one of the special remote.
At that point, the old git bundle will be deleted. That will prevent
fetching it from the other special remote, where the re-push has not
happened.

Adding the UUID avoids this problem.
2024-05-06 12:51:44 -04:00
Joey Hess
a8cef2bf85
added man page for git-remote-annex
And document remote.<name>.git-remote-annex-max-bundles which will
configure it.

datalad-annex uses a similar url format, but with some enhancements.
See https://github.com/datalad/datalad-next/blob/main/datalad_next/gitremotes/datalad_annex.py

I added the UUID to the URL, because it is needed in order to pick out which
manifest file to use. The design allows for a single key/value store to have
several special remotes all stored in it, and so the manifest includes
the UUID in its name.

While datalad-annex allows datalad-annex::<url>?, and allows referencing
peices of the url in the parameters, needing the UUID prevents
git-remote-annex from supporting that syntax. And anyway, it is a
complication and I want to keep things simple for now.

Sponsored-by: unqueued on Patreon
2024-05-06 12:48:04 -04:00
Joey Hess
90b389369f
fix name of gitremote-helpers
The git man page has that name.
2024-05-06 12:07:05 -04:00
Joey Hess
4007d7234b
update 2024-05-06 11:36:43 -04:00
Joey Hess
5f61667f27
note on cycles 2024-05-02 12:22:04 -04:00
Joey Hess
4c538b0bb9
question 2024-05-02 11:15:35 -04:00
Joey Hess
883328b615
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-02 11:11:19 -04:00