git-annex/doc/internals/git-remote-annex.mdwn

This adds two new object types to git-annex, GITMANIFEST and GITBUNDLE.
proof of concept for push to git bundles with MANIFEST

This is a shell script, so not final code, and it does not use git-annex at all, but it shows how to push to git bundles, listed in a MANIFEST, the same as the git-remote-annex program will eventually do.

While developing this, I realized that the design needed to be changed slightly regarding where refs are stored. Since a push can delete a ref from a remote, storing each newly pushed ref in a bundle won't work, because deleting a ref would then entail deleting all old bundles and re-uploading from scratch.

So instead, only the refs in the last bundle listed in the MANIFEST are the active refs. Any refs in prior bundles are just old refs that were stored previously (a reflog as it were).

That means that, in a situation where two different people are pushing to the same special remote from different repos, whoever pushes last wins. Any refs pushed by the other person earlier will be ignored. This may not be desirable, and git-annex might be able to use the git-annex branch to detect such situations and rescue the refs that got lost. Even without such a recovery process though, the refs that the other person thought they pushed will be preserved in their refs/namespaces/mine, so a pull followed by a push will generally resolve the situation.

Note that the use of refs/namespaces/mine in the bundle is not really desirable, and it might be worth making a local clone of the repo in order to set up the refs that will be put in the bundle, which seems to be the only way to avoid needing that. But it does need to maintain the refs/namespaces/mine/ in the git repo in order to remember what refs have been pushed to the remote before, in order to include them in the next bundle pushed. A name that includes the remote uuid will be needed in the final implementation.

Anyway, this shell script seems to fully work, including incremental pushing, force pushing, and pushes that delete refs.

Sponsored-by: Brett Eisenberg on Patreon
2024-04-25 20:38:34 +00:00
GITMANIFEST--$UUID is the manifest for a git repository stored in the
git-annex repository with that UUID.

GITBUNDLE--sha256 is a git bundle.
# format of the manifest file
An ordered list of bundle keys, one per line.
The last bundle in the list provides all refs that are currently stored in
the repository. The bundles before it in the list can incrementally provide
objects, but not refs.
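
As a minimal sketch of this format, the manifest can be split into the bundles that only contribute objects and the final bundle that supplies the refs. The helper name `parse_manifest` is hypothetical and not part of git-annex, and skipping blank lines is an assumption that the format description above does not state.

```python
from typing import List, Optional, Tuple


def parse_manifest(text: str) -> Tuple[List[str], Optional[str]]:
    """Split manifest text into (object-only bundle keys, ref-providing key).

    Hypothetical helper, not git-annex code. Assumes one bundle key per
    line and skips blank lines (an assumption, not part of the format).
    """
    keys = [line.strip() for line in text.splitlines() if line.strip()]
    if not keys:
        # An empty manifest lists no bundles, so there are no refs either.
        return [], None
    return keys[:-1], keys[-1]


older, active = parse_manifest("GITBUNDLE--aaa\nGITBUNDLE--bbb\nGITBUNDLE--ccc\n")
print(older)   # bundles that can only contribute objects
print(active)  # the bundle whose refs are the current refs
```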
# fetching
1. download GITMANIFEST for the uuid of the special remote
2. download each listed GITBUNDLE object that we don't have
3. `git bundle unbundle` each bundle in order
4. `git fetch` from the last bundle listed in the manifest
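
The bookkeeping in steps 2 to 4 could be sketched roughly as follows. This only plans the work and runs no git commands or downloads; `plan_fetch` and the `have` set of already-downloaded bundle keys are hypothetical names used for illustration.

```python
from typing import Iterable, List, Optional, Set, Tuple


def plan_fetch(manifest_keys: Iterable[str],
               have: Set[str]) -> Tuple[List[str], Optional[str]]:
    """Plan a fetch from a manifest (hypothetical helper).

    Returns the bundle keys that still need downloading, in manifest
    order, and the key of the bundle whose refs should be fetched.
    """
    keys = list(manifest_keys)
    # Step 2: only bundles that are not already present are downloaded.
    to_download = [k for k in keys if k not in have]
    # Step 4: refs come only from the last bundle listed.
    refs_from = keys[-1] if keys else None
    return to_download, refs_from


manifest = ["GITBUNDLE--aaa", "GITBUNDLE--bbb", "GITBUNDLE--ccc"]
print(plan_fetch(manifest, have={"GITBUNDLE--aaa"}))
```

Keeping the download list in manifest order matters for step 3, since a later incremental bundle may depend on objects provided by an earlier one.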
# pushing (incrementally)
1. create a git bundle containing all refs that will be stored in the
repository, and the objects added since the previously pushed refs
2. hash the bundle to calculate its GITBUNDLE key
3. upload GITBUNDLE object
4. download current manifest
5. append GITBUNDLE key to manifest
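
Steps 2 and 5 might look something like this sketch. It assumes the key is literally `GITBUNDLE--` followed by the hex SHA-256 of the bundle file, as described at the top of this page; the function names are hypothetical, and creating the bundle, uploading it, and downloading the manifest are left out.

```python
import hashlib


def bundle_key(bundle_bytes: bytes) -> str:
    """Step 2: derive a key from the bundle content (assumed key layout)."""
    return "GITBUNDLE--" + hashlib.sha256(bundle_bytes).hexdigest()


def append_to_manifest(current_manifest: str, key: str) -> str:
    """Step 5: add the new bundle key as the last line of the manifest.

    Hypothetical helper. Blank lines are dropped, and the result always
    ends with a newline.
    """
    keys = [line.strip() for line in current_manifest.splitlines() if line.strip()]
    keys.append(key)
    return "\n".join(keys) + "\n"


key = bundle_key(b"placeholder bytes standing in for a real bundle file")
print(append_to_manifest("GITBUNDLE--aaa\n", key), end="")
```

Because the new key goes last, its refs become the active refs, and the earlier bundles keep supplying the objects it builds on.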
# pushing (replacing incrementals with single bundle)
1. create a git bundle containing all refs stored in the repository, and all
objects
2. hash the bundle to calculate its GITBUNDLE key
3. upload GITBUNDLE object
4. download current manifest
5. remove all old GITBUNDLES from the manifest, and add the new GITBUNDLE
at the end. Note that the current manifest may contain GITBUNDLES that
were not in the last fetched manifest (for example, ones pushed from
another repository in the meantime). Those must be preserved, with the
new GITBUNDLE appended after them.
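
The manifest rewrite in step 5 can be sketched as below. The function `replace_bundles` and its list-of-keys representation are assumptions made for illustration; it only computes the new manifest contents and does not upload anything or delete old bundles.

```python
from typing import List


def replace_bundles(last_fetched: List[str],
                    current: List[str],
                    new_key: str) -> List[str]:
    """Compute the manifest after a full (non-incremental) push.

    Hypothetical helper. Bundles known from the last fetched manifest are
    superseded by the new full bundle and dropped. Bundles that appeared
    in the current manifest since then are preserved, in order.
    """
    known = set(last_fetched)
    preserved = [k for k in current if k not in known and k != new_key]
    # The new bundle goes last, so its refs become the active refs.
    return preserved + [new_key]


# Another repository appended GITBUNDLE--ccc after our last fetch.
print(replace_bundles(
    last_fetched=["GITBUNDLE--aaa", "GITBUNDLE--bbb"],
    current=["GITBUNDLE--aaa", "GITBUNDLE--bbb", "GITBUNDLE--ccc"],
    new_key="GITBUNDLE--new",
))
```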
# multiple GITMANIFEST files
Usually there will only be one per special remote, but it's possible for
multiple special remotes to point to the same object storage, and if so
multiple GITMANIFEST objects can be stored.

It follows that the UUID of the special remote has to be included in the
annex:// URI, to know which GITMANIFEST to use when cloning from it.