Merge branch 'master' of ssh://git-annex.branchable.com
commit
622979432b
2 changed files with 77 additions and 0 deletions
|
@ -0,0 +1,57 @@
|
|||
[[!comment format=mdwn
|
||||
username="Spencer"
|
||||
avatar="http://cdn.libravatar.org/avatar/2e0829f36a68480155e09d0883794a55"
|
||||
subject="I need help with this too (c.f. submodule refactor)"
|
||||
date="2025-05-29T03:42:42Z"
|
||||
content="""
|
||||
I do this quite often because I use a monorepo approach with regular refactoring of subtrees into their own submodules. I have yet to find a bulletproof way to do this on the git-annex side.
|
||||
|
||||
The first step is as simple as running `git annex unannex` in `A`, optionally with `--include \"*\"` if pattern matching is easier.
|
||||
|
||||
- On the `git` side, this logs the files as deleted from the main repo (`src`, let's call her). This is ideal so that you have a record for yourself (with a descriptive commit message) of where you've moved your files to.
|
||||
- On the `git-annex` side (once you commit), the file data will eventually become \"unused\". You'll have to do some combination of `git annex push` and `git annex sync [--cleanup]` to ensure that no branch still references those files (including remote branches and `synced/*` branches).
|
||||
|
||||
Now the question is: how do we get the data into the new repo (`dst`) and safely drop from `src`?
|
||||
|
||||
- You could add `dst` as a remote of `src` and pull only `dst`'s `git-annex` branch, which (after moving, re-annexing, and committing the unannexed files to `dst`) now shows as having a copy of those files. (**Warning:** this has bad side-effects).
|
||||
- You could do the opposite but use `dst` to move any (used) files from `src` (**Warning:** this has bad side-effects).
|
||||
- You could add `dst` as a remote and `move` unused files over (requires a clean unused stack already and having to do the push/sync stuff correctly and fully before the files can be released)
|
||||
- You could do the opposite and \"copy\" the files *to* `src` first, *then* move them over to `dst`. (This is required because `dst` has no record of `src` holding any keys. I find it logical, albeit sad, that `git-annex` can't dynamically poll local repos' annexes for file content.)
|
||||
- You could forcibly drop the data either by individual key or once it eventually becomes unused (super unsafe and sad)
|
||||
|
||||
### Conclusions
|
||||
|
||||
- Keep a clean unused stack (`git annex unused` gives nothing) as much as you can, and clean it out before testing out any sort of move/drop operations like this.
|
||||
- Option 4 is the best so far (`gx` is shorthand for `git annex`). Following the initial step of `gx unannex` in `src`:
|
||||
- Add `src` as a remote in `dst`, `mv` the files into `dst`, `gx add` them in `dst`, `gx copy` them from `dst` back to `src`, then run `gx move --from <src>`
|
||||
- This will only move the files known by `dst`. If it so happens that one of these files is actually duplicate data with something you want to also be in `src`, this *will* drop it and leave no record in `src` of where it went (besides your `git` commit message).
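
The Option 4 steps above could be scripted roughly as follows. This is only a sketch under assumptions: it is run from inside `dst`, the remote pointing at `src` is named by the first argument, and the files have already been unannexed in `src` and `mv`'d into `dst`. The `run` and `option4` helper names and the `DRY_RUN` variable are made up for illustration.

```shell
#!/bin/sh
# Sketch of Option 4, to be run from inside the dst repository.
# Assumed: the files were already unannexed in src and moved into dst.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# option4 REMOTE FILE...
# REMOTE is the name dst uses for its remote pointing at src (hypothetical).
option4() {
    remote=$1
    shift
    run git annex add "$@"                    # re-annex the files in dst
    run git annex copy --to "$remote" "$@"    # record a copy in src
    run git annex move --from "$remote" "$@"  # dst keeps the content, src drops it
}
```

With `DRY_RUN=1` set, `option4 src some/file` only prints the three `git annex` commands, which seems prudent given the duplicate-data side effect described above.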
|
||||
|
||||
As described, there are still side effects with Option 4, but it's so far the best option I've devised.
|
||||
Oh, and if you want to keep `src` around as a remote on `dst` to e.g. remind yourself of various relations, make sure you configure it in `.git/config` with:
|
||||
|
||||
- `remote.<name>.annex-sync=false`. This skips it when you do a `git annex sync`
|
||||
- Delete the `remote.<name>.fetch` refspec, or add `remote.<name>.skipFetchAll=true`. This ensures `git fetch` doesn't fetch all the branches and unrelated objects
|
||||
- (pray there are no more side-effects)
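
For reference, the settings above can be applied with `git config` instead of editing `.git/config` by hand. A minimal sketch, assuming the per-remote keys are `remote.<name>.annex-sync` and `remote.<name>.skipFetchAll`; the `quiet_remote` function name is hypothetical.

```shell
#!/bin/sh
# quiet_remote NAME
# Keep the remote NAME configured in the current repository, but leave it
# out of routine syncing and fetching.
quiet_remote() {
    # git-annex: do not sync with this remote during `git annex sync`
    git config "remote.$1.annex-sync" false
    # git: skip this remote by default when updating via `git fetch`
    # or `git remote update`
    git config "remote.$1.skipFetchAll" true
    # Alternative to skipFetchAll: remove the fetch refspec entirely
    # git config --unset-all "remote.$1.fetch"
}
```

Calling `quiet_remote src` inside `dst` would cover the first two bullets; the praying still has to be done by hand.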
|
||||
|
||||
Now, what happens if a side-effect does happen and it looks like you lost some content and don't know where it went? `git annex whereis` is no help.
|
||||
Instead, you have to extract the key from the now broken symlink and run `find <> -type f -iname \"<KEY>\"`. Easy enough but kind of scary when it happens to you.
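
That lookup can be wrapped in a small helper. A sketch, assuming the broken symlink still points at a path whose last component is the key (as annex symlinks normally do); `find_annex_key` and its arguments are hypothetical names, and the search roots stand in for the `<>` placeholder above.

```shell
#!/bin/sh
# find_annex_key SYMLINK SEARCH_ROOT...
# Print every regular file under the search roots whose name equals the
# annex key that SYMLINK points at.
find_annex_key() {
    link=$1
    shift
    # Annex symlinks end in .../<KEY>/<KEY>; readlink works on broken links too.
    key=$(basename "$(readlink "$link")")
    # -name instead of -iname here, since keys are case-sensitive.
    find "$@" -type f -name "$key"
}
```

For example, `find_annex_key lost-file ~/repos` would list every repo under `~/repos` whose annex still holds the content.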
|
||||
|
||||
### Side-Effects of Option 1+2: `git-annex` synchronization
|
||||
|
||||
*DON'T DEAD OPEN INSIDE*
|
||||
|
||||
While this is currently the only way to propagate annex key information, it has bad side-effects:
|
||||
|
||||
- Remotes and known repos start to clutter whichever absorbs the others' `git-annex` branch. For me this is a no-go because I have redundant remotes (an exporttree called `dropbox` in my case)
|
||||
- If you decide to `dead` these remotes or repos and by coincidence the `git-annex` branch is later absorbed in the other direction, chaos ensues (`dead` is propagated, remote annex key history is killed: especially gross for export/importtrees)
|
||||
- Best way to avoid this is to `dead`, `forget --drop-dead` then `semitrust UUID`. Many steps, potentially undefined condition. Gross.
|
||||
|
||||
## Potential Feature Requests
|
||||
|
||||
Ideally, I wish `git-annex` could intelligently scan another repo's annex and populate information about what keys it has, based simply on which keys are actually present in `.git/annex/objects`. This pulls in the information we care about without cluttering it with additional information relevant only to each respective repo.
|
||||
Then, presuming you've set up a remote (`dst`) pointing to this repo (`src`) and run `git annex info`, `src` should have a list of the keys that are inside `dst`; `gx whereis` from `src` will identify the keys inside `dst`, and `drop` will happily proceed.
|
||||
|
||||
- Maybe there could be something called an `acquaintance` repo that is not allowed to be synced, pulled, fetched, or pushed to.
|
||||
- Acquaintances are semitrusted because they're still annex-controlled.
|
||||
- On removing an acquaintance repo, and running `gx forget`, the list of keys is wiped.
|
||||
"""]]
|
|
@ -0,0 +1,20 @@
|
|||
[[!comment format=mdwn
|
||||
username="guez@e17c318e09fc77b4a5be4cd330364e3a41a96971"
|
||||
nickname="guez"
|
||||
avatar="http://cdn.libravatar.org/avatar/ffec09075c5b5cd47832649a306d68c3"
|
||||
subject="Not enough information on special remotes"
|
||||
date="2025-05-28T21:58:23Z"
|
||||
content="""
|
||||
You say that the command shows the URL used for a WebDAV remote, but this no longer seems to be the case:
|
||||
|
||||
```
|
||||
$ git annex info sdrive
|
||||
uuid: d17d5946-d126-4a0e-b6c1-232fb34fb461
|
||||
description: sdrive
|
||||
trust: semitrusted
|
||||
remote annex keys: 1
|
||||
remote annex size: 249.11 kilobytes
|
||||
```
|
||||
|
||||
I can get a list of special remotes with `git annex enableremote` but how can I get a more detailed list, with all the information on each special remote: the type, the configuration options (encryption or not, etc.), the URLs?
|
||||
"""]]
|