Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2024-02-22 11:19:02 -04:00
commit e041d4b9b5
GPG key ID: DB12DB0FF05F8F38
7 changed files with 182 additions and 0 deletions


@ -0,0 +1,10 @@
[[!comment format=mdwn
username="matrss"
avatar="http://cdn.libravatar.org/avatar/59541f50d845e5f81aff06e88a38b9de"
subject="Multi-line string in WHEREIS-SUCCESS?"
date="2024-02-21T12:29:14Z"
content="""
Is it possible to somehow make `git annex whereis` show the response of the special remote to `WHEREIS` over multiple lines? Just including newlines obviously results in an error, since a newline ends the WHEREIS-SUCCESS message.

I am implementing a special remote whose data is fully described by what is essentially a JSON-encoded request to a third-party API, and I would like to show this JSON string pretty-printed over multiple lines in the whereis output instead of as a single line.
"""]]

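Since a newline terminates the `WHEREIS-SUCCESS` reply, one workaround on the special remote's side is to collapse the pretty-printed JSON onto a single line before sending it. A minimal sketch, assuming the remote is written as a shell script; `whereis_reply` and the sample JSON are hypothetical names and data, not part of git-annex:

```shell
#!/bin/sh
# Hypothetical helper: emit a WHEREIS-SUCCESS reply that fits on one line.
whereis_reply() {
    # Literal newlines cannot occur inside JSON strings, so deleting
    # CR/LF characters leaves the JSON document valid.
    one_line=$(printf '%s' "$1" | tr -d '\r\n')
    printf 'WHEREIS-SUCCESS %s\n' "$one_line"
}

# Sample data, made up for illustration only.
pretty='{
  "endpoint": "example",
  "id": 42
}'
whereis_reply "$pretty"
```

This keeps the reply within the one-line limit described above; it does not make `git annex whereis` print the value over several lines.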

@ -0,0 +1,5 @@
I'm currently trying to migrate a git-annex repository to a new machine, one of whose remotes is the one from [this issue](https://git-annex.branchable.com/bugs/apparent_hang_in_git-annex-smudge/). I determined that the root cause there seemed to lie in git rather than git-annex; in particular, any whole-repository operation would take multiple days to execute, for unclear reasons. Pulling the commit data to a new repository seems to fix this.

I'm now trying to move all the annexed data from the original, broken remote to the new one. The default option here would be `git annex move`. However, when I run this, it apparently does some git operation for every key moved, each taking hours to days; there are tens of thousands of keys, so this is obviously unworkable.

Is there a way to simply mass-move the annexed data into the new repo, via rsync or similar, and then update the new repo's metadata all at once? The state of the old repository does not matter, since I intend to discard it as soon as this migration is done.


@ -0,0 +1,50 @@
[[!comment format=mdwn
username="lell"
avatar="http://cdn.libravatar.org/avatar/4c4138a71d069e290240a3a12367fabe"
subject="Using fsck is an option?"
date="2024-02-22T07:47:36Z"
content="""
Hi, this worked for me:
```
~$ mkdir ~/test/gta
~$ cd ~/test/gta
~/test/gta$ git init
Initialized empty Git repository in /home/lell/test/gta/.git/
~/test/gta$ git annex init
~/test/gta$ git annex add file
~/test/gta$ git commit -m test
~/test/gta$ cd ..
~/test$ git clone gta gta2
~/test$ cd gta2
~/test/gta2$ mkdir -p .git/annex/objects
~/test/gta2$ cp -r ../gta/.git/annex/objects/* .git/annex/objects/
~/test/gta2$ git annex list # it does not yet know that the file is now also here
here
|origin
||web
|||bittorrent
||||
_X__ file
~/test/gta2$ git annex fsck
fsck file (fixing location log) (checksum...) ok
(recording state in git...)
~/test/gta2$ git annex list # now it knows
here
|origin
||web
|||bittorrent
||||
XX__ file
```
"""]]


@ -0,0 +1,8 @@
[[!comment format=mdwn
username="lell"
avatar="http://cdn.libravatar.org/avatar/4c4138a71d069e290240a3a12367fabe"
subject="comment 2"
date="2024-02-22T07:49:58Z"
content="""
In the example above I forgot the line that creates the file \"file\" in the repository gta. It can be created with, for example, `echo test > file` before running `git annex add file`.
"""]]

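The copy-then-fsck steps from the transcript above could be wrapped in a small helper. This is only a sketch: `copy_annex_objects` and its two arguments are hypothetical names, it uses `rsync -a` in place of `cp -r`, and it assumes both paths are local git-annex repositories that already share the same git history.

```shell
#!/bin/sh
# Hypothetical helper: copy annexed objects from one local repository
# into another, then let fsck record that the content is now present.
copy_annex_objects() {
    src=$1   # old repository (assumed to hold the annexed content)
    dst=$2   # new repository (assumed to be a clone of the same history)

    mkdir -p "$dst/.git/annex/objects" || return 1

    # Trailing slashes make rsync copy the directory contents,
    # not the objects directory itself.
    rsync -a "$src/.git/annex/objects/" "$dst/.git/annex/objects/" || return 1

    # As in the transcript, fsck notices the new content and fixes
    # the location log.
    (cd "$dst" && git annex fsck)
}
```

It would be called as `copy_annex_objects ~/test/gta ~/test/gta2`. With tens of thousands of keys the checksum pass in fsck may itself take a while, so trying it on a small subset first seems prudent.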
doc/submodules/bug.mdwn Normal file

@ -0,0 +1,73 @@
This is an enhancement request: handle submodules so that data can be managed together with its associated projects.
I would like `git-annex` to detect submodule paths that were changed on disk by `mv` or a file explorer.

If the user runs the `git-annex-assist daemon` or the `git-annex-assist` command directly after an `mv` command, the submodules end up completely broken.
Currently, the workaround is to use `git-mv` on each submodule manually.

I wrote a shell script to test this.
```shell
#!/bin/bash
# Test script for changing submodule paths.
# set -e
USE_GIT_MV=false # setting USE_GIT_MV=true works correctly
cd /tmp/
mkdir -p test_sub/{archive/projects,projects/2023_01_personal_some_cool_project,resources}
cd test_sub
git init
git annex init
git annex version
cd projects/2023_01_personal_some_cool_project
echo NOTE: Add some data and sub-projects for testing
touch README.md 01_dataset_lists.csv 09_reports.md
git submodule add https://github.com/Lykos153/git-annex-remote-googledrive.git
git submodule add https://github.com/alpernebbi/git-annex-adapter.git
git submodule status # check it
git annex assist
echo
echo "NOTE: The projects directory should be renamed to 01_Projects for sort order."
cd /tmp/test_sub
if $USE_GIT_MV; then
    git mv projects 01_Projects
else
    # NOTE: A plain rename breaks the submodules. The directory depth stays the same.
    mv projects 01_Projects
    (
        cd 01_Projects/2023_01_personal_some_cool_project/git-annex-adapter
        git status # it shows 'No such file or directory'
    )
fi
git submodule status # check it
git annex assist
echo
echo "NOTE: I want to rename a submodule that I only keep as a reference for my work."
cd /tmp/test_sub/01_Projects/2023_01_personal_some_cool_project
if $USE_GIT_MV; then
    git mv git-annex-adapter ref_sample_adapter_code
else
    # NOTE: A plain rename breaks the submodules. The directory depth stays the same.
    mv git-annex-adapter ref_sample_adapter_code
fi
git submodule status # check it
git annex assist
echo
echo "NOTE: Now, I want to archive my old projects."
cd /tmp/test_sub
if $USE_GIT_MV; then
    git mv 01_Projects/2023_01_personal_some_cool_project archive/projects/2023_01_personal_some_cool_project
else
    # NOTE: A plain rename breaks the submodules. The directory depth changes here.
    mv 01_Projects/2023_01_personal_some_cool_project archive/projects/2023_01_personal_some_cool_project
fi
git submodule status # check it
git annex assist
echo
echo test done
```
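
To see why a plain `mv` leaves a submodule unusable, it can help to look at the two paths that tie a submodule worktree to its repository data. A sketch, assuming the usual layout in which the submodule directory contains a `.git` file with a `gitdir:` line and the module's config stores `core.worktree`; `show_submodule_links` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical diagnostic: print the two links between a submodule
# worktree and its git directory.
show_submodule_links() {
    sub=$1

    # Assumption: "$sub/.git" is a file containing "gitdir: <path>".
    gitdir=$(sed -n 's/^gitdir: //p' "$sub/.git") || return 1
    echo "gitdir: $gitdir"

    # A relative gitdir is interpreted relative to the submodule directory.
    case "$gitdir" in
        /*) moddir=$gitdir ;;
        *)  moddir="$sub/$gitdir" ;;
    esac

    # The module's config records the way back to the worktree.
    printf 'core.worktree: '
    git config -f "$moddir/config" core.worktree || echo "(not set)"
}
```

Run against a submodule after a plain `mv` (for example `show_submodule_links 01_Projects/2023_01_personal_some_cool_project/git-annex-adapter`), comparing the two printed paths with the directory's new location shows which link no longer resolves.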


@ -0,0 +1,12 @@
[[!comment format=mdwn
username="TTTTAAAx"
avatar="http://cdn.libravatar.org/avatar/9edd4b69b9f9fc9b8c1cb8ecd03902d5"
subject="Datalad cannot detect submodule path changed on disk"
date="2024-02-22T08:34:47Z"
content="""
> Sounds like you might want to use datalad, which is built around git annex and where submodules are a first-class citizen.

Datalad handles submodules as subdatasets and adds Python layers on top of them to manage datasets (e.g. deduplicating submodules). But, like git, it doesn't detect that a submodule's path has changed on disk.
So, sadly, it doesn't meet my needs.
"""]]


@ -0,0 +1,24 @@
[[!comment format=mdwn
username="beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec"
nickname="beryllium"
avatar="http://cdn.libravatar.org/avatar/62b67d68e918b381e7e9dd6a96c16137"
subject="Colocating git-annex and git-lfs"
date="2024-02-22T02:55:25Z"
content="""
Is it possible to add git-lfs capabilities to a git-annex repository without using a special remote?

I guess what I want to know is whether there are any reasonable instructions for grafting the hooks so that this is possible:

    $ git init
    $ git-lfs install
    $ git-annex init

And so that you can alternate between commands like the ones below:

    $ git-lfs track \"*.exif_thumbnail.*\"
    $ git-annex add IMG_0001.jpg
    $ git add IMG_0001.exif_thumbnail.jpg

Obviously this betrays the scenario: extracting thumbnails from the EXIF header and storing them alongside the images as another form of metadata. If there's a better workflow for this, that would be appreciated too.
"""]]