
[[!comment format=mdwn
username="tapesafer"
avatar="http://cdn.libravatar.org/avatar/8a62b25ea58309a6e15cac10a5c33f1d"
subject="numcopies & force-trusting is ignored by fsck on readonly directory remotes?"
date="2024-09-04T14:50:16Z"
content="""
I have old read-only backup media, laid out something like this:

- `tapeA1/apples.txt`
- `tapeA2/apples.txt`
- `tapeB1/earth.svg`
- `tapeB2/earth.svg`

I use git-annex directory special remotes to navigate the directory tree that lives on those media (e.g. to decide whether I need to dig out a tape, and which one, to copy a file I need).
I added the remotes like so (they are too big to import with content):
```
git annex initremote tapeA1 type=directory directory=/tapes/tapeA1 encryption=none importtree=yes
git annex import master:tapeA1 --from tapeA1 --no-content
git annex merge --allow-unrelated-histories tapeA1/main
```
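With many tapes, repeating these three commands by hand gets tedious. Here is a rough sketch of a loop, assuming every tape is mounted under `/tapes/<name>` and that the branch names match the commands above. The function name `tape_setup_cmds` is only illustrative. It prints the commands rather than running them, so they can be reviewed first:

```shell
#!/bin/sh
# Print the setup commands for one tape remote (nothing is executed).
# Assumption: the tape is mounted at /tapes/<name>; branch names are
# copied from the commands above.
tape_setup_cmds() {
    name=$1
    printf 'git annex initremote %s type=directory directory=/tapes/%s encryption=none importtree=yes\n' "$name" "$name"
    printf 'git annex import master:%s --from %s --no-content\n' "$name" "$name"
    printf 'git annex merge --allow-unrelated-histories %s/main\n' "$name"
}

# Emit the commands for every tape in the example layout.
for tape in tapeA1 tapeA2 tapeB1 tapeB2; do
    tape_setup_cmds "$tape"
done
```

Once the printed commands look right, the output could be piped to `sh` from inside the repository.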
At some point I may buy new hardware and recreate those backup media as proper git-annex remotes, but wouldn't it be great to keep the existing backups as long as they show no sign of bitrot and together hold enough copies?
However, git-annex fsck behaves unexpectedly: it seems I cannot force-trust these remotes, nor does `--numcopies=0 --mincopies=0` have the desired effect.
Concretely, when calling `git annex fsck --from=tapeA1 --numcopies=0 --mincopies=0 --trust=tapeA1 --force`,
for every file that is still intact on tapeA1, git-annex fsck reports a failure as follows:
```
fsck tapeA1/apples.txt
Only these untrusted locations may have copies of tapeA1/apples.txt
abc-def-ghi -- [tapeA1]
Back it up to trusted locations with git-annex copy.
failed
```
even though I'd be happy to (semi)trust tapeA1 or to accept no copies whatsoever. So fsck seems to ignore `--trust=tapeA1 --force` and/or `--numcopies=0 --mincopies=0`. Aren't these common git-annex options that should work for fsck too?
Ideally, I could (semi)trust my read-only tape remotes (this should probably require `--force`, since it could lead to data loss with ordinary directory remotes). Then I could use git-annex not only to index those tapes but also to monitor their health via fsck, replacing tapes over the years as they show signs of corruption.
As for the corruption, I emulated bitrot on a test directory remote, which then leads to an fsck failure as follows:
```
fsck tapeB2/earth.svg
verification of content failed
(checksum...)
tapeB2/earth.svg: Bad file content; failed to drop fromtapeB2: dropping content from this remote is not supported because it is configured with importtree=yes
```
This suffices to detect tapes that should be replaced, and it's kinda expected that files cannot be dropped.
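For periodic health checks, the fsck output could be scanned for the `Bad file content` message. This is a minimal sketch, assuming that wording stays stable across git-annex versions (parsing human-readable output is fragile). `count_bad_content` is a made-up helper name:

```shell
#!/bin/sh
# Count lines on stdin that report corrupted content.
# Assumption: fsck marks corruption with the text "Bad file content".
count_bad_content() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is neutralised here.
    grep -c 'Bad file content' || true
}

# Example use (commented out; needs git-annex and a repository):
# for tape in tapeB1 tapeB2; do
#     bad=$(git annex fsck --from="$tape" 2>&1 | count_bad_content)
#     echo "$tape: $bad file(s) with bad content"
# done
```

Counting matches instead of relying on the exit status of fsck sidesteps the untrusted-location failures described above, which would otherwise mark every file as failed.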
Somehow fsck does not work as I would expect -- am I misunderstanding the numcopies/mincopies arguments here? Is there really no way to force-trust a directory remote, which to me seems appropriate in this case? Is there another way to achieve what I have in mind with git-annex?
Thanks for this great piece of software! I also use the assistant in another day-to-day use case and it's simply great!
"""]]