the author of this forum post deleted it, so remove comments

This commit is contained in:
Joey Hess 2021-01-05 11:23:31 -04:00
parent 56df4030c3
commit f3312baa2c
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
13 changed files with 0 additions and 208 deletions

@ -1,12 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 10"
date="2020-12-31T22:45:03Z"
content="""
You will be unsurprised to hear that what you suggested worked. Not sure what helped other than me cleaning up my working tree and doing a solid git annex add .; git annex sync. I also removed annex.thin since it's evidently not helping me.
What got me here was basically running through the \"splitting a repo\" process: making a new git repo, doing a cp -rl ./.git/annex/objects to the new repo, and then running various tests on it. I just want to make sure I don't step on my own feet here.
Thanks a ton.
"""]]

@ -1,12 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 11"
date="2021-01-01T04:08:43Z"
content="""
couple of final notes:
* ```--reflog=always``` isn't a cp option; it's ```--reflink=always```, and I am a moron.
* That same option on btrfs is the bomb: all of the advantages of hardlinks without the disadvantages.
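
For readers who have not used the option, here is a minimal sketch of how a reflink-style copy differs from a hard link. It assumes GNU coreutils (`cp --reflink`, `stat -c`); the scratch directory and file names are made up for illustration. With `--reflink=auto`, `cp` quietly falls back to an ordinary copy on filesystems that cannot share extents, whereas `--reflink=always` fails there.

```shell
# Scratch directory; every name below is hypothetical.
demo=$(mktemp -d)
printf 'annexed content\n' > "$demo/object"

# Hard link: a second name for the same inode. Writing through either
# name changes the data seen through the other.
ln "$demo/object" "$demo/hardlink"

# Reflink where the filesystem supports it (e.g. btrfs), plain copy
# otherwise. Either way the result is a separate inode.
cp --reflink=auto "$demo/object" "$demo/copy"

links_object=$(stat -c %h "$demo/object")   # hard link count of the original
links_copy=$(stat -c %h "$demo/copy")       # hard link count of the copy
echo "object: $links_object link(s), copy: $links_copy link(s)"
```

The copy keeps a link count of 1 because it is an independent file, which is why modifying it cannot corrupt the original the way a shared hard link can.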
"""]]

@ -1,53 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""parent post is rife with incorrect and misleading statements"""
date="2021-01-04T20:03:19Z"
content="""
AFAIK there are no circumstances where git-annex will lose data unless you
use the --force flag, which is clearly documented as allowing data loss.
If you have a case where it does, *file a bug report*.
git-annex uninit does *not* delete .git/annex/objects if there are
any objects in there that are not used by files in the repo, so it can't
have behaved as you claim it did, at least as far as I can tell. Here is an
example of it not deleting data, in a situation like the one you claimed
caused data loss:

    joey@darkstar:/tmp/demo>git annex add foo
    joey@darkstar:/tmp/demo>git commit -m add
    joey@darkstar:/tmp/demo>git rm foo
    joey@darkstar:/tmp/demo>git annex uninit
    git-annex: Not fully uninitialized
    Some annexed data is still left in .git/annex/objects/
    This may include deleted files, or old versions of modified files.
    If you don't care about preserving the data, just delete the
    directory.
    Or, you can move it to another location, in case it turns out
    something in there is important.
    Or, you can run `git annex unused` followed by `git annex dropunused`
    to remove data that is not used by any tag or branch, which might
    take care of all the data.
    Then run `git annex uninit` again to finish.
    joey@darkstar:/tmp/demo>find .git/annex/objects/ -type f
    .git/annex/objects/Zj/zZ/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83

I document changes before I implement them, and this website is updated
on every push of changes to git-annex. While some tip somewhere may be
out of date, the one you mentioned does not appear to be. It looks
like you misunderstood something about it.
A drive in a safe full of files with and without git-annex has identical
durability. Using git-annex does *not* cause files to be less accessible or
add significant roadblocks to accessing them no matter what problems might
befall that drive. Worst case, fsck of a corrupted filesystem on that drive
will rescue files to lost+found with the git-annex key name and not the
original filename. This is easy to recover from though, using `git-annex
reinject --known`. Which also, conveniently, works if fsck on a badly
damaged drive restores the file to lost+found using a bare inode number.
Which, if you're not using git-annex, puts you in a world of hurt to
determine what file that originally was.
"""]]

@ -1,13 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 13"
date="2021-01-05T05:55:13Z"
content="""
Interesting. well I'll have to test that again. thanks for taking the time to read this & respond and thank you for making an awesome piece of software.
"""]]

@ -1,8 +0,0 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 1"
date="2020-12-31T17:32:46Z"
content="""
I've now seen multiple people claiming that the documentation is out of date, but couldn't confirm it myself. Can you provide an example?
"""]]

@ -1,14 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 2"
date="2020-12-31T18:55:29Z"
content="""
The most recent example I've run across is the use of
annex.thin=true
in the link here: https://git-annex.branchable.com/tips/unlocked_files/
It didn't result in a hardlink being made of the content for either git annex unlock or git annex unannex.
Instead I ended up getting the same functionality by using --fast.
"""]]

@ -1,9 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 3"
date="2020-12-31T19:34:56Z"
content="""
right now I am driving myself crazy trying to understand why I have objects that *nothing is pointing to*, yet git annex unused fails to report them. these objects report 1 hardlink and they are from a migrated backend. I'll try git annex forget, but I really don't understand what is keeping these objects from being reported as unused.
"""]]

@ -1,9 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 5"
date="2020-12-31T20:34:55Z"
content="""
I am digging into this further, and it looks like git annex uses cp --reflog=auto (confirmed with filefrag -v). But even if the object from the old backend isn't taking up space, it's still frustrating that I can't figure out why git annex is keeping old files around and not reporting them via git annex unused.
"""]]

@ -1,15 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 5"
date="2020-12-31T19:40:22Z"
content="""
https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
according to the link above it should be hardlinked to the new key for the new backend, but this isn't the case. this is on btrfs btw.
this is a test repo with no remotes as another data point.
also I migrated from SHA256E to SHA256.
I tried git annex forget --force; git annex sync; git annex unused, still it isn't showing the objects as unused.
"""]]

@ -1,21 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 6"
date="2020-12-31T20:51:12Z"
content="""
after (re)reading the following:
https://git-annex.branchable.com/forum/switching_backends/
https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
I confirmed again that git annex sync was re-run; there are no remotes, so that isn't a thing here. I checked out each git branch and did a
```find ./???/ -lname '*c0ade___this_is_a_long_hash___566fd3*'```
and nothing in any branch points to this old backend key.
so I am both stymied and befuddled... any tips are appreciated.
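
The symlink check described above can be reproduced in isolation. This is only a sketch, assuming GNU `find` (for `-lname`); the directory layout and the key name `SHA256E-s30--example` are placeholders, not real annex data.

```shell
# Scratch tree with one symlink that points at a (fake) annex key
# and one unrelated symlink.
tree=$(mktemp -d)
mkdir -p "$tree/docs"
ln -s ../.git/annex/objects/aa/bb/SHA256E-s30--example/SHA256E-s30--example "$tree/docs/report.txt"
ln -s /dev/null "$tree/docs/other"

# -lname matches the symlink *target* against a glob, so this lists every
# symlink in the tree whose target mentions the key.
matches=$(find "$tree" -lname '*SHA256E-s30--example*')
echo "$matches"
```

A search like this only covers the tree that is currently checked out, so it has to be repeated per branch, and it says nothing about tags or older commits that may still reference the key.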
"""]]

@ -1,12 +0,0 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 7"
date="2020-12-31T21:46:33Z"
content="""
Hmm, you seem to have mixed a lot of things up here: <br>
1. You are not supposed to use `git annex unannex` to unlock a file. Just pretend this command doesn't exist for now and use `git annex unlock` instead. In general, look at the manpages of the commands. For example `man git-annex-unannex`. <br>
2. Before doing anything further, clean up your repository from the mistake above. First, add all unannexed files back to the annex with `git annex add .` (from the root of your repo) and then commit everything with `git annex sync`. `git status` should now output `nothing to commit, working tree clean`. <br>
3. After setting `git config annex.thin true` you are supposed to run `git annex fix`. That's exactly what the link you gave says. But as you are using btrfs, I suggest you not use hard links, as git annex makes use of reflinks already. <br>
4. Now that you have a clean worktree, try `git annex unused` again. If it still doesn't work post the full output of `git annex unused` here.
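
Steps 2 and 4 above could be scripted roughly as follows. This is a sketch, not a tested recipe: the helper names `run` and `annex_cleanup` and the `DRY_RUN` variable are made up for illustration, and only commands already named in this comment are used. Step 3 is left out because hard links are not recommended on btrfs.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Call from the root of the repository. Stops at the first failing step.
annex_cleanup() {
    run git annex add . &&      # step 2: re-add unannexed files
    run git annex sync &&       # step 2: commit everything
    run git status &&           # should report a clean working tree
    run git annex unused        # step 4: list unused data again
}
```

Running `DRY_RUN=1 annex_cleanup` prints the four commands without touching the repository, which is a cheap way to review them first.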
"""]]

@ -1,19 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 8"
date="2020-12-31T22:00:07Z"
content="""
thanks for responding...
I used git annex unannex because I tried using git annex uninit and it DELETED my entire multi TB ./.git/annex/objects, even though I only had a handful of symlinks in that repo. I wanted to find another way to unannex files that wouldn't delete my technically \"unused\" data.
And git annex unannex was what I tried when git annex unlock would not hardlink the files via annex.thin=true. It was only by toying with the 2 commands and finally --fast that I was able to get it to hardlink the files.
my end goal was to be able to remove my data reliably from git annex entirely without it purging the object store.
and now as I read about hardlinks=true or whatever I see that git annex doesn't really love to hardlink multiple files past 2 because then multiple, independent files being modified would corrupt the object store.
I just want this thing to be reliable at scale. I put all my data into it but the speed is killing me, so I want to be able to get it out or split off data types to secondary git annexes, while having some idea of what it's doing under the covers so I don't get surprised.
"""]]

@ -1,11 +0,0 @@
[[!comment format=mdwn
username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
nickname="eric.w"
avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
subject="comment 9"
date="2020-12-31T22:05:19Z"
content="""
I'll chew on the rest of your response. I was bent on hardlinks because I haven't messed with the btrfs reflog COW thing much, but it's likely the way to go here, so all of my consternation with hardlinks is likely getting me nowhere. I am just always at 90% full, so I don't want to do anything that is going to run me out of space in the middle of an expensive operation.
anyways, thanks. I guess I just wish I had only put big files into my annex at first, though I would never have known how badly it fails at scale (on my hardware, etc.)
"""]]