Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2020-05-21 14:47:51 -04:00
commit 75bfcca462
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
6 changed files with 158 additions and 0 deletions

@@ -0,0 +1,28 @@
[[!comment format=mdwn
username="braun.markus89@51b521a42cc994db864df308627bd6454f9c309d"
nickname="braun.markus89"
avatar="http://cdn.libravatar.org/avatar/c11d06a0d9db6a9472b05ee01c342ca4"
subject="comment 2"
date="2020-05-20T13:54:23Z"
content="""
Thanks for your answer.
A short follow-up question:
When I do exactly the same for a 2G file, something similar happens:
$ git annex sync --debug
[2020-05-20 15:48:19.441795963] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"show-ref\",\"git-annex\"]
[2020-05-20 15:48:19.459542967] process done ExitSuccess
[2020-05-20 15:48:19.460055539] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"show-ref\",\"--hash\",\"refs/heads/git-annex\"]
[2020-05-20 15:48:19.47249456] process done ExitSuccess
[2020-05-20 15:48:19.473466546] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"log\",\"refs/heads/git-annex..9655aad25802451eb83141096fb9275aa36fe810\",\"--pretty=%H\",\"-n1\"]
[2020-05-20 15:48:19.487917815] process done ExitSuccess
[2020-05-20 15:48:19.489243941] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch\"]
[2020-05-20 15:48:19.490737137] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch-check=%(objectname) %(objecttype) %(objectsize)\"]
commit
[2020-05-20 15:48:19.506415618] call: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"commit\",\"-a\",\"-m\",\"git-annex in admin@Paintower:~/git-annex/test\"]
fatal: mmap failed: Cannot allocate memory
So why does \"git commit\" allocate so much memory? It seems to be trying to handle the file content itself. Or is it a malloc failure caused by git annex smudge?
"""]]

@@ -0,0 +1,17 @@
[[!comment format=mdwn
username="basile.pinsard@f1a7fae9f3bd9d5282fca11f62ad53b45a8eb317"
nickname="basile.pinsard"
avatar="http://cdn.libravatar.org/avatar/87e1f73acf277ad0337b90fc0253c62e"
subject="reverting metadata"
date="2020-05-21T13:39:37Z"
content="""
If I revert to a previous commit, the metadata changes are not reverted to their previous state.
Is there a way to revert metadata?
If I understood that correctly, the metadata are stored in a single git-annex branch: so there is no way to have two regular branches with different metadata for the same file, right?
Does any call to git-annex metadata create a new commit in the git-annex branch?
The commit messages of the git-annex branch are not super informative: they all say \"update\".
A related question: is there a way to get the git-annex branch commit that matches a regular/master branch commit?
Thanks.
"""]]

@@ -0,0 +1,52 @@
[[!comment format=mdwn
username="kyle"
avatar="http://cdn.libravatar.org/avatar/7d6e85cde1422ad60607c87fa87c63f3"
subject="comment 3"
date="2020-05-21T17:06:12Z"
content="""
> If I revert to a previous commit, the metadata changes are not
> reverted to their previous state.
Metadata is attached to keys, not files. (See `man
git-annex-metadata`.) If the state you revert to has the same key, it
will have the same associated metadata.
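
To make the key-versus-file distinction concrete, here is a toy model in Python. It is only a sketch: the dictionaries, the helper names `set_metadata` and `get_metadata`, and the key strings are all made up for illustration, and this is not how git-annex stores anything internally.

```python
# Toy model only: plain dicts stand in for git-annex's real storage.
# Metadata lives in one shared table indexed by key, not by path or commit.
metadata_by_key = {}

def set_metadata(tree, path, field, value):
    # Look up the key the path points at, then tag the key.
    key = tree[path]
    metadata_by_key.setdefault(key, {})[field] = value

def get_metadata(tree, path):
    return metadata_by_key.get(tree[path], {})

# Each 'commit' maps file names to keys (placeholder key strings).
old_commit = {'photo.jpg': 'KEY-aaaa'}
new_commit = {'photo.jpg': 'KEY-aaaa'}     # same content, so same key
edited_commit = {'photo.jpg': 'KEY-bbbb'}  # content changed, so new key

# Tag the file while 'on' the new commit.
set_metadata(new_commit, 'photo.jpg', 'tag', 'holiday')

# Going back to the old commit changes nothing: the key is the same.
print(get_metadata(old_commit, 'photo.jpg'))     # {'tag': 'holiday'}
# Only a commit where the file has a different key sees different metadata.
print(get_metadata(edited_commit, 'photo.jpg'))  # {}
```

In this model, checking out `old_commit` cannot undo the tag, because nothing in the commit itself records the metadata.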
> Is there a way to revert metadata?
AFAIU not in a way that is tied to the HEAD commit. You can run `git
annex metadata --remove-all FILE`, but that will remove the metadata
on the underlying key.
> If I understood that correctly, the metadata are stored in a single
> git-annex branch: so there is no way to have two regular branches with
> different metadata for the same file, right?
Correct.
> Does any call to git-annex metadata create a new commit in the
> git-annex branch?
I believe so, provided the caller hasn't overridden
`annex.alwayscommit` to \"false\".
> The commit messages of the git-annex branch are not super
> informative: they all say \"update\".
In my view generic messages like \"update\" make a lot of sense in the
context of the behind-the-scenes git-annex branch. If you're
inspecting it just because you're curious about what's happening
underneath, you might find the output of `git log --stat git-annex`
helpful.
That being said, there is an `annex.commitmessage` config option if
you want to override the message.
https://git-annex.branchable.com/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/
> A related question: is there a way to get the git-annex branch commit
> that matches a regular/master branch commit?
I don't think there is any such mapping by design.
"""]]

@@ -0,0 +1,8 @@
Hey folks (and Joey), I am trying to understand the performance impact of the changes in v6 -> v7 -> v8 mode. Apologies, since I haven't kept up with the changes (I was using an older version for quite a while) and some of this might already be well documented or known.
Essentially, back in v6 and earlier, I was pretty happy with the design idea that git annex doesn't use smudge/clean filters, since their performance is far from ideal. However, I see that in newer repository versions this has become more of a thing. I have read a few docs (https://git-annex.branchable.com/todo/git_smudge_clean_interface_suboptiomal/, https://git-annex.branchable.com/todo/only_pass_unlocked_files_through_the_clean__47__smudge_filter/) but there are still a few things I don't understand.
1) Are smudge/clean filters used all the time now? Does this mean that we are taking a performance hit compared to older git annex versions?
2) Can someone explain when smudge/clean filters get used? Is it only in repos that use unlock/adjust? I don't use either of them, and would love to know whether the filters are being used unnecessarily.
Thanks in advance!

@@ -0,0 +1,33 @@
[[!comment format=mdwn
username="codelix"
avatar="http://cdn.libravatar.org/avatar/667ff4d0387694f28236639bab0faf2c"
subject="SO frustrating"
date="2020-05-21T17:55:09Z"
content="""
I am a big big fan of Joey and a big big fan of git annex, and I have been using it for 7+ years. I absolutely love Joey's reasoning and how he identifies the best way to solve any problem.
But this is the first change that I consider to be a major mistake. It has had me rethinking whether I can trust git annex anymore, and I am tempted to continue using older versions, which come with their own problems. It all comes down to \"sane defaults\". Joey's reasoning is absolutely bang on, but optimizing for a very specific use case and silently doing things behind the scenes does not make sense.
For instance, git lfs does not add all files to lfs silently, but this change essentially makes git annex do that in a sense.
My understanding is that this has been changed to some extent in recent versions, but I'm not 100% sure what the current state is. In any case, I propose something like this:
* Sane default -> git add ALWAYS adds files to git, git annex add ALWAYS adds files to git annex, no exceptions
* Permit configuring which files get added to annex by git add, or to git by git annex add.
Honestly, it's becoming super confusing how all the different options like largefiles interact with each other. For this purpose, I suggest having a new namespace/version of configs that cleans this up, maybe something like:
~~~
cat .gitaddtoannex
*.ogg
*.mp3
> 100MB
cat .gitannexaddtogit
.*
< 3MB
~~~
I feel this will simplify that part and make it super clear what will happen. Having the same behavior for `git mv` and `mv` can be handled as suggested already. Please let me know your thoughts on this, Joey.
I greatly appreciate all the work you are doing and hope we can keep git annex the rock solid option it is. I think simplifying some of these configs will also help make it more accessible to less techie folks.
"""]]

@@ -0,0 +1,20 @@
[[!comment format=mdwn
username="codelix"
avatar="http://cdn.libravatar.org/avatar/667ff4d0387694f28236639bab0faf2c"
subject="comment 38"
date="2020-05-21T18:00:57Z"
content="""
Actually, even better, it could be something like this:
~~~
cat .gitannexinclude
+ *.ogg
+ *.mp3
+ > 100MB
- .*
- < 3MB
~~~
Any statement lower down overrides a statement higher up. Any file that does not match any of these patterns is automatically added to git. This would let us deprecate options like `largefiles`, which are a source of a lot of confusion, at least for me.
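
To show what I mean by the override order, here is a rough Python sketch of the evaluation. Everything in it is hypothetical: the file format is only my proposal above, the function names are invented, and I am assuming decimal units (1MB = 1,000,000 bytes) and shell-style globs matched against the file name. None of this exists in git-annex.

```python
from fnmatch import fnmatch

# Assumption: decimal units; the proposal does not say which are meant.
UNITS = {'KB': 10**3, 'MB': 10**6, 'GB': 10**9}

def parse_size(text):
    # Turn a string like '100MB' into a byte count.
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)

def parse_rules(lines):
    # Each rule becomes (to_annex, kind, value); '+' means annex, '-' means git.
    rules = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        sign, rest = line[0], line[1:].strip()
        if sign not in '+-' or not rest:
            raise ValueError('bad rule: ' + line)
        if rest[0] in '<>':
            rules.append((sign == '+', rest[0], parse_size(rest[1:])))
        else:
            rules.append((sign == '+', 'glob', rest))
    return rules

def goes_to_annex(name, size, rules):
    decision = False  # files matching no rule are added to git
    for to_annex, kind, value in rules:
        if kind == 'glob':
            matched = fnmatch(name, value)
        elif kind == '>':
            matched = size > value
        else:
            matched = size < value
        if matched:
            decision = to_annex  # a later match overrides an earlier one
    return decision

RULES = parse_rules(['+ *.ogg', '+ *.mp3', '+ > 100MB', '- .*', '- < 3MB'])

print(goes_to_annex('song.ogg', 5_000_000, RULES))      # True: only '+ *.ogg' matches
print(goes_to_annex('tiny.mp3', 1_000_000, RULES))      # False: '- < 3MB' comes later
print(goes_to_annex('.cache.bin', 500_000_000, RULES))  # False: '- .*' overrides the size rule
print(goes_to_annex('notes.txt', 4_000_000, RULES))     # False: no rule matches
```

Note that with this ordering a small mp3 ends up in git, because the size rule sits below the extension rule; moving `+ *.mp3` to the bottom would flip that.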
"""]]