Merge branch 'master' into sqlite

2019-12-19 16:26:23 -04:00 · 2019-12-19 16:26:23 -04:00 · 02e00fd7ab
commit 02e00fd7ab
parent f6c18f6940 e8651fbd09
37 changed files with 314 additions and 128 deletions
--- a/doc/todo/addunlocked_config_setting/comment_3_6909726735abb7945930ba45632e4769._comment
+++ b/doc/todo/addunlocked_config_setting/comment_3_6909726735abb7945930ba45632e4769._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2019-12-19T15:29:40Z"
+ content="""
+Retargeting this todo at something useful post-git-add-kerfuffle:
+annex.addunlocked could usefully be a pagespec to allow adding some files
+unlocked and others locked (by git-annex add only, not git add).
+"true" would be the same as "anything" and "false" the same as "nothing".
+
+---
+
+It may also then make sense to let it be configured in .gitattributes.
+Although, the ugliness of setting a pagespec in .gitattributes,
+as was done for annex.largefiles, coupled with the overhead of needing to
+query that from git-check-attr for every file, makes me wary.
+
+(A surprising amount of `git-annex add` time is spent querying the
+annex.largefiles and annex.backend attributes. Setting the former in
+gitconfig avoids the attribute query and speeds up add of smaller files by
+2%. Granted I've sped up add (except hashing) by probably 20% this month,
+and with large files the hashing dominates.)
+
+The query overhead could maybe be finessed: Since adding a file
+already queries gitattributes for two other things, a single query could be
+done for a file and the result cached.
+
+Letting it be globally configured via `git-annex config` is an alternative
+that I'm leaning toward.
+(That would also need some caching, but it would be easier to implement
+and faster, since it is not a per-file value as the gitattribute would be.)
+"""]]
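Note on the comment above: the idea of doing one gitattributes query per file and caching the result can be sketched roughly as follows. This is a minimal Python illustration, not git-annex code (git-annex is written in Haskell). The names `AttrCache` and `query` are hypothetical; `query` stands in for whatever actually asks git, for example a single `git check-attr` call that names several attributes at once. Treating `annex.addunlocked` as a gitattribute is only the proposal under discussion, not existing behaviour.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical backend type: given a path and attribute names,
# return a mapping from attribute name to its value for that path.
Query = Callable[[str, Tuple[str, ...]], Dict[str, str]]


class AttrCache:
    """Fetch every attribute of interest for a file in one query, then reuse it."""

    def __init__(self, attrs: Iterable[str], query: Query):
        self.attrs = tuple(attrs)
        self.query = query
        self.cache: Dict[str, Dict[str, str]] = {}
        self.queries = 0  # counts backend calls, to show the saving

    def get(self, path: str, attr: str) -> str:
        if attr not in self.attrs:
            raise KeyError(attr)
        if path not in self.cache:
            # One backend call covers all attributes for this path.
            self.queries += 1
            self.cache[path] = self.query(path, self.attrs)
        # Assumption: attributes the backend did not report are "unspecified".
        return self.cache[path].get(attr, "unspecified")
```

With three attributes registered, looking up all three for one file costs a single backend call instead of three. A real implementation would probably drop a file's entry once that file has been added, so the cache does not grow with the size of the tree.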
--- a/doc/todo/git-annex-cat/comment_3_347a33a4a77fd385ab8f3551138b75e1._comment
+++ b/doc/todo/git-annex-cat/comment_3_347a33a4a77fd385ab8f3551138b75e1._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Ilya_Shlyakhter"
+ avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
+ subject="named pipes as destination files"
+ date="2019-12-18T18:41:57Z"
+ content="""
+\"getting object content from remotes involve a destination file that is written to\" -- what happens if git-annex makes a named pipe, and passes that as the destination file name to the remote?
+"""]]
--- a/doc/todo/making_it_easier_to_smudge_dotfiles/comment_1_73ad5cd7f65c94a1db859d22eb6eece4._comment
+++ b/doc/todo/making_it_easier_to_smudge_dotfiles/comment_1_73ad5cd7f65c94a1db859d22eb6eece4._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2019-12-19T16:08:09Z"
+ content="""
+Hmm, it used to be that `git add .` would smudge all dotfiles without that
+line, but now annex.largefiles has to be configured for it to smudge
+anything.
+
+So, this could be dealt with in annex.largefiles. Both `anything` and
+`include=*` currently match dotfiles. It's kind of weird really that `*`
+matches dotfiles; it does not in the shell. If `*` did not match dotfiles
+(and `anything` is just an alias for `include=*`), it would be fairly safe
+to remove the `.* !filter` line by default. (If annex.largefiles has a
+content-based setting, and a dotfile is large enough or the right mime type
+or whatever, it's reasonable to default to smudging it.)
+
+Then, you could set annex.largefiles to match the dotfiles you want,
+e.g. `include=* or include=.mydotfile`. You could put the config in
+.gitattributes if you want to configure it globally.
+
+This change to annex.largefiles would also let `git-annex add`
+stop skipping dotfiles by default; instead annex.largefiles would not match
+dotfiles unless the user explicitly configured it to, and so the dotfiles
+would be added as small files, directly to git.
+
+I like this because it unifies the behaviors of the two ways of adding,
+and it reduces the complexity, rather than adding more.
+
+Removing the `.* !filter` line by default 
+would need to be done as part of the v8 upgrade, or a later upgrade.
+"""]]
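Note on the comment above: the difference between a `*` that matches dotfiles and a shell-like `*` that does not can be sketched as below. This is a minimal Python illustration under stated assumptions, not how git-annex parses pagespecs: the function name `glob_matches` and its `star_matches_dotfiles` flag are hypothetical, and the dotfile rule is applied to the last path component only. Python's `fnmatch` happens to behave like the current pagespec behaviour described in the comment, in that `*` matches names starting with a dot.

```python
import fnmatch
import posixpath


def glob_matches(pattern: str, path: str, star_matches_dotfiles: bool = True) -> bool:
    """Match path against a glob, optionally with shell-like dotfile handling."""
    if not fnmatch.fnmatchcase(path, pattern):
        return False
    if star_matches_dotfiles:
        # Behaviour described in the comment: `*` matches dotfiles too.
        return True
    # Shell-like rule (assumption): a name starting with "." only matches
    # when the pattern's last component also starts with a literal ".".
    name = posixpath.basename(path)
    pattern_name = posixpath.basename(pattern)
    return not name.startswith(".") or pattern_name.startswith(".")
```

Under the shell-like rule, `include=*` would leave `.mydotfile` unmatched, and `include=* or include=.mydotfile` would pick it up again, which is the configuration the comment describes.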
--- a/doc/todo/making_it_easier_to_smudge_dotfiles/comment_2_3912c1194c3f4f0e4c372e7603e0a3e5._comment
+++ b/doc/todo/making_it_easier_to_smudge_dotfiles/comment_2_3912c1194c3f4f0e4c372e7603e0a3e5._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2019-12-19T17:17:31Z"
+ content="""
+`*` is not only used in annex.largefiles, but other pagespecs too.
+Like preferred content:
+
+	exclude=archive/*
+
+So changing `*` to not match dotfiles would have wide-reaching effects,
+and it's really not good for different versions of git-annex to parse
+preferred content expressions differently. And it seems too confusing to
+have `*` match differently in annex.largefiles than in other pagespecs.
+
+Having a single config that controls both kinds of adds still seems like a
+good idea, but I don't know what that config should be.
+annex.largedotfiles?
+"""]]
--- a/doc/todo/optimise_by_converting_Ref_to_ByteString.mdwn
+++ b/doc/todo/optimise_by_converting_Ref_to_ByteString.mdwn
@@ -0,0 +1,3 @@
+Profiling of `git annex find --not --in web` suggests that converting Ref
+to contain a ByteString, rather than a String, would eliminate a
+fromRawFilePath that uses about 1% of runtime.
--- a/doc/todo/optimise_journal_access.mdwn
+++ b/doc/todo/optimise_journal_access.mdwn
@@ -0,0 +1,21 @@
+Often a command will need to read a number of files from the git-annex
+branch, and it uses getJournalFile for each to check for any journalled
+change that has not reached the branch. But typically, the journal is empty
+and in such a case, that's a lot of time spent trying to open journal files
+that do not exist.
+
+Profiling, e.g., `git annex find --in web` shows that things called by getJournalFile
+use around 5% of runtime.
+
+What if, once at startup, it checked whether the journal was entirely empty?
+If so, it can remember that, and avoid reading journal files.
+Perhaps paired with staging the journal if it's not empty.
+
+This could lead to behavior changes in some cases where one command is
+writing changes and another command used to read them from the journal and
+may no longer do so. But any such behavior change is of a behavior that
+used to involve a race; the reader could just as well be ahead of the
+writer and it would have already behaved as it would after the change.
+
+But: When a process writes to the journal, it will need to update its state
+to remember it's no longer empty. --[[Joey]]
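Note on the todo above: the "check once at startup, remember, and update on write" idea can be sketched as below. This is a minimal Python illustration, not git-annex's implementation (which is Haskell, around getJournalFile). The class `Journal`, its methods, and the flat one-file-per-name directory layout are assumptions made for the example.

```python
import os
from typing import Optional


class Journal:
    """Skip per-file journal lookups while the journal is known to be empty."""

    def __init__(self, directory: str):
        self.directory = directory
        # One directory scan at startup replaces a failed open per file.
        with os.scandir(directory) as entries:
            self.known_empty = next(entries, None) is None
        self.lookups = 0  # counts real filesystem lookups, for illustration

    def _path(self, name: str) -> str:
        return os.path.join(self.directory, name)

    def read(self, name: str) -> Optional[bytes]:
        if self.known_empty:
            # Nothing journalled by this process and nothing seen at startup.
            return None
        self.lookups += 1
        try:
            with open(self._path(name), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write(self, name: str, content: bytes) -> None:
        with open(self._path(name), "wb") as f:
            f.write(content)
        # The process must remember that the journal is no longer empty.
        self.known_empty = False
```

As the todo points out, a process using this shortcut will not see journal files written by another process after startup. The sketch does nothing about that case, on the argument given above that such a reader was already racing the writer.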
--- a/doc/todo/optimize_by_converting_String_to_ByteString.mdwn
+++ b/doc/todo/optimize_by_converting_String_to_ByteString.mdwn
@@ -9,7 +9,7 @@ Benchmarking `git-annex find`, speedups range from 28-66%. The files fly by
 much more snappily. Other commands likely also speed up, but do more work
 than find so the improvement is not as large.

-The `bs` branch is in a mergeable state now.
+The `bs` branch is in a mergeable state now. [[done]]

 Stuff not entirely finished: