Improve startup time for commands that do not operate on remotes

And for tab completion, by not unnecessarily statting paths to remotes,
which used to cause, e.g., spin-up of removable drives.

Got rid of the remotes member of Git.Repo. This was a bit painful.

Remote.Git modifies the list of remotes as it reads their configs,
so still need a persistent list of remotes. So, put it in as
Annex.gitremotes. It's only populated by getGitRemotes, so commands
like examinekey that don't care about remotes never populate it.
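
A minimal sketch of the populate-on-first-use idea, not the actual
git-annex code: the state holds a Maybe list that stays Nothing until
something asks for the remotes. Only the names gitremotes and
getGitRemotes come from this commit; Repo, AnnexState, probeRemotes,
probeCount and the paths are hypothetical stand-ins.

```haskell
import Data.IORef

-- Hypothetical stand-in for a git repository handle.
newtype Repo = Repo { repoLocation :: FilePath }
  deriving (Eq, Show)

-- Hypothetical stand-in for the annex state.
data AnnexState = AnnexState
  { gitremotes :: Maybe [Repo] -- Nothing until a command asks for remotes
  , probeCount :: Int          -- counts expensive probes, for illustration only
  }

newState :: IO (IORef AnnexState)
newState = newIORef (AnnexState Nothing 0)

-- Stand-in for the expensive step that would touch the remotes' paths.
probeRemotes :: IORef AnnexState -> IO [Repo]
probeRemotes ref = do
  modifyIORef' ref (\s -> s { probeCount = probeCount s + 1 })
  return [Repo "/media/usbdrive/repo", Repo "/srv/backup/repo"]

-- Populate the cached list on first use, then reuse it.
getGitRemotes :: IORef AnnexState -> IO [Repo]
getGitRemotes ref = do
  s <- readIORef ref
  case gitremotes s of
    Just rs -> return rs
    Nothing -> do
      rs <- probeRemotes ref
      modifyIORef' ref (\s' -> s' { gitremotes = Just rs })
      return rs
```

A command that never calls getGitRemotes never runs the probe; one that
calls it repeatedly runs the probe once.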

This commit was sponsored by Jake Vosloo on Patreon.
Joey Hess 2018-01-09 15:36:56 -04:00
parent d0fe4d7308
commit 2b66492d6e
GPG key ID: DB12DB0FF05F8F38
22 changed files with 148 additions and 70 deletions

@ -0,0 +1,18 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2018-01-09T17:02:41Z"
content="""
There are a couple of parts to this, so let's get this one out of the way
first: Tab completion etc should not be looking at remotes.
It seems that even `git annex --help` does for some reason; so does
stuff like `git annex examinekey`. So it's happening in a core code-path.
Ah, ok.. Git.Config.read uses Git.Construct.fromRemotes,
which uses Git.Construct.fromAbsPath, which stats
the remote directory to handle ".git" canonicalization.
Fixed this part of it; now only when the remoteList is built does it
stat remotes.
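
To illustrate the shape of that change (a sketch under assumptions, not the real Git.Construct code): reading the config can build a location value without touching the disk, and the directory check runs only in a separate step. RemoteLoc, fromConfigPath and checkLoc are hypothetical names, and the ".git" rule shown is a simplified guess at the canonicalization.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical type: a remote location that may not have been checked on disk.
data RemoteLoc
  = Unchecked FilePath -- taken verbatim from the git config, no disk access
  | Checked FilePath   -- canonicalized after looking at the filesystem
  deriving (Eq, Show)

-- Pure, so it is safe to call while reading the config.
fromConfigPath :: FilePath -> RemoteLoc
fromConfigPath = Unchecked

-- The directory test is passed in, so the caller decides when the stat happens.
-- Assumed rule, for illustration only: prefer "<dir>/.git" when it exists.
checkLoc :: (FilePath -> IO Bool) -> RemoteLoc -> IO RemoteLoc
checkLoc _ l@(Checked _) = return l
checkLoc dirExists (Unchecked dir)
  | "/.git" `isSuffixOf` dir = return (Checked dir)
  | otherwise = do
      hasDotGit <- dirExists (dir ++ "/.git")
      return (Checked (if hasDotGit then dir ++ "/.git" else dir))
```

In real use the test passed to checkLoc would be something like doesDirectoryExist from System.Directory; passing it as an argument keeps the sketch free of disk access until a caller chooses to run it.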
"""]]

@ -0,0 +1,39 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2018-01-09T19:56:42Z"
content="""
With the above dealt with, the remaining problem is with commands
like `git annex whereis` or `git annex info`, which don't really
operate on any remote, but still need to examine the remotes as part of
building the remoteList.
git-annex supports remotes that point to a mount point that might have
different drives mounted at it at different times. So, it needs to
check the git config of the remote each time, to see what repository is
currently there.
Even commands like "whereis" and "info" have output that depends on
what repository a remote is currently pointing to. In some cases,
"whereis" might not output anything that depends on a given remote,
so in theory it could avoid looking at the config of that remote.
And a command like "git annex copy --to origin" doesn't really
need to look at the configs of any other remotes.
But avoiding unnecessary checks of the git configs of remotes that a
command does not use would need each use of the current remoteList
to be replaced with something else that does the minimal needed work,
instead of building the whole remoteList. I think this would be quite
complicated.
And, I don't know that it would address the bug report adequately, even
if it were done. Running `git annex info` would
still block waiting for the automount; `git annex whereis` would
only *sometimes* block, depending on where content is.
So instead of that approach, perhaps a config setting will do?
A per-remote config that tells git-annex that only one repository
should ever be mounted at its location. That would make git-annex
avoid checking the git config of that remote each time, except
when it's actually storing/dropping content on it.
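
A rough sketch of how such a setting could gate the check, assuming a boolean per-remote flag. The field singleRepo, the Action type and needsConfigCheck are hypothetical names for illustration, not existing git-annex config or API.

```haskell
-- What a command is about to do with a remote (hypothetical).
data Action = Listing | StoreContent | DropContent
  deriving (Eq, Show)

data RemoteConfig = RemoteConfig
  { remoteName :: String
  , singleRepo :: Bool -- hypothetical flag: only one repository is ever mounted here
  }

-- Without the flag, always check which repository is currently mounted.
-- With it, check only when content is about to be stored or dropped.
needsConfigCheck :: RemoteConfig -> Action -> Bool
needsConfigCheck cfg act
  | not (singleRepo cfg) = True
  | otherwise = act `elem` [StoreContent, DropContent]
```

With the flag set, building the list of remotes for something like `git annex info` would fall under Listing and skip the check, so it would not wait on the automount.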
"""]]