Remove unnecessary rpaths in the git-annex binary, but only when it's built using make, not cabal. This speeds up git-annex startup time by around 50%.

Joey Hess 2016-07-06 14:40:18 -04:00
parent f541e60f13
commit c4229be9a7
6 changed files with 67 additions and 1 deletions


@@ -6,6 +6,9 @@ git-annex (6.20160614) UNRELEASED; urgency=medium
* get, drop: Add --batch and --json options.
* New url for git-remote-gcrypt, now maintained by spwhitton.
* testremote: Fix crash when testing a freshly made external special remote.
* Remove unnecessary rpaths in the git-annex binary, but only when
it's built using make, not cabal.
This speeds up git-annex startup time by around 50%.
-- Joey Hess <id@joeyh.name> Mon, 13 Jun 2016 21:52:24 -0400


@@ -29,6 +29,8 @@ git-annex: Build/SysConfig.hs
else \
ln -sf dist/build/git-annex/git-annex git-annex; \
fi
# Work around https://github.com/haskell/cabal/issues/3524
@chrpath -d git-annex || echo "** unable to chrpath git-annex; it will be a little bit slower than necessary"
# These are not built normally.
git-union-merge.1: doc/git-union-merge.mdwn

debian/control

@@ -85,6 +85,7 @@ Build-Depends:
curl,
openssh-client,
git-remote-gcrypt (>= 0.20130908-6),
chrpath,
Maintainer: Richard Hartmann <richih@debian.org>
Standards-Version: 3.9.8
Vcs-Git: git://git.kitenet.net/git-annex


@@ -1 +0,0 @@
binary-or-shlib-defines-rpath


@@ -3,3 +3,5 @@ Since in datalad we are invoking git and git-annex quite frequently, and on debi
just an idea
[[!meta author=yoh]]
> [[fixed|done]], but without prelinking. --[[Joey]]


@@ -0,0 +1,59 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2016-07-06T15:59:34Z"
content="""
Startup time is also particularly important when git-annex is being used as
a smudge/clean filter in v6 mode, since it's run once per file git operates
on.
---
What I'd look at before prelinking is, does your git-annex executable
dynamically link haskell libraries?
That was the case for a while in the standalone builds, until I noticed it
caused too much linker time and put it back to static linking of the
haskell libs. Leaving only 34 or so C shared libs.
---
Did some preliminary benchmarking here of `git-annex version --raw`
* deb package build: 0.04 seconds min
* deb package build prelinked: ~0.03 seconds min
* standalone build: 0.05 seconds min
* git-annex modified to print "hi" and exit immediately: 0.02 seconds min
So, the overhead of the wrapper scripts for the standalone build is around
0.01 seconds.
And, prelinking does help a little bit (although probably closer to 0.005
seconds than 0.01; my measurements are too coarse to get a good number).
Meanwhile, 0.02 seconds are used after git-annex starts up. This overhead
includes finding the path to the git repository, running and parsing `git
config --list`, etc.
But what about that 0.02 seconds just to print "hi"...?
----
With strace I noticed a very interesting thing. Despite being statically
linked against the haskell libraries, the linker searches in all their
paths for all C libraries. This adds around 30000 failed open() calls
to git-annex's startup. This is done even after prelinking. It must be a
significant part of the startup time.
Filed a bug: <https://github.com/haskell/cabal/issues/3524>
Put in a chrpath workaround, but only when git-annex is built with "make"
(not cabal install git-annex).
Updated benchmarks:
* deb package build: 0.02 seconds min
* deb package build prelinked: ~0.02 seconds min
* standalone build: 0.03 seconds min
* git-annex modified to print "hi" and exit immediately: 0.01 seconds min
"""]]
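
The strace observation in the comment above can be checked roughly as follows. This is a minimal sketch, not the method used in the commit: `count_failed_opens` is a hypothetical helper name, and it assumes an strace log in the usual text format, where a failed call ends in `= -1 ENOENT (...)`.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-annex): read an strace log on stdin
# and print how many open()/openat() calls failed with ENOENT.
# Assumes strace's default text output, optionally prefixed by a pid
# (as produced by `strace -f -o FILE`).
count_failed_opens() {
    # Match "open(" or "openat(" not preceded by another identifier
    # character, followed later on the same line by "= -1 ENOENT".
    grep -cE '(^|[^a-z_])open(at)?\(.*= -1 ENOENT'
}
```

Assuming `strace` and `chrpath` are installed, one might run `strace -f -o trace.log git-annex version --raw` and then `count_failed_opens < trace.log`, once before and once after `chrpath -d git-annex`. `chrpath -l git-annex` shows whether an rpath is still present in the binary.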