Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.
As well as being an optimisation, this helps move toward eliminating use of
missingh.
(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)
I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
Almost working, but there's a bug in the relaying.
Also, made tor hidden service setup pick a random port, to make it harder
to port scan.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
Currently only done for utf-8 locales because the charset can easily be
told for those. Other locales don't include the charset in their name.
The locale definition is generated under git-annex.linux/locales.
So, this only works if the user can write there.
If locale generation fails for any reason, it's silently skipped.
The git-annex-standalone.deb installs the bundle under /usr, so this locale
generation won't work for non-root users.
Version mismatches between the system locale-archive and the glibc in the
bundle have been observed to cause git crashes.
Unfortunately, this causes locales to not be used in the linux standalone
bundle, as was the case until version 6.20160419.
glibc hardcodes the path to /usr/lib/locale/locale-archive and does not
let an environment variable cause a different locale-archive file to be used.
The only other option to include locales in the bundle would be to include
exploded locale definition directories in the bundle for a number of
locales, generated by localedef. But these take at least 300 kb per locale,
and there are a great many locales; it would be hundreds of megabytes to
include them all.
(Hmm, we could include localdef in the bundle, and check LANG in runshell
and compile the locale directories on the fly. This would need
/usr/share/i18n/ and /usr/lib/locale-archive to be included in the bundle.
It's.. doable.)
I know this is going to once again cause users of the bundle to complain
that eg, ls doesn't show their unicode filenames right. Better than strange
crashes though.
This was originally done in a7ef05a9, but got lost in some change to the
Makefile. Use CROSS_COMPILE=Android to tell configure that it's configuring
for android instead of passing it a parameter.
The code block in git-annex-smudge(1) was misformatted. Code blocks
start with tabs, so replace "\s" with " ". Tested to not affect
anything except git-annex-smudge.1.
This actually runs faster than building the man pages from the makefile
did. But the main purpose is to let Setup.hs import Build.Mans and so not
need the makefile.
The tarball on hackage will include only the files needed for cabal install;
it is NOT the full git-annex source tree. While it's totally obnoxious that
cabal files need every file listed out when basic wildcard support could
avoid hundreds of lines, and have to be maintained when files are added,
this does get the tarball size back down to 1 mb.
This also stops stack from complaining that it found modules not listed in
the cabal file.
debian/changelog, debian/NEWS, debian/copyright: Converted to symlinks
to CHANGELOG, NEWS, and COPYRIGHT, which used to symlink to these instead.
This avoids needing to include debian/ in the hackage tarball.
Setup.hs: Build man pages at install time using make and mdwn2man.
If it fails, which it probably will on windows, just skip installing
them.
It started exporting a isSymbolicLink which supports windows. But,
git-annex does no use symlinks on windows yet and this conflicts with the
function by the same name from unix-compat, so hide it.
Had to workaround various problems in clamscan. Increased its max filesize
a lot, because it's too small to check git-annex. Manual unpacking seemed
to be needed for dmg and tar.gz.
This is pretty complicated, but I have both "git-annex" and "git annex"
working both in the git bash shell even with git not added to path.
And, when git's added to path, both work from MS-DOS prompt window too.
I think that the webapp startup does still need git in path, so
instructions will keep saying to do that. But, users often disregard them,
and hopefully this will reduce support traffic.
Also, switched the wget from the cygwin one to the msys2 one, avoiding the
complication of needing to bundle any cygwin dlls.
Using msysgit with git-annex is no longer supported.
At the same time, I'm updating the rsync.exe in my downloads repository
with the one from msys2.
Note that rsync is currently still being ldded and installed in Git/cmd/
like the other cygwin programs. The ldd fails and this failure is ignored.
It would be better to special case it to go in Git/usr/bin/, so that the
user can't run rsync in a dos prompt window, which doesn't work, as it needs
additional libs. However, as far as git-annex running rsync running ssh,
it works ok in this location.
Removed the ssh.cmd and ssh-keygen.cmd; these are not needed with git for
windows. Keeping them would let ssh be run manually from a dos prompt
window, but that's not really a goal.
It was failing at link time, some problem with terminatePID.
Re-implemented that to not use a C wrapper function, which cleared up the
problem. Removed old EvilLinker hack with must have been related to the
same problem.
Note that I have not tested this with older ghc's. In
f11f7520b5 I mention having tried this
approach before, and getting segfaults.. So, who knows. It seems to work
fine with ghc 7.10 at least.
As a result of the Makefile changes, the Debian package is built
with various hardening options. Although their benefit to a largely
haskell program is unknown.
This removes a bit of complexity, and should make things faster
(avoids tokenizing Params string), and probably involve less garbage
collection.
In a few places, it was useful to use Params to avoid needing a list,
but that is easily avoided.
Problems noticed while doing this conversion:
* Some uses of Params "oneword" which was entirely unnecessary
overhead.
* A few places that built up a list of parameters with ++
and then used Params to split it!
Test suite passes.