While some RawFilePath and FilePath remain, this converts most of
git-annex to using OsPath.
(When built without the OsPath build flag, is falls back to using
type OsPath = RawFilePath.)
The goals are
1) improved performance by using OsPath end-to-end when possible
2) potentially avoiding memory use problems caused by pinned strict
ByteString, since OsPath uses ShortByteString
3) eventually eliminating the filepath-bytestring dependency so I don't
need to keep maintaining that library
(this doesn't get all the way, but close)
4) generally improved type safety, since OsPath is a newtype, while
FilePath and RawFilePath are just type aliaes.
This is the result of a type checker driven process. I started by
converting from System.Directory to System.Directory.OsPath, and from
System.FilePath to System.OsPath. Then I fixed all the compile errors,
which took 3 weeks of work.
Unfortunately, there are several test suite failures at this point.
Also, it only has been built on linux, on windows and OSX there are
probably ifdefs whose code still needs to be converted.
Note that there is a parallel line of commits, starting with
05bdce328d
which is the incremental progress as I worked on this. It will be merged
with this commit. In some cases, commits in that line explain in more
details the reasons for some specific changes.
This removes that function, using file-io readFile' instead.
Had to deal with newline conversion, which readFileStrict does on
Windows. In a few cases, that was pretty ugly to deal with.
Sponsored-by: Kevin Mueller
In 793ddecd4b, writeSshConfig was made to
writeFile a ByteString, which lost the newline conversion on Windows.
Added linesFile to fix it. This will also be useful for other writeFile
conversions.
Now that truncateFilePath and relatedTemplate have both been optimised,
may as well use them in replaceFile, rather than the custom hack it
used.
Removed the windows-specific ifdef as well, because on Windows long
filepaths no longer really a problem, since ghc and git-annex use UNC
converted paths.
replaceFile no longer checks fileNameLengthLimit. That took a syscall,
and since we have an existing file, we know filenames of its length are
supported by the filesystem. Assuming that the withOtherTmp directory is
on the same filesystem as the file replaceFile is being called on, which
I believe it is.
Sponsored-by: Leon Schuermann
Often the filepath will be all ascii, or mostly so, and this
optimisation makes a file that has an ascii suffix of sufficient length
be roundtrip converted between String and ByteString only once, rather
than once per character.
Sponsored-by: Graham Spencer
And follow-on changes.
Note that relatedTemplate was changed to operate on a RawFilePath, and
so when it counts the length, it is now the number of bytes, not the
number of code points. This will just make it truncate shorter strings
in some cases, the truncation is still unicode aware.
When not building with the OsPath flag, toOsPath . fromRawFilePath and
fromRawFilePath . fromOsPath do extra conversions back and forth between
String and ByteString. That overhead could be avoided, but that's the
non-optimised build mode, so didn't bother.
Sponsored-by: unqueued
By using System.Directory.OsPath, which takes and returns OsString,
which is a ShortByteString. So, things like dirContents currently have the
overhead of copying that to a ByteString, but that should be less than
the overhead of using Strings which often in turn were converted to
RawFilePaths.
Added Utility.OsString and the OsString build flag. That flag is turned
on in the stack.yaml, and will be turned on automatically by cabal when
built with new enough libraries. The stack.yaml change is a bit ugly,
and that could be reverted for now if it causes any problems.
Note that Utility.OsString.toOsString on windows is avoiding only a
check of encoding that is documented as being unlikely to fail. I don't
think it can fail in git-annex; if it could, git-annex didn't contain
such an encoding check before, so at worst that should be a wash.