git-annex/doc/todo/support-non-utf8-locales.mdwn

27 lines
1.2 KiB
Text
Raw Normal View History

2011-03-08 22:19:52 +00:00
Currenty, git-annex forces output, particularly of filenames, in a utf-8
locale.
Note that this does not mean it cannot be used with filenames in other
2011-03-08 22:23:13 +00:00
encodings. git-annex is entirely encoding agnostic when it comes to
manipulating filenames. It just *displays* their names always converted to
utf-8, which may not look right when you have a non-utf8 locale.
2011-03-08 22:19:52 +00:00
This had to be done to work around some bugs with haskell's handling
of filename encodings. In particular,
* [[bugs/unhappy_without_UTF8_locale]]: haskell crashes when told to output
a string with characters > 255 in a non-utf8 locale.
* [[bugs/problems_with_utf8_names]]: On many OSs, haskell expects
non-decoded raw char8 in FilePaths. In order to display a filename,
though, it needs to first be decoded, and git-annex currently assumes
it was encoded as utf8.
git-annex's behavior is unlikely to improve much until haskell's
support for utf8 filenames improves. --[[Joey]]
> [[done]] -- I just turned off all encoding handling on stdout and stderr,
> which avoids these problems nicely. Git-annex now displays just what it
> input, at least on platforms where haskell does not decode unicode in
> FilePaths. This will later be a problem when it gets localized, but for
> now works great. --[[Joey]]