git-annex/doc/bugs/unhappy_without_UTF8_locale.mdwn
Joey Hess fe55b4644e Fix display of unicode filenames.
Internally, the filenames are stored as un-decoded unicode.
I tried decoding them, but then haskell tries to access the wrong files.
Hmm.

So, I've unhappily chosen option "B", which is to decode filenames before
they are displayed.
2011-02-10 14:21:44 -04:00


Try unsetting LANG and passing git-annex unicode filenames.

    joey@gnu:~/tmp/aa>git annex add ./Üa
    add add add add git-annex: <stdout>: commitAndReleaseBuffer: invalid
    argument (Invalid or incomplete multibyte or wide character)

The same problem can be seen with a simple haskell program:

    import System.Environment
    import Codec.Binary.UTF8.String

    main = do
        args <- getArgs
        -- decode the raw bytes of the first argument, then print it
        putStrLn $ decodeString $ args !! 0

Running it with LANG unset fails the same way:

    joey@gnu:~/src/git-annex>LANG= runghc ~/foo.hs Ü
    foo.hs: <stdout>: hPutChar: invalid argument (Invalid or incomplete multibyte or wide character)

(The call to `decodeString` is necessary for the input
unicode string to display properly in a utf8 locale, but
does not contribute to this problem.)

I guess that haskell is setting the IO encoding to latin1, which
is [documented](http://haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html#v:latin1)
to error out on characters > 255.
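
One way to check that guess, rather than rely on it, is to ask the runtime which encoding it actually picked. This is only a minimal sketch: `describeEncoding` is a hypothetical helper name, and it assumes a `base` new enough to provide `hGetEncoding`, `localeEncoding` and the `Show` instance for `TextEncoding`. It reports on stderr so the diagnostic stays apart from normal output.

```haskell
import System.IO

-- Hypothetical helper: render the result of hGetEncoding.
-- Nothing means the handle is in binary mode and has no text encoding.
describeEncoding :: Maybe TextEncoding -> String
describeEncoding Nothing    = "binary (no encoding)"
describeEncoding (Just enc) = show enc

main :: IO ()
main = do
    enc <- hGetEncoding stdout
    -- Encoding names are plain ASCII, so printing them is safe in any locale.
    hPutStrLn stderr ("locale encoding: " ++ show localeEncoding)
    hPutStrLn stderr ("stdout encoding: " ++ describeEncoding enc)
```

Running it once normally and once with `LANG=` set to empty should show whether the encoding differs between the two environments.
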
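So the following variant, which forces stdout to utf8, doesn't have the
problem -- but may output garbage on non-utf-8 capable terminals:

    import System.Environment
    import System.IO
    import Codec.Binary.UTF8.String

    main = do
        -- force UTF-8 on stdout regardless of the locale
        hSetEncoding stdout utf8
        args <- getArgs
        putStrLn $ decodeString $ args !! 0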
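
A possible alternative to forcing utf8, sketched here only as an assumption and not as what git-annex does, is to keep the locale's own encoding but make it lenient. The documentation for `mkTextEncoding` in newer versions of `base` describes a `//TRANSLIT` suffix that substitutes a replacement character for code points the encoding cannot represent, instead of raising an error. The helpers `baseName` and `lenientName` are hypothetical names introduced for illustration.

```haskell
import System.IO

-- Hypothetical helper: drop any existing "//SUFFIX" from an encoding name.
baseName :: String -> String
baseName ('/':'/':_) = []
baseName (c:cs)      = c : baseName cs
baseName []          = []

-- Hypothetical helper: name of a transliterating variant of an encoding.
lenientName :: TextEncoding -> String
lenientName enc = baseName (show enc) ++ "//TRANSLIT"

main :: IO ()
main = do
    -- Assumes a base version whose mkTextEncoding understands //TRANSLIT.
    enc <- mkTextEncoding (lenientName localeEncoding)
    hSetEncoding stdout enc
    -- "\220" is U+00DC (Ü), written as an escape so the source stays ASCII.
    putStrLn "\220a"
```

The trade-off is that unrepresentable characters are silently replaced on output, so this only suits display, not anything that has to round-trip a filename.
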