Joey Hess 2023-09-22 15:06:30 -04:00
parent 415e899741
commit 269a9494e1
GPG key ID: DB12DB0FF05F8F38

@@ -22,11 +22,10 @@ What seems to be happening is that catCommit gets:
 
 commitName = Just "F\56515\56489lix"
 
-Which is I think ok, that's a utf-8 surrogate. But then
-that's passed into commitWithMetaData, which sets the environment
-variable to its content. And setting an environment variable to a String
-like that does not pass it through the filesystem encoding. And so the
-utf-8 surrogate is not converted back to the right bytes.
+Which is I think ok, that's a utf-8 surrogate in the filesystem encoding.
+Then that's passed into commitWithMetaData, which sets the environment
+variable to its content. And apparently it fails to be converted back to
+the right bytes.
 
 One fix would be to keep it a ByteString all the way though, using
 `System.Posix.Env.ByteString`. I tried converting all environment in
@@ -38,6 +37,15 @@ variable that for some reason needs to get set by git-annex would
 not incur mojibake -- it doesn't seem possible with the current library
 ecosystem.
 
-So, I think the best fix is to avoid commitWithMetaData using environment
-variables.
+I tried making commitWithMetaData set the env var to a String that
+had the filesystem encoding applied. Eg `w82s (S.unpack (encodeBS v))`.
+Interestingly, that failed:
+
+git-annex: git: recoverEncode: invalid argument (cannot encode character '\195')
+
+Which looks like the filesystem encoding is being applied after all?
+And in System.Process.Posix, it does look like it does,
+withCEnvironment uses withFilePath on the contents of env.
+
+So huh, why then does the value not roundtrip?
 """]]