Merge branch 'master' of ssh://git-annex.branchable.com
commit
fa641dad2d
20 changed files with 201 additions and 45 deletions
37
doc/bugs/Unicode_file_names_ignored_on_Windows.mdwn
Normal file
@@ -0,0 +1,37 @@
### Please describe the problem.

The "add" command silently ignores all files and directories with non-ASCII characters in their names.

### What steps will reproduce the problem?

I created an empty repository (git init, git annex init). I created some files with ASCII and non-ASCII file names (hacky.txt, háčky.txt).

git annex add . adds hacky.txt correctly, but nothing else.

git annex add "háčky.txt" does nothing.

### What version of git-annex are you using? On what operating system?

git 1.9.0,
git-annex installer from 2014-03-06

Windows XP and 7 with Czech localization. CP1250 is used for Czech characters on Windows.

### Please provide any additional information below.

$ ls
hacky.txt h????ky.txt
$ git annex add .
add hacky.txt ok
(Recording state in git...)
$ git annex status
D h├í─Źky.txt

According to https://github.com/msysgit/msysgit/wiki/Git-for-Windows-Unicode-Support ls prints junk, but only to console.
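
The junk in the `git annex status` line is consistent with the UTF-8 bytes of the file name being shown in an OEM console code page. The following is a minimal sketch of that effect, not a diagnosis: it assumes the console code page is CP852, which the report does not state (it only mentions CP1250).

```python
# Assumption: the console renders bytes as CP852; the report does not say so.
name = "h\u00e1\u010dky.txt"        # "háčky.txt" written with explicit code points
utf8_bytes = name.encode("utf-8")   # the bytes git stores for this name

# Reading each UTF-8 byte as a separate CP852 character produces mojibake.
as_console = utf8_bytes.decode("cp852")
print(as_console)
```

If the printed string matches the one in the status output, the bytes themselves are intact and only their display is wrong.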

D:\anntest>git annex add "háčky.txt" --debug
[2014-03-18 14:28:03 Central Europe Standard Time] read: git ["--git-dir=D:\\anntest\\.git","--work-tree=D:\\anntest","-c","core.bare=false","ls-files","--others","--exclude-standard","-z","--","h\225\269ky.txt"]
[2014-03-18 14:28:03 Central Europe Standard Time] chat: git ["--git-dir=D:\\anntest\\.git","--work-tree=D:\\anntest","-c","core.bare=false","cat-file","--batch"]
[2014-03-18 14:28:03 Central Europe Standard Time] read: git ["--git-dir=D:\\anntest\\.git","--work-tree=D:\\anntest","-c","core.bare=false","ls-files","--modified","-z","--","h\225\269ky.txt"]

I can provide additional information; just tell me what you need.
@@ -0,0 +1,14 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.154"
subject="analysis"
date="2014-03-18T17:54:09Z"
content="""
The `git ls-files --others -z` output is fine; the mojibake seems to occur in git-annex's reading of that output, which uses GHC's filesystem encoding. On Linux it reads \"h\225\269ky.txt\" but on Windows, \"h\195\161\196\56461ky.txt\".

So, it's failing to compose the multibyte characters, and it seems to have escaped the last byte (which should be \"\141\" based on the other 3) out into the high code plane used for undecodable bytes.

Note that on Linux with LANG=C, the add works, and it sees \"h\56515\56481\56516\56461ky.txt\" -- in this case, all 4 bytes are represented in the high code plane, and so round-trip ok despite the locale not supporting the utf8 encoding.

Interestingly, while both `[readFile \"h\225\269ky.txt\", readFile \"h\56515\56481\56516\56461ky.txt\"]` work on Linux, only the former does on Windows.
"""]]
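
The two decodings described in the analysis can be reproduced outside GHC. The sketch below uses Python's `surrogateescape` error handler, which maps an undecodable byte 0xNN to the lone surrogate U+DCNN; that is the same range as the escaped values quoted in the comment (56461 is 0xDC8D). It assumes the Windows result came from decoding the bytes as Windows-1252, where 0x8D is unassigned. The comment does not say which code page was active, so that choice is a guess that happens to reproduce the reported string.

```python
name = "h\u00e1\u010dky.txt"   # "háčky.txt": á is 225, č is 269
raw = name.encode("utf-8")     # the two letters become bytes 195 161 196 141


def code_points(text):
    """Return the decimal code points, in the style of the escaped strings."""
    return [ord(ch) for ch in text]


# Assumed Windows case: each byte is read as a Windows-1252 character,
# and the unassigned byte 0x8D is escaped to U+DC8D.
windows_like = raw.decode("cp1252", errors="surrogateescape")

# LANG=C case: every non-ASCII byte is escaped to U+DC00 + byte.
c_locale = raw.decode("ascii", errors="surrogateescape")

print(code_points(windows_like)[1:5])
print(code_points(c_locale)[1:5])

# The fully escaped string encodes back to the original bytes.
assert c_locale.encode("ascii", errors="surrogateescape") == raw
```

This only illustrates how such strings arise; it does not explain why the partly decoded name fails to round-trip inside git-annex on Windows.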
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.154"
subject="comment 2"
date="2014-03-18T18:09:08Z"
content="""
One approach might be to not use the GHC FileSystemEncoding on Windows, and assume that Windows filenames are always in a Unicode encoding. After all, the FileSystemEncoding is only used by git-annex on Unix because Unix has no canonical encoding that will work for all filenames.

Hmm, nope, I tried this and it just causes an \"invalid byte sequence\" crash when reading from git-ls-files.
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.154"
subject="comment 3"
date="2014-03-18T18:14:57Z"
content="""
GHC docs on FileSystemEncoding: \"On Windows, this encoding *should not* be used if possible because the use of code pages is deprecated: Strings should be retrieved via the wide W-family of UTF-16 APIs instead\"
"""]]
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.154"
subject="comment 4"
date="2014-03-18T18:42:57Z"
content="""
As well as the default encoding, I've tried the `utf8`, `utf16`, `utf16le`, and `utf16be` encodings, and none of them can read the `git ls-files` output; all fail with an encoding error. (I also tried `mkUTF16 RoundtripFailure` but it completely broke git-annex.)

Unsure where to go from here.
"""]]
@@ -1,37 +0,0 @@
### Please describe the problem.
I ran `forget` to clean up dead repos in my map; now `map` won't run.

### What steps will reproduce the problem?

$ git annex forget --drop-dead --force
....
$ git annex map

### What version of git-annex are you using? On what operating system?

(xubuntu)
$ git annex version
git-annex version: 5.20140306-g6e2e021
build flags: Assistant Webapp Pairing Testsuite S3 WebDAV Inotify DBus XMPP Feeds Quvi TDFA CryptoHash
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL
remote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier hook external
local repository version: 5
supported repository version: 5
upgrade supported from repository versions: 0 1 2 4

### Please provide any additional information below.

[[!format sh """
$ git annex map --debug
map /media/archive/annex ok
[2014-03-15 09:17:33 CDT] read: git ["config","--null","--list"]
[2014-03-15 09:17:33 CDT] read: git ["config","--null","--list"]
[2014-03-15 09:17:33 CDT] read: git ["config","--null","--list"]

git-annex: user error (git ["config","--null","--list"] exited 126)
failed
git-annex: map: 1 failed
"""]]

> [[fixed|done]] --[[Joey]]
@@ -1,8 +0,0 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkBEmz5XoJVzN0u-0nOtpn7BBBDHsiLmxY"
nickname="Eric"
subject="comment 6"
date="2014-03-17T20:12:31Z"
content="""
Oh, so basically what I must have done was delete the remote via git while it still existed in git-annex?
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.154"
subject="comment 5"
date="2014-03-18T19:07:25Z"
content="""
cron was not running them for some reason, but they are up-to-date now.
"""]]