Merge branch 'master' of ssh://git-annex.branchable.com

This commit is contained in:
Joey Hess 2019-04-28 09:53:40 -04:00
commit 5d55f968cc
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38

View file

@ -0,0 +1,61 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="change in libmagic behavior"
date="2019-04-26T15:20:22Z"
content="""
Hi Joey, thanks for the quick reply. After I filed an issue I did realize as well that it must be difference in libmagic1 library version and change in its behavior (!) and indeed it is the case here:
[[!format sh \"\"\"
$> apt-cache policy libmagic1
libmagic1:
Installed: 1:5.30-1+deb9u2
Candidate: 1:5.30-1+deb9u2
Version table:
*** 1:5.30-1+deb9u2 100
100 http://debian.csail.mit.edu/debian stretch/main amd64 Packages
100 /var/lib/dpkg/status
1:5.30-1+deb9u1 100
100 http://security.debian.org stretch/updates/main amd64 Packages
$> file -L --mime BADFILE.txt
BADFILE.txt: text/plain; charset=utf-8
$> LD_PRELOAD=/usr/lib/git-annex.linux/usr/lib/x86_64-linux-gnu/libmagic.so.1 file -L --mime BADFILE.txt
file: compiled magic version [530] does not match with shared library magic version [535]
BADFILE.txt: application/json; charset=utf-8
\"\"\"]]
Standalone I believe was built on buster.
I am not sure how I didn't run into this change of behavior before!
Possibly interesting observation: making it a not kosher json (adding trailing , into one field etc) makes it being detected as text again
But altogether, with this new behavior of libmagic, it begs a new question:
**how (without listing all possible text based file formats) we could instruct annex to treat them as text files?**
There is an `-e` option to `file` to exclude some tests, and I thought excluding `apptype` would help, but it does not
[[!format sh \"\"\"
-e, --exclude TEST exclude TEST from the list of test to be
performed for file. Valid tests are:
apptype, ascii, cdf, compress, elf, encoding,
soft, tar, json, text, tokens
$> file -e apptype --mime BADFILE.txt
BADFILE.txt: application/json; charset=utf-8
\"\"\"]]
There is also `-k, --keep-going don't stop at the first match` so I thought it might list some `text/` but I am not sure what it does:
[[!format sh \"\"\"
$> file -k --mime BADFILE.txt
BADFILE.txt: application/json\012- \012- ; charset=utf-8
$> python -c 'import magic; m=magic.Magic(mime=True, keep_going=True); print(m.from_file(\"BADFILE.txt\"))'
application/json\012- \012-
\"\"\"]]
but this might be just a bug in libmagic...?
"""]]