f9f61fca86
Follow up from #440

- Convert curly single quotes to straight quotes before inserting
- Add General and Supplemental Unicode punctuation ranges to getClass() (so that fancy punctuation doesn't end up in words)
- Move single-quote test from getClass() to semanticSplitter(), and consider it a letter only if in the middle of a word
- Add comments to semanticSplitter()

This might be ever-so-slightly slower, but it's negligible. (War and Peace seems to now take ~1570ms instead of ~1500ms for me.)
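The behavior described in the commit message above can be illustrated with a minimal sketch. This is not the code from the commit: the function names `getClass()` and `semanticSplitter()` come from the message, but their bodies, the `CLASS_*` constants, the `normalizeQuotes()` and `isAsciiPunct()` helpers, and the choice to treat digits as letters are all assumptions made for illustration. The only outside facts relied on are the Unicode block ranges for General Punctuation (U+2000 to U+206F) and Supplemental Punctuation (U+2E00 to U+2E7F).

```javascript
// Hypothetical sketch, NOT the actual implementation from this commit.
// Character classes (illustrative values).
const CLASS_SPACE = 0;
const CLASS_PUNCT = 1;
const CLASS_LETTER = 2;

// Hypothetical helper: convert curly single quotes (U+2018, U+2019)
// to a straight apostrophe before any splitting happens.
function normalizeQuotes(text) {
  return text.replace(/[\u2018\u2019]/g, "'");
}

// Hypothetical helper: printable ASCII punctuation and symbols.
function isAsciiPunct(cp) {
  return (cp >= 0x21 && cp <= 0x2f) || (cp >= 0x3a && cp <= 0x40) ||
         (cp >= 0x5b && cp <= 0x60) || (cp >= 0x7b && cp <= 0x7e);
}

// Classify a single character. The straight apostrophe falls under ASCII
// punctuation here; the mid-word exception lives in semanticSplitter().
function getClass(ch) {
  const cp = ch.codePointAt(0);
  if (/\s/.test(ch)) return CLASS_SPACE;
  if (isAsciiPunct(cp)) return CLASS_PUNCT;
  // General Punctuation and Supplemental Punctuation blocks, so that
  // dashes, ellipses, daggers, etc. do not end up inside words.
  if ((cp >= 0x2000 && cp <= 0x206f) || (cp >= 0x2e00 && cp <= 0x2e7f)) {
    return CLASS_PUNCT;
  }
  // Assumption: everything else (including digits) counts as a letter.
  return CLASS_LETTER;
}

// Split text into words. An apostrophe is kept only when it sits between
// two letters (as in "don't"); otherwise it acts as a word boundary.
function semanticSplitter(text) {
  const chars = Array.from(normalizeQuotes(text));
  const words = [];
  let current = "";
  for (let i = 0; i < chars.length; i++) {
    const ch = chars[i];
    let cls = getClass(ch);
    if (ch === "'") {
      const prevIsLetter = current.length > 0;
      const nextIsLetter =
        i + 1 < chars.length && getClass(chars[i + 1]) === CLASS_LETTER;
      cls = prevIsLetter && nextIsLetter ? CLASS_LETTER : CLASS_PUNCT;
    }
    if (cls === CLASS_LETTER) {
      current += ch;
    } else if (current) {
      words.push(current);
      current = "";
    }
  }
  if (current) words.push(current);
  return words;
}

// Curly quotes, an em dash (U+2014), and an ellipsis (U+2026) in the input:
console.log(semanticSplitter("\u2018Don\u2019t stop\u2019 \u2014 he said\u2026"));
// → [ "Don't", 'stop', 'he', 'said' ]
```

Doing the apostrophe check in the splitter rather than in the classifier is what makes the position-dependent rule possible: a per-character classifier has no view of the neighboring characters, while the splitter does.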
||
---|---|---|
content | ||
locale | ||
skin/default/zotero |