More tweaks; notably try to insert paragraph breaks rather than a separate Python tuple when re-concatenating strings.
Minor tweaks to better handle numeric strings that follow split strings.
New approach: find split points based on Unicode categories.
Category-based splitting.
Calculate sortemes using simple alnum splitting rather than word breaks. Faster and slightly more accurate for our purposes. Strip punctuation.
'Advanced' sorteme functions.
_strings: Numeric string extraction routines.
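
For readers unfamiliar with the two splitting strategies named in the messages above, here is a minimal sketch of how category-based splitting and alnum splitting might look. It is not the code from these commits. The function names `split_by_category` and `alnum_sortemes` are hypothetical, and treating a "sorteme" as a run of alphanumeric characters is an assumption made for illustration.

```python
import unicodedata
from itertools import groupby


def split_by_category(text):
    """Split text wherever the major Unicode category changes (hypothetical helper)."""
    # unicodedata.category() returns two-letter codes such as 'Lu', 'Nd', 'Po';
    # the first letter is the major class (L = letter, N = number, P = punctuation).
    def major(ch):
        return unicodedata.category(ch)[0]

    return ["".join(group) for _, group in groupby(text, key=major)]


def alnum_sortemes(text):
    """Return runs of alphanumeric characters, dropping punctuation and spaces.

    Assumption: a 'sorteme' is taken to be one such run; the real project
    may define it differently.
    """
    return [
        "".join(group)
        for is_alnum, group in groupby(text, key=str.isalnum)
        if is_alnum
    ]
```

Under these assumptions, `split_by_category("abc123, def")` yields separate pieces for the letters, the digits, the comma, the space, and the trailing letters, while `alnum_sortemes("Track 12: Hello, World!")` keeps only the four alphanumeric runs. The alnum version needs no word-break analysis, which is consistent with the "faster" remark in the commit message.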