New approach - find split points based on Unicode categories.
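
A minimal sketch of the idea, not the actual implementation: the note does not say which categories are used or what counts as a boundary. This example assumes a split point is any index where two adjacent characters differ in major Unicode general category (the first letter of the category code, e.g. L, N, P, Z). The names `major_category`, `find_split_points`, and `split_by_category` are hypothetical.

```python
import unicodedata


def major_category(ch: str) -> str:
    """Return the major class of the Unicode general category, e.g. 'L' for 'Ll'."""
    return unicodedata.category(ch)[0]


def find_split_points(text: str) -> list:
    """Return every index i where text[i-1] and text[i] differ in major category.

    Assumption: a category change between neighbours marks a split point.
    """
    return [
        i
        for i in range(1, len(text))
        if major_category(text[i - 1]) != major_category(text[i])
    ]


def split_by_category(text: str) -> list:
    """Cut text at the split points found above."""
    if not text:
        return []
    points = [0, *find_split_points(text), len(text)]
    return [text[a:b] for a, b in zip(points, points[1:])]
```

For example, `split_by_category("abc123, def")` yields `["abc", "123", ",", " ", "def"]`, since letters, digits, punctuation, and the space separator each belong to a different major category.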