X-Git-Url: https://git.korewanetadesu.com/?p=python-collate.git;a=blobdiff_plain;f=README.md;fp=README.md;h=9b7adef507fee7f2827baa9347b125f010f694cd;hp=0000000000000000000000000000000000000000;hb=35bf46766ebe8d702cb4440545c8b7e8f7e13363;hpb=b278dcabc282c5faa070a72c2e7fd915597ccd00

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..9b7adef
--- /dev/null
+++ b/README.md
@@ -0,0 +1,90 @@
+# Collation algorithms for Python
+-------------------------------------------
+
+pycollate is an interface to various collation algorithms for Python.
+
+Supported backends:
+- `icu` - Based on the IBM ICU toolkit and Jim Fulton's zope.ucol.
+- `syslocale` - Native OS collation routines.
+- `codepoint` - Raw Unicode codepoint comparison
+
+If available, you'll probably want to use the ICU backend. If it's not
+available, syslocale should work on most Python installations. A
+specific backend can be used, or a "best" backend is chosen by
+default.
+
+pycollate also provides tools to perform word-wise and numeric sorts.
+
+pycollate, as with all Unicode collation tools, is a work in progress.
+
+## Installing
+
+    $ sudo apt-get install python-pyrex libicu-dev
+    $ ./setup.py build
+    $ sudo ./setup.py install
+
+## Example
+
+    import collate
+    strings = open("contents.txt").read().decode("utf-8").splitlines()
+    strings.sort(key=collate.key)
+
+## FAQ
+
+### What's collation?
+
+Collation is the process of sorting information in a useful way. In
+particular, this module sorts strings in a way that humans might
+expect to read them.
+
+### What's so hard about that?
+
+Nothing, if your strings are all in one language and you speak English
+yourself.
+
+On the other hand, if that's not the case you need to make sure "ss"
+and "ß" sort similarly, "å" sorts like "A" (unless you're Swedish),
+and "21 Monkeys" comes after "3 Monkeys".
+
+### How fast is the library?
+
+Slow enough that you will probably want to cache sort keys. On a
+mid-range system at the time of its writing, it takes about half a
+second to sort 10000 song titles.
+
+
+## License
+
+### icu/_icu.pyx
+
+Copyright (c) 2004 Zope Corporation and Contributors.
+All Rights Reserved.
+
+This software is subject to the provisions of the Zope Public License,
+Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
+THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
+WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
+FOR A PARTICULAR PURPOSE.
+
+### All else
+
+Copyright 2010 Joe Wreschnig
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.