Looks like http://code.google.com/p/cjk-tokenizer is more or less the only option out there, but it depends on libunicode, which is not in the Ubuntu archives, let alone installed by default :-/
We may be able to replace the libunicode bits with libicu equivalents, since libicu44 is installed by default as a dependency of webkit.