Search in Greek Works Now
Hacked CJKSplitter to work with Greek in UTF-8
Zope splitters are a) used to split text into words to make it indexable and b) seemingly deep magic, because documentation is nonexistent (or I couldn't find any). What I ended up with was the source code of CJKSplitter.
So I hacked CJKSplitter into doing Greek. It was not especially difficult, mostly a "delete" job (Chinese is way more complex). I also replaced the range of Chinese Unicode characters with the range for Greek. I might add ISO-8859-1 character ranges too, especially for this blog.
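To give an idea of what such a splitter boils down to, here is a minimal standalone sketch. It is not the actual hacked CJKSplitter code: the class name GreekSplitterSketch, the exact character ranges, and the process/processGlob method names (modeled on what ZCTextIndex splitters appear to expose) are all assumptions for illustration, and registering the class with a ZCTextIndex lexicon is left out entirely.

```python
import re

# Assumed character ranges: Greek and Coptic (U+0370-U+03FF) and
# Greek Extended (U+1F00-U+1FFF), plus ASCII letters and digits.
# Note the Greek blocks also contain a few punctuation-like code points.
WORD_CHARS = "0-9A-Za-z\u0370-\u03ff\u1f00-\u1fff"


class GreekSplitterSketch:
    """Hypothetical splitter: breaks text into indexable words."""

    # A word is a run of one or more allowed characters.
    rx = re.compile("[%s]+" % WORD_CHARS)
    # Glob variant: after the first word character, also keep the
    # wildcard characters * and ? so query patterns survive splitting.
    rx_glob = re.compile("[%s]+[%s*?]*" % (WORD_CHARS, WORD_CHARS))

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding

    def _split(self, lst, rx):
        words = []
        for text in lst:
            # Accept encoded byte strings (e.g. UTF-8) as well as text.
            if isinstance(text, bytes):
                text = text.decode(self.encoding, "replace")
            words.extend(rx.findall(text))
        return words

    def process(self, lst):
        """Split a list of strings into a flat list of words."""
        return self._split(lst, self.rx)

    def processGlob(self, lst):
        """Like process(), but keeps * and ? inside words."""
        return self._split(lst, self.rx_glob)


if __name__ == "__main__":
    splitter = GreekSplitterSketch()
    print(splitter.process(["Καλημέρα κόσμε, hello Zope!"]))
```

Adding ISO-8859-1 letters would, under the same assumptions, just mean appending another range (for example \u00c0-\u00ff) to WORD_CHARS.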
Once I clean up the code, have a look at the license and ask on the mailing list if this is really the proper way to go, I will put this online. It might be a good solution for Greek in Zope with ZCTextIndex.
But in short: The result so far is that searching in Greek should now work in the search form on this weblog.
ch athens
Life in Athens (Greece) for a foreigner from the other side of the mountains.
And with an interest in digital life and the feeling of change in a big city.
Multilingual English - German - Greek.
Searching ZCTextIndex in Greek, properly
A long time ago I made my own Greek Unicode splitter for ZCTextIndex. That worked fine, but
it didn't take the accent marks into consideration (so searching for
"ελληνικα" didn't find "ελληνικά"). Today I found through the Greek Plone
foru...