Tuesday, May 03, 2005

Fun with CTXCAT

Oracle provides a full text indexing type called CTXCAT, which can be thought of as a low-maintenance full text index. You don't have to worry about synchronizing, which makes updates happen in real time. It is best used for things like document title searches, product names, and the like -- the syntax is easy, but you have very little power or control to do the sexy, cool full text search that a Context index gives you.

Except -- you can actually embed a Context 'CONTAINS' clause within a CTXCAT 'CATSEARCH' clause and get the full power [well, a good chunk of it anyway]. The syntax is a little painful, but very repeatable. As an example...

Contains clause to do a fuzzy search on the word 'column' in the doc.text column:

select docid, score(1), title from doc
where contains (text, '?column', 1) > 0
order by score(1) desc;

You cannot duplicate this with a standard CATSEARCH clause. However, using a query template you can extend the catsearch clause to perform a contains search (substitute < for [ because I can't figure out how to tell it to turn off HTML processing) :

select docid, title from doc
where catsearch(title,
'[query>
[textquery grammar="context"> ?column
[score datatype="integer"/>

[/query>','') > 0;

Viola! Fuzzy searching on a CTXCAT index.

The Full Swing

Golf is best thought of as the act of trying to smack a small white ball a quarter of a mile into a four-inch cup, three tries is good, four reasonable, and anything more unacceptable.

I would really like to try that again soon, but won't be holding my breath.