The Problem with MySQL FULLTEXT Indices

Ok, know that I have discovered the beauty of FULLTEXT indices, I know have discovered their dark side: even set for "IN BOOLEAN MODE", I get way more results than I was expecting.

Using the logic of the FULLTEXT search, it makes total sense. If I pass "/compression/" as a variable to the listing page, it returns everything that matches that string, including the sub-pages.

Time to haul out a regular expression. Using PHP’s ereg function, I perform a case insensitive match on the string.

if (ereg("^$page$","$row[DATABASE_URL]")) {
        …
}

So I am skipping all of the overzealous returns, and displaying only the exact matches. And, believe it or not, this process is still far faster than using the REGEX engine in the MySQL query to start off with the exact matches.

If anyone has a less brain-damaged way to get MySQL to quickly return exact string matches, let me know.


Addendum: I found yet another brain-damaged thing with FULLTEXT searches: if you search to match against "/", exactly zero results are returned, even with "IN BOOLEAN MODE" turned off. This baffled me for a while, and is important enough for me to waste some mental energy on, as all requests for the root document in an Apache log file are referenced as "/".

After trying a few things, I actually had to write a corner-case "if" statement, that, if the passed URL exactly matches "/^/$/", use the REGEX method to pull the pages. All other pages are processed using FULLTEXT matching.

This behaviour is very weird.

Leave a Reply

Your email address will not be published. Required fields are marked *