MySQL Fulltext search

There are several ways to search for something in a MySql database. One of these techniques is the use of the LIKE clause.SELECT field1, field2 FROM Table_Name WHERE field2 LIKE ‘%search_string%’

While this technique works like a charm and is actually advisable to some scenarios, it is however cumbersome to use when you have thousands of records in the table and you have to sort the results according to the relevancy of the resulting record to the search string. This technique does not scale at all. This is where FULL-TEXT SEARCHING comes to play.

MORE ON FULL-TEXT…

Full-Text is a type of index created on a table. The Full-Text index basically creates an index of all of the words of the field specified in your table. Each entry in this index references a row in your table. These indexes are created in separate index files by MySQL. When you perform a search against the Full-Text index, the MySQL Full-Text search functions use this index to return the related rows from the table back through the result set. Through this index, the search performance is enhanced through its speed in returning the results and could also give you the ability to sort the results according to its relevance.

SETTING UP THE DATABASE…

As an example let us use the table with the structure defined below:

CREATE TABLE `Table_Name` (
`field1` INT( 10 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`field2` VARCHAR( 100 ) NOT NULL
);

The field where we will match our search queries against is the string field2. to create the Full-Text index for field2 use the sql statement below:

ALTER TABLE Table_Name ADD FULLTEXT (field2);

THE QUERY…

With the database set, let us move on the query statement that would be used to return the resulting records from the search”.

SELECT * FROM Table_Name WHERE MATCH(field2) against(`search string`);

Full-Text searching uses the MATCH() and AGAINST() syntax. As parameters, MATCH() takes the list of all the field names from where we will “match” the search string into. While AGAINST() takes the search string. The search string can contain any phrase that you would like to search in database. Stopwords like `else`, `any`, `at`, `as` would not be considered by the full-text search. For a list of all the stopwords visit http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html

THE RESULT…

If a certain word in the phrase returns 50% or more of the records on the table, it will be disregarded. Though this may seem annoying with tables containing small number of records, this rule actually makes sense with a large database because if that word is contained in most records it is mostly irrelevant. For example, you have a database of Cars, and a user would search `Toyota cars` since all of your records contain the word `cars`, the relevant word would only be `Toyota`. The user would more often than not, need results that would contain `Toyota`.

MORE ON RESULTS…

As I have mentioned, we would most likely want to order the results according to the relevancy to the search phrase. To do this,

SELECT field1, field2, MATCH(field2) against(`search string`) AS relevancy FROM Table_Name WHERE MATCH(field2) against(`search string`) order by relevancy DESC;

As you can see, we used the MATCH() AGAINST() phrase to have a new field in our results as relevancy and sets the order of the returned results to that relevancy field. MySql acknowledges this technique and will perform the MATCH() AGAINST() phrase only once.

CONCLUSION

Give it a try and you will see that this technique will scale well in performance compared to creating a query using the LIKE clause and scan through the result to compute each records` relevancy score. Full-Text Search does not stop here, you could also perform Boolean searching and query expansions.


MySQL

Leave a Reply