Update on ScienceSeeker's search functionality

ScienceSeeker's full-text search has been offline since October. With the downtime now stretching into months while I struggle to get the new search engine in place, I thought I'd explain to the community what's going on.

In October, the site suddenly crashed, and I could not get it back up. Some poking and prodding showed that the problem was database access. We keep all of the information about the blogs we index in a local database, and we weren't able to connect to it. But why not?

I managed to enlist the help of a database expert, who figured out that some of the database queries we were using were extremely inefficient. As more and more slow queries backed up, the system became more and more overloaded. Some queries were running for hundreds of minutes! (I expect none of the users who originated those queries had stuck around to see the results.)

The database expert helped me rewrite the worst offenders, and the site started running smoothly again. But the one query he could not speed up enough was our full-text search query. He looked at it and said, "Why aren't you using a real search engine? This query is never going to be able to do what you want."

A search engine. Why hadn't I thought of it before? There is actually a really excellent free search engine available, Solr. But the downside of search engines is that, by dint of being much more powerful than single database queries, they are also more complicated to run. Over the past few months, I have been trying to snatch a few hours a week to integrate Solr with the ScienceSeeker code, in between dealing with the other bugs that have cropped up in the meantime.
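
For the curious, here is a rough sketch of why a search engine can answer a text query faster than a scan over stored text. It is only an illustration: the sample posts and function names are made up, and this is neither ScienceSeeker's actual query nor Solr's real implementation. It contrasts checking every post for a word with looking the word up in an inverted index built ahead of time, which is the general idea behind Lucene, the library Solr is built on.

```python
from collections import defaultdict

# Hypothetical sample data; these are not real ScienceSeeker posts.
posts = {
    1: "New fossil find changes bird evolution timeline",
    2: "How dogs evolved alongside humans",
    3: "Bird migration patterns shift with climate",
}


def scan_search(posts, word):
    """Read every post on every search; the work grows with the total amount of text."""
    word = word.lower()
    return sorted(pid for pid, text in posts.items() if word in text.lower().split())


def build_index(posts):
    """Build an inverted index once: each word maps to the set of posts containing it."""
    index = defaultdict(set)
    for pid, text in posts.items():
        for token in text.lower().split():
            index[token].add(pid)
    return index


def index_search(index, word):
    """Look the word up directly instead of rereading every post."""
    return sorted(index.get(word.lower(), set()))


index = build_index(posts)
print(scan_search(posts, "bird"))   # [1, 3]
print(index_search(index, "bird"))  # [1, 3]
```

Both searches return the same posts. The difference is that the index does its reading once, up front, so each later search is a single lookup. The price is the extra machinery needed to build the index and keep it current, which is a large part of why a search engine is more complicated to run.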

And that's where we stand now: I'm working on it. Hopefully we'll get search back up soon! In the meantime, we appreciate your patience.

Jessica Hekman
Technical Director and Project Manager, ScienceSeeker
