After a day of messing around with SQL queries, I’ve finally got a handle on doing a logical AND search against the reverse index. Previously the search terms were OR’ed together, so searching for Liz Vicious would return every gallery that matched the word Liz plus every gallery containing the word Vicious. Turns out that there’s a lot of stuff with the word Vicious in it that has absolutely nothing to do with the sexy goth redhead I was looking for, so I knew the search algorithm needed work. The trick is in SQL’s HAVING clause: the Eve engine groups the matching index rows by URL and does a COUNT(*) on each group. That count is the number of search terms the URL matched, so a quick HAVING COUNT(*)=$n_terms line in the SQL SELECT statement keeps only the URLs that matched every term and cleans up the mess.
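Here is a rough sketch of the HAVING trick, using Python's built-in sqlite3 module so it runs anywhere. The table name `reverse_index`, its `term` and `url` columns, the example URLs, and both function names are made up for illustration; the real Eve schema isn't shown here. The sketch also assumes each (term, url) pair appears only once in the index, which is what makes COUNT(*) equal the number of matched terms.

```python
import sqlite3


def build_demo_index():
    """Create a tiny in-memory reverse index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # The primary key keeps (term, url) pairs unique, so COUNT(*) per URL
    # equals the number of distinct search terms that URL matched.
    conn.execute(
        "CREATE TABLE reverse_index ("
        "term TEXT NOT NULL, url TEXT NOT NULL, PRIMARY KEY (term, url))"
    )
    conn.executemany(
        "INSERT INTO reverse_index (term, url) VALUES (?, ?)",
        [
            ("liz", "http://example.com/a"),
            ("vicious", "http://example.com/a"),
            ("vicious", "http://example.com/b"),
            ("liz", "http://example.com/c"),
        ],
    )
    return conn


def search_all_terms(conn, terms):
    """Return the URLs that match every search term (logical AND)."""
    # Lowercase and de-duplicate so a repeated word doesn't inflate n_terms.
    terms = sorted({t.lower() for t in terms})
    if not terms:
        return []
    placeholders = ",".join("?" for _ in terms)
    sql = (
        "SELECT url FROM reverse_index "
        f"WHERE term IN ({placeholders}) "
        "GROUP BY url "
        "HAVING COUNT(*) = ? "  # the n_terms check from the post
        "ORDER BY url"
    )
    rows = conn.execute(sql, (*terms, len(terms))).fetchall()
    return [row[0] for row in rows]
```

Dropping the HAVING line turns this back into the old OR behaviour, where any URL matching at least one term comes back. If the index can hold duplicate (term, url) rows, COUNT(DISTINCT term) would be the safer comparison.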
I also made some changes to the spider so that it pulls search terms from incoming links; this should help improve search quality, but the change means the existing gallery database is worthless. I’ve scrapped it and started spidering from scratch. In a day or two I’ll take whatever’s been spidered and move it to the production server at EveKnows.com. Stay tuned for some quality porn searches!