Today I fixed up the indexer in a few ways:
1) The thumbnail code now crops the image into a 100px square. This helps align the search results and really improves the look of the site. The cropper centers around the upper-central-third of the photo, so it should include the face and tits of most models.
2) The summary-extracting code has been vastly improved and tested on a slew of galleries; it now does a much better job of skipping ads and recip links and pulling the first sentence or two of the gallery description (assuming one exists).
3) I caught a bug in the indexer that was stripping numbers from the reverse index, so searching for “18 year old teens” would only return results for “year old teens”. Oops!
4) The list of stop words has been extended to include the most common words that don’t really apply to gallery descriptions: stuff like the 2257 links at the bottom of galleries, HTML link codes, etc. This keeps the reverse index leaner, which leads to much faster search times.
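For the curious, here is a rough Python sketch of the ideas behind items 1, 3 and 4. It is not the actual EveKnows code: the function names, the stop word list, and the exact crop rule (a square centered horizontally and on the upper third vertically) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stop words: a few common English words plus link/boilerplate
# tokens. The real list is much longer and is not shown here.
STOP_WORDS = {"the", "a", "of", "and", "href", "http", "www", "2257"}

# Keep digits as well as letters, so a number like "18" survives tokenizing.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def crop_box(width, height):
    """Return (left, top, right, bottom) for the largest square crop,
    centered horizontally and on the upper third of the image vertically.
    The result would then be scaled down to the thumbnail size."""
    side = min(width, height)
    left = (width - side) // 2
    top = height // 3 - side // 2
    # Clamp so the square never leaves the image.
    top = max(0, min(top, height - side))
    return (left, top, left + side, top + side)


def tokenize(text):
    """Lowercase, split into alphanumeric tokens, and drop stop words."""
    return [t for t in TOKEN_RE.findall(text.lower()) if t not in STOP_WORDS]


def build_index(docs):
    """Build a reverse index mapping each token to the set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index
```

With a token pattern like this one, numbers in a query are looked up just like any other word, and anything in the stop word list never reaches the index at all, which is what keeps it small.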
With all of these changes, I’m going to clear out the existing database and start spidering from scratch. Annoying, I know, but EveKnows is still in beta ;)
Also, for any gallery submitters out there, I’ve added a Submissions page so that you can queue up your own galleries to be indexed. I’d appreciate a recip link on any galleries submitted, but it’s not required and it won’t affect your search ranking in any way.

