Archive for September, 2007

Profiling and Debugging Linux Disk Access

September 27th, 2007

EveKnows.com is 100% Linux powered. The free (as in speech!) system has proved to be absolutely perfect for our needs. It’s fast, stable, and customizable: exactly what you look for in a platform for running fresh, cutting-edge applications such as EveKnows.

One of the harder tasks I’ve had is tuning disk access. The search engine is currently running on a Debian 4.0 system with SATA hard drives. The UNIX utility top reports 10-20% IO wait (the "wa" figure, a rough indicator of how much time the CPUs spend idle waiting on disk requests) almost all the time. When I turn on the Caroline search spider, that figure spikes to 50%. At the moment this isn’t really a big deal, but as the site’s popularity continues to grow, it will eventually become a bottleneck and severely limit performance.
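
For reference, the wait percentage that top displays can be derived from the aggregate "cpu" line in /proc/stat. Below is a minimal Python sketch of that calculation, assuming the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The function names are made up for illustration, and this is not how top itself is implemented.

```python
import time


def read_cpu_times(path="/proc/stat"):
    """Return the counters from the aggregate 'cpu' line as integers."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return [int(x) for x in line.split()[1:]]
    raise RuntimeError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples.

    Only the first eight fields are summed; later fields (guest time) are
    already counted inside 'user' on kernels that report them.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    # Index 4 is the iowait counter in the assumed field order.
    return 100.0 * deltas[4] / total if total else 0.0


if __name__ == "__main__":
    first = read_cpu_times()
    time.sleep(1)
    second = read_cpu_times()
    print(f"iowait: {iowait_percent(first, second):.1f}%")
```

The counters are cumulative since boot, so a single reading says little; only the difference between two samples is meaningful.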

Thus, I’ve been trying to learn about profiling disk access on Linux systems. Maybe I’ve just been looking in the wrong places, but I haven’t been able to find any tools that can show me which applications are causing the heavy IO load. Some digging revealed that the kernel will log individual IO calls (readable through dmesg) when /proc/sys/vm/block_dump is set to 1, but that raw output is one line per request and essentially useless until it is aggregated. To that end, I wrote a small Perl script which totals all of the IO statistics and displays a pretty table of results. If anyone is interested in using it themselves, the code is below.

Update: HTML tends to screw up Perl code, so copying and pasting the code below probably won’t work; if you just want to download the script for your own use, you can find it here.

#!/usr/bin/perl
#
# Copyright 2007 Aidan Trent
# Released under the terms of the GNU GPL

# Usage: SCRIPT_NAME
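
The script is cut off at this point in the archive. As a rough illustration of the approach described above (and not the original Perl code), here is a minimal Python sketch that totals block_dump messages per process. It assumes the kernel messages look like "kjournald(1234): WRITE block 4567 on sda1" and "pdflush(150): dirtied inode 8912 (syslog) on sda1", optionally preceded by a dmesg timestamp. The function names and the table layout are invented for this example.

```python
import re
import sys
from collections import defaultdict

# Optional "<7>" log level and "[  123.456]" timestamp that dmesg may prepend.
PREFIX_RE = re.compile(r"^(?:<\d+>)?(?:\[\s*\d+\.\d+\]\s*)?")

# Assumed message shape: "comm(pid): READ|WRITE|dirtied ... on device".
EVENT_RE = re.compile(
    r"^(?P<comm>.+?)\((?P<pid>\d+)\): "
    r"(?P<op>READ|WRITE|dirtied) .* on (?P<dev>\S+)$"
)


def tally(lines):
    """Count READ, WRITE and dirtied events per process name."""
    counts = defaultdict(lambda: {"READ": 0, "WRITE": 0, "dirtied": 0})
    for line in lines:
        text = PREFIX_RE.sub("", line.strip(), count=1)
        match = EVENT_RE.match(text)
        if match:  # anything that is not a block_dump message is skipped
            counts[match["comm"]][match["op"]] += 1
    return counts


def format_table(counts):
    """Render the totals as a plain-text table, busiest process first."""
    rows = sorted(counts.items(), key=lambda kv: -sum(kv[1].values()))
    out = [f"{'PROCESS':<20}{'READ':>8}{'WRITE':>8}{'DIRTIED':>9}"]
    for comm, c in rows:
        out.append(
            f"{comm:<20}{c['READ']:>8}{c['WRITE']:>8}{c['dirtied']:>9}"
        )
    return "\n".join(out)


if __name__ == "__main__":
    print(format_table(tally(sys.stdin)))
```

With block_dump enabled (echo 1 > /proc/sys/vm/block_dump, as root), the sketch could be fed from the kernel log, for example dmesg | python3 iotally.py, where iotally.py is a hypothetical file name. Note that it counts logged events rather than bytes transferred, so it only gives a relative picture of which processes are busiest.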

EveKnows 1.1

September 9th, 2007

Today EveKnows was upgraded to version 1.1. Changes include:

  • Pop-up menus to allow more data to be displayed for each search result
  • Find Similar Sets command, which executes a search for similar galleries
  • View More from this Site command, which displays every gallery in the EveKnows database from the same site
  • Miscellaneous user interface fixes

Indexing Progress

September 7th, 2007

The EveKnows.com database is now approaching 600,000 unique porn galleries. The vast majority of these have been pulled by our own web spider, Caroline. I’d like to invite every TGP/MGP owner to submit their own sites for indexing. Within one week your galleries will begin to show up in EveKnows.com’s search results, resulting in more traffic for your site. Caroline will also regularly re-index your site to pick up any new galleries you’ve posted.