Profiling and Debugging Linux Disk Access

September 27th, 2007

EveKnows.com is 100% Linux powered. The free (as in speech!) system has proved to be absolutely perfect for our needs. It’s fast, stable, and customizable: exactly what you look for in a platform for running fresh, cutting-edge applications such as EveKnows.

One of the harder tasks I’ve had is tuning disk access. The search engine currently runs on a Debian 4.0 system with SATA hard drives. The UNIX utility top reports 10-20% IO wait (the wa figure, a good indicator of time spent waiting on disk access) almost all the time. When I turn on the Caroline search spider, that figure spikes to 50%. At the moment this isn’t a big deal, but as the site’s popularity continues to grow, it will eventually become a bottleneck and severely limit performance.
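
For anyone who wants that number without watching top, the IO wait percentage can be derived from the aggregate cpu line of /proc/stat by sampling it twice. The following is a minimal Python sketch, not something the site actually runs; the function names `parse_cpu_line`, `iowait_percent`, and `sample_iowait` are made up for illustration, and it assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_cpu_line(line):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    ticks = [int(value) for value in fields[1:]]
    # Assumed order: user nice system idle iowait irq softirq steal.
    # Later fields (guest time) are already counted in user, so skip them.
    return ticks[4], sum(ticks[:8])


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    io_before, total_before = parse_cpu_line(before)
    io_after, total_after = parse_cpu_line(after)
    elapsed = total_after - total_before
    if elapsed <= 0:
        return 0.0
    return 100.0 * (io_after - io_before) / elapsed


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read the cpu line twice, `interval` seconds apart (Linux only)."""
    def read_cpu_line():
        with open(path) as stat_file:
            return stat_file.readline()

    before = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(before, read_cpu_line())
```

Calling `sample_iowait(5)` on the server would give the average IO wait over five seconds, which is roughly what top summarizes on its CPU line.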

Thus, I’ve been trying to learn about profiling disk access on Linux systems. Maybe I’ve just been looking in the wrong places, but I haven’t been able to find any tools that show which applications are causing the heavy IO load. Some digging revealed that the kernel logs individual IO requests, readable with dmesg, when /proc/sys/vm/block_dump is set to 1, but the raw output is one log line per request and essentially useless without aggregation. To that end, I wrote a small Perl script which totals all of the IO statistics and displays a pretty table of results. If anyone is interested in using it themselves, the code is below.

Update: HTML tends to screw up Perl code, so copying and pasting the code below probably won’t work; if you just want to download the script for your own use, you can find it here.

#!/usr/bin/perl
#
# Copyright 2007 Aidan Trent
# Released under the terms of the GNU GPL

# Usage: SCRIPT_NAME 
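
The listing above stops after its header comments, so what follows is not the original Perl script. It is a minimal Python sketch of the approach the post describes: read the block_dump lines from the kernel log and total reads and writes per process. The assumed line shape (`comm(pid): READ block N on dev`, optionally preceded by a timestamp), the regular expression, and the names `tally` and `format_table` are illustrative assumptions and may need adjusting for a given kernel.

```python
import re
from collections import Counter

# Assumed line shape: "kjournald(1234): WRITE block 100 on sda1",
# optionally preceded by a "[  12.345678] " timestamp.
LINE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"(?P<comm>.+?)\((?P<pid>\d+)\): "
    r"(?P<op>READ|WRITE) block \d+ on (?P<dev>\S+)"
)


def tally(lines):
    """Count READ and WRITE requests per process name.

    Lines that do not look like block IO (for example the
    "dirtied inode" messages) are skipped.
    """
    totals = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            totals[(match.group("comm"), match.group("op"))] += 1
    return totals


def format_table(totals):
    """Render the totals as a plain-text table, busiest process first."""
    per_process = {}
    for (comm, op), count in totals.items():
        per_process.setdefault(comm, {"READ": 0, "WRITE": 0})[op] = count
    rows = sorted(
        per_process.items(),
        key=lambda item: -(item[1]["READ"] + item[1]["WRITE"]),
    )
    out = [f"{'PROCESS':<20}{'READS':>8}{'WRITES':>8}"]
    for comm, counts in rows:
        out.append(f"{comm:<20}{counts['READ']:>8}{counts['WRITE']:>8}")
    return "\n".join(out)
```

To try it, enable logging as root by writing 1 to /proc/sys/vm/block_dump, let the system run for a while, then pipe dmesg into a script that calls `print(format_table(tally(sys.stdin)))`. Counting requests rather than bytes is a simplification; it is enough to see which processes dominate the log.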

Hardware

July 9th, 2007

One of the most frequent questions I’m asked is, “What sort of servers are required to run a porn search engine?” Everyone seems a bit surprised by my answer: thus far, nothing special. From EveKnows.com’s testing launch in March until early July, in fact, the site ran on a single Athlon XP 2200 machine with 1GB of RAM and a standard IDE hard disk. Nothing special indeed. That configuration hasn’t had any trouble handling 60,000 page views daily. Of course, I’m running EveKnows on a highly tuned Linux server with custom-written software, and everything has been optimized to make the most of the available resources. The point still stands: it doesn’t take much hardware to handle the current traffic load.

Last week I finally upgraded to a dual-core Athlon 64 X2 4200 with a SATA hard drive. The new site design will go live sometime this week, marking the release of EveKnows.com 1.0. With luck, I’ll get a little bit of publicity and increase the daily traffic. The extra power is insurance against traffic spikes; I’d hate to get an influx of new visitors and watch the server melt under the load. At the moment, though, the load average is sitting between 0.03 and 0.08. At least I can recompile kernels faster than ever before ;)

Also, I’ve added a Hardware category to this blog. I’ll keep everyone up to date on how the new server performs and how things scale as the site increases in popularity.