This is an update to earlier post on Performance and MySQL Indexes.
While 10-12 seconds per feed is an improvement, it’s completely unacceptable when we’re talking 2,000+ feeds (full parse would take 5.5 hours) and slow down the system to unusable.
In an attempt to find the bottlenecks, I loaded up the query_analyzer plugin on development and opened up a terminal window to watch the processes in production via prstat -Z
1.
Both showed me the database – not the Rails app itself – was the slow one2.
I was able to get the average per feed parse time under 5 seconds by doing two things:
- re-writing the straight
find_by_sql
queries to useJOIN ... ON ...
rather than the clumsy...WHERE a.id = b.a_id AND b.id = c.b_id AND etc...
. Why not use Rails’find
helpers? I wasn’t able to track down how to:include
multiple tables/models - Compare the feed’s
last-modified
date against thelast-checked
date in the database prior to running it through the parser.
Both of them are ‘duuuuh’ improvements, easily cutting the CPU burden of my database process in a third, often down to barely noticeable.
UPDATE 2 Dec 2007
Continued refactoring now has the average feed parse time around 3 seconds. Yes, that means in some instances I’m seeing 5+ feeds / second.
1. This monitoring process is now my biggest burden. 🙂
2. I’m sure any reasonably experienced Rails developer says this throughout the day.
I’d offer to track down my book on MySQL optimization, but I don’t know how relevant it would be anymore. Good book, though.