This is an update to earlier post on Performance and MySQL Indexes.
While 10-12 seconds per feed is an improvement, it’s completely unacceptable when we’re talking 2,000+ feeds (full parse would take 5.5 hours) and slow down the system to unusable.
In an attempt to find the bottlenecks, I loaded up the query_analyzer plugin on development and opened up a terminal window to watch the processes in production via
Both showed me the database – not the Rails app itself – was the slow one2.
I was able to get the average per feed parse time under 5 seconds by doing two things:
- re-writing the straight
find_by_sqlqueries to use
JOIN ... ON ...rather than the clumsy
...WHERE a.id = b.a_id AND b.id = c.b_id AND etc.... Why not use Rails’
findhelpers? I wasn’t able to track down how to
- Compare the feed’s
last-modifieddate against the
last-checkeddate in the database prior to running it through the parser.
Both of them are ‘duuuuh’ improvements, easily cutting the CPU burden of my database process in a third, often down to barely noticeable.
UPDATE 2 Dec 2007
Continued refactoring now has the average feed parse time around 3 seconds. Yes, that means in some instances I’m seeing 5+ feeds / second.
1. This monitoring process is now my biggest burden. 🙂
2. I’m sure any reasonably experienced Rails developer says this throughout the day.