How to Run a Rake Task via Cron Under RVM

Cron needs to be told explicitly which RVM Ruby to use. The easiest way to do that is to run the command through a login bash shell, so all the prerequisites get loaded just as they would in an interactive session.

Save your afternoon and reuse the following in your crontab:

/bin/bash -l -c 'cd PATH/TO/RAILS/APP && $HOME/.rvm/gems/THE_RVM_RUBY_TO_USE/bin/rake RAILS_ENV=production THE_TASK:TO_RUN --trace'

For me, it looks like this:

*/5 * * * * /bin/bash -l -c 'cd /srv/www/app/ && /home/garrick/.rvm/gems/ruby-1.8.7-p370@global/bin/rake RAILS_ENV=production cron:parse --trace'
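
If your RVM version supports it, rvm ... do can stand in for the full path to the gemset's rake binary. I haven't tested this variation, so treat it as a sketch rather than a drop-in replacement:

*/5 * * * * /bin/bash -l -c 'cd /srv/www/app/ && rvm ruby-1.8.7-p370@global do rake RAILS_ENV=production cron:parse --trace'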

Huge thanks to Edgar Gonzalez, Hector Castro, and Murtada Shah.

Rails’ Page Caching with Multiple Formats in respond_to

One of the apps I’m working on was using Rails’ caches_page to speed up performance and reduce server load.

It worked great until we added new formats within the respond_to block. If a format wasn’t specified in the request, the caches would get all messed up. Well, sometimes. Which made it a difficult issue to isolate.

I found lots of articles on caching 1 format and not another – even Rails’ documentation illustrates how to do that.

But I wanted to cache all the formats.

The most reliable way I’ve been able to accomplish this is by explicitly declaring the format-specific location for each cache. Like this:


respond_to do |format|
  format.html {
    render :layout => 'application', :content_type => "text/html"
    cache_page(@response, "/index.html")
  }

  format.rss {
    render :layout => false, :content_type => "application/rss+xml"
    cache_page(@response, "/index.rss")
  }

  format.iphone {
    render :layout => 'iphone', :content_type => "text/html"
    cache_page(@response, "/index.iphone")
  }
end

Note the render lines; they seem to be required for @response to populate properly.

This takes care of correctly creating the format-specific page caches, but there’s still the issue of the web server (Nginx, Apache) knowing which cached file to serve.

Whatever checks Rails is doing to pick the correct cache (e.g. is the request coming from an iPhone?) need to be duplicated in the web server and mapped to the corresponding cache file.
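
For Nginx, that duplication might look something like the sketch below. The cache paths, the User-Agent pattern, and the @rails named location (assumed to proxy on to the Rails backend) are my assumptions based on the example above, not a config I’ve tested:

location = / {
  # assumes Nginx's root points at the Rails app's public/ directory,
  # where caches_page writes index.html, index.rss and index.iphone
  set $cache_ext "html";

  if ($http_user_agent ~* "iPhone|iPod") {
    set $cache_ext "iphone";
  }

  # serve the format-specific page cache if it exists, otherwise hit Rails
  try_files /index.$cache_ext @rails;
}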

Personally, this seems like an overly-complex solution for an issue I hadn’t assumed would exist. What am I missing?

Subdomain_fu and Destroying Authlogic Sessions with link_to_remote

I’ve been fighting with a bug where a person wasn’t being properly signed out from a subdomain (using a combination of Authlogic and Subdomain_fu).

The sign out link looks like this:

<%= link_to_remote "Sign Out", :url => {:controller => 'person_sessions', :action => 'destroy', :subdomain => false}, :method => :delete -%>

While the session was destroyed, the AJAX updates confirming it were not being made. Turns out, the cross-subdomain AJAX request was the culprit. To fix it, remove the :subdomain declaration:

<%= link_to_remote "Sign Out", :url => {:controller => 'person_sessions', :action => 'destroy'}, :method => :delete -%>

As you were.

auth via params API Access with Authlogic

Authlogic, my current favorite Ruby-based authentication library, and I were in a fight the last couple of days.

I was trying to add token-based (auth_via_params) authentication (vs. login and password) to a project – but Authlogic and I weren’t agreeing on how it should be done.

I had assumed:

@person_session = PersonSession.new(:single_access_token => params[:token])
@person_session.save

Instead Authlogic wanted me to give it a Person first.

@person_session = PersonSession.new(Person.find_by_single_access_token(params[:token]))
@person_session.save
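
Here’s roughly where that ends up in an API controller. This is just a sketch – the controller name, filter name, and params[:token] key are my choices, not anything Authlogic dictates:

class ApiController < ApplicationController
  before_filter :authenticate_via_token

  private

  # Look the person up by their single access token, then hand them to Authlogic.
  def authenticate_via_token
    person = Person.find_by_single_access_token(params[:token])
    @person_session = PersonSession.new(person) if person
    head :unauthorized unless @person_session && @person_session.save
  end
end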

Rails Cookie Settings for Cross-Subdomain Sessions

For the past day, I’ve been tracking down a hair-pulling-ly frustrating bug in Rails (with Authlogic on Passenger).

My sessions weren’t sticking in production.

Cross-domain or otherwise – doubly frustrating because a) Authlogic has been so rock solid for me otherwise, and b) everything worked as expected in development.

Turns out, I wasn’t setting the session domain correctly in environments/production.rb.

config.action_controller.session[:domain] = '.YOURDOMAIN.COM'

Note the dot (it’s there for subdomains). Oh, and be sure to correctly spell your domain name…or sessions won’t work at all. 😉
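
For reference, the whole session block in environments/production.rb ends up looking something like this. The key and secret are placeholders, and the exact option names vary a bit between Rails 2.x versions:

config.action_controller.session = {
  :key    => '_myapp_session',        # placeholder cookie name
  :secret => 'a-long-random-secret',  # placeholder; generate your own
  :domain => '.yourdomain.com'        # leading dot shares the cookie across subdomains
}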

RealTimeAds.com Launches at MinnPost.com

MinnPost RealTimeAds

I’m pleased to announce the launch of RealTimeAds.com – an advertising product now in beta testing at MinnPost.com.

Karl and I have been building and testing the system for a couple of months now and I’m quite happy with it on three fronts:

  • It feels like it makes advertising approachable to people and organizations that haven’t considered it within reach before – especially extremely small and locally-focused ones.
  • It re-frames publications that already exist (Twitter feeds, blog feeds, etc) as text advertisements, cuz, you know, that’s what they are anyway.
  • It extends the real-time nature of Twitter outside of the Twitter silo, helping those people and organizations to get more mileage out of their tweets.

“Real-Time Ads runs on RSS. So, you use what you are already using – Twitter, Blogger, Tumblr, proprietary CMS, whatevs!” – Karl Pearson-Cater

Interested in trying it out? Give MinnPost a call: 612 455 6953.

Yes, the RealTimeAds.com system uses a version of Cullect’s engine tuned for ad serving (versus feed reading).

For those of you following along, RealTimeAds.com is Secret Project 09Q02A.

UPDATE:
Here’s the official RealTimeAds announcement from MinnPost’s Joel Kramer

“Imagine a restaurant that can post its daily lunch special in the morning and then its dinner special in the afternoon. Or a sports team that can keep you up-to-date on its games and other team news. Or a store that could offer a coupon good only for today. Or a performance venue that can let you know whether tickets are available for tonight. Or a publisher or blogger who gives you his or her latest headline. ” – Joel Kramer, MinnPost

UPDATE 2: More from Joel Kramer, this time talking to the Nieman Foundation for Journalism at Harvard.

“We do believe Real-Time Ads will prove more valuable for advertising at a lower entry point.”

How To Cache Highly Dynamic Data in Rails with Memcache – Part 3

In part 1 and part 2, I laid out my initial approaches to caching and performance in Cullect.

While both of them pointed in the right direction, I realized I was caching the wrong stuff in the wrong way.

Since then, I tried replicating the database – one db for writes, one for reads. Unfortunately, they got terribly out of sync, making things worse. So I turned off the 2nd database and replaced it with another pack of mongrels (a much more satisfying use of the server anyway).

Cullect has 2 very database-intensive processes: grabbing the items within a given reading list (of which calculating ‘importance’ is the most intensive part) and parsing the feeds.

Both cause problems, for opposite reasons – the former is read- and sort-intensive while the latter is write-intensive. Both can grind the website itself to a halt.

Over the last couple weeks, I moved all the intensive tasks to a queue processed by Delayed_Job, and I’m caching the reading lists’ items in a database table – rather than in memcache.

Yes, this means every request is pulled from the cache, and an ‘update reading list’ job is put into the queue.

So far, this ‘stale while update’ approach is working far better than the previous approaches.
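
As a rough sketch of that flow – the model and method names here are illustrative, not Cullect’s actual code:

class ReadingListsController < ApplicationController
  def show
    @reading_list = ReadingList.find(params[:id])

    # Serve whatever is already in the cached items table, even if it's stale...
    @items = @reading_list.cached_items

    # ...and let Delayed_Job rebuild the cache in the background.
    @reading_list.send_later(:refresh_cached_items)
  end
end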

BTW, as this practice confirms, the easiest way to increase the performance of your Ruby-based app is to give your database more resources.

How To Cache Highly Dynamic Data in Rails with Memcache – Part 2

In part 1, I laid out my initial approach to caching in Cullect.

It had some obvious deficiencies:

  1. This approach really only sped up the latest 20 (or so) items. Fine if those items don’t change frequently (i.e. /important vs /latest) or you only want the first 20 items (not the second 20).
  2. The hardest, most expensive database queries weren’t being cached effectively. (um, yes that’s kinda the purpose).

I just launched a second approach. It dramatically simplified the cache key (6 attributes, down from 10), and rather than caching the items themselves under that key, I just stored pointer objects to them.

Unfortunately, even the collection of pointer objects was too big to store in the cache, so I tried a combination of putting a LIMIT on the database query and storing 20 items at a time in a different cache object.
This second approach had the additional problem of continually presenting hidden/removed items (there are 2 layers of caching that need to be updated).

Neither was a satisfactory performance improvement.

I’ve just launched a solution I’m pretty happy with, and it seems to be working (the cache store is updating as I write this).

Each Cullect.com reading list has 4 primary caches – important, latest, recommended, hidden – with variants for filters and keyword searches. Each of these primary caches is a string containing the IDs of all the items in that view. Not the items themselves, or any derivative objects – both of those take up too much space.

When items are hidden, they’re removed from the appropriate cache strings as well.
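
A simplified sketch of that pattern – the cache key naming, the Rails.cache calls, and the item_ids_for helper are illustrative, not Cullect’s actual code:

# Fetch (or build) the ID string for one of a reading list's primary caches.
def cached_item_ids(reading_list, view)
  key = "reading_list:#{reading_list.id}:#{view}"   # view = 'important', 'latest', etc.
  ids = Rails.cache.fetch(key) { reading_list.item_ids_for(view).join(',') }
  ids.split(',').map { |id| id.to_i }
end

# When an item is hidden, pull its ID out of each visible cache string.
def remove_from_caches(reading_list, item)
  %w(important latest recommended).each do |view|
    key = "reading_list:#{reading_list.id}:#{view}"
    if ids = Rails.cache.read(key)
      Rails.cache.write(key, ids.split(',').reject { |id| id == item.id.to_s }.join(','))
    end
  end
end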

Murphy willing, there won’t be a part 3. 😉

1. Storing the items in the cache as objects