I started building a new project today, one of the 2 initial revenue-generating projects on my 2009 list. While it’s a long way from launching, much of the heavy lifting was completed today. Conceptually, I’ve been using a proof-of-concept of this project for a couple of years now. Oh, and I spent waaaay too long looking for domain names for it. The code name thus far has been ‘Cashboard’ – but since that domain isn’t available, it needs to be changed.

How To Cache Highly Dynamic Data in Rails with Memcache – Part 1

There are a number of ways to increase Ruby on Rails performance through caching. Caching works because things don’t change… or don’t change frequently.

In Cullect, almost everything is dynamic. Even Cullect’s HTML presentation format has 3 different states depending on access privileges, and there are 8 other presentation formats available.

The standard page, action, and fragment caching make less sense when the ‘heaviest’ data are also the most dynamic – the feed items.

For the feed items, I’m building a custom memcached name-value pair: the key name holds 10 attributes describing the request, and the key value holds the items themselves.

From the ‘show’ action in my controller:

# Built on a single line: memcached keys can't contain whitespace or newlines
key = "#{item_count}:#{attribute_1}:#{attribute_2}:#{attribute_3}:#{attribute_4}:#{etc...}"

Notice the first attribute in the key is the total number of items within a specific Cullect Reading List – which will change when a feed updates or an item is hidden – automatically expiring stale caches.

Then, I check the cache for the key and pull the items from the cache if it exists.

# Hit the cache once and reuse the result
if (cached = Cache.get(key))
@allitems = cached

If there’s no key in the cache, I do the query and put the retrieved items into the cache.

else
@allitems = get_items(attribute_1, attribute_2, attribute_3, attribute_4, etc)
Cache.put key, @allitems
end
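Pulled together, the pattern above can be sketched as a small helper. This is only an illustration under a few assumptions: `MemoryCache` is a hypothetical in-memory stand-in for the memcached client (it only mimics the `get`/`put` calls used above), and `cache_key` and `fetch_items` are names made up for this sketch, not part of Cullect.

```ruby
# Hypothetical in-memory stand-in for a memcached client.
# It only mimics the get/put calls used in the post.
class MemoryCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def put(key, value)
    @store[key] = value
  end
end

# Build the key from the item count plus the request attributes.
# When the count changes, the key changes, so stale entries are never read.
def cache_key(item_count, *attributes)
  ([item_count] + attributes).join(':')
end

# Return the cached items, or run the block (the expensive query) and cache the result.
def fetch_items(cache, key)
  cached = cache.get(key)
  return cached unless cached.nil?

  items = yield
  cache.put(key, items)
  items
end

cache = MemoryCache.new
key   = cache_key(3, 'html', 'public')
items = fetch_items(cache, key) { %w[item_a item_b item_c] }
```

A second call with the same key returns the stored items without running the block. A call after the item count changes builds a different key and falls through to the query.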

While this speeds up subsequent requests, there’s still the question of speeding up the initial request. I’ll save that for part 2.

Everything Else You Need to Run a Web App

As a reminder on perspective and complexity, here’s the ongoing list of systems I’ve configured for Cullect that aren’t Ruby on Rails:

  • DNS
  • Web Server: robots.txt
  • Web Server: .htaccess
  • Web Server: Virtual Domains
  • Database Server: MySQL
  • Caching Servers: Memcached, distributed across 2 boxes

And this doesn’t even scratch the surface of the systems Joyent’s Jason Hoffman lists out in his Scaling Rails from the Bottom Up presentation.

A little perspective setting is always good.

There’s a network of systems, all interconnected. When one of them isn’t working right, a completely different system could be the culprit. A system you’re not an expert in. A system you need a new hammer for.

How To: Build a Wiki with Ruby on Rails – Part 2

Back in Part 1 of Building a Wiki with Ruby on Rails, we built the core wiki engine.

Time to add some syntax formatting.

These days, I’m pretty enamored of Haml: it’s more like HTML (unlike many other formatting engines) and it’s fast to write (unlike many other formatting engines).

1. Install Haml

Haml installs as a gem [1]…
gem install --no-ri haml
and a plugin…
haml --rails path/to/your/wiki

2. Create Your Haml View

Open up app/views/pages/show.html.erb, rename it to show.html.haml,
and convert it to Haml, replacing its contents with:

.content
  %h1= @page.title.gsub('_', ' ')
  = link_to('Edit', page_edit_path(:title => @page.title))
  = link_to('History', page_history_path(:title => @page.title))
  = "Last Updated #{time_ago_in_words(@page.revisions.last.created_at)} ago by #{Person.find(@page.revisions.last.person_id).name}"
  %p= auto_link(@page.revisions.last.parsed_contents)

Now, your wiki should look exactly like it did before we plugged in Haml.

3. Add Haml Support for Revisions

Open up app/models/revision.rb and update def parsed_contents with:

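def parsed_contents
  # self.contents reads the attribute; a bare `contents` on the right-hand side would be a nil local
  contents = self.contents.gsub(/(\w*_\w*)/) {|match| "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"}
  h = Haml::Engine.new(contents)
  h.to_html
end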

4. Try It Out

Load up a page in your wiki, click ‘edit’, and add in some simple Haml, something like:

%strong this should be bold

5. Add in Some More (non-Haml) Formatting

Once you’re comfortable with Haml, maybe add in some other simple formatting rules into revision.rb

# Auto-renders any image URL
contents = self.contents.gsub(/\s(.*\.(jpg|gif|png))\s/, '<img src="\1"/>')

# Bolds text inside asterisks
contents = contents.gsub(/\*(.*)\*/, '<strong>\1</strong>')

# Wraps <code> around text inside '@'
contents = contents.gsub(/@(.*)@/, '<code>\1</code>')

# Blockquotes text inside double-quotes
contents = contents.gsub(/\s"(.*)"\s/, '<blockquote>"\1"</blockquote>')

# Italicizes text inside underscores
contents = contents.gsub(/\s_(.*)_\s/, '<em>\1</em>')
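To experiment with rules like these outside of Rails, here is a minimal sketch of two of them as a plain Ruby method. The method name `format_wiki_text` is made up for this example, and the patterns use non-greedy `.*?` (a deliberate change from the rules above) so that two bold spans on one line stay separate.

```ruby
# Hypothetical helper: applies two of the simple formatting rules to a string.
# Non-greedy .*? stops at the first closing marker instead of the last one.
def format_wiki_text(text)
  out = text.gsub(/\*(.*?)\*/, '<strong>\1</strong>') # *bold*
  out.gsub(/@(.*?)@/, '<code>\1</code>')              # @code@
end

puts format_wiki_text("this is *bold* and this is @code@")
```

In a single-quoted replacement string, `\1` refers to the first capture group, which is why the markers disappear and only the text between them is wrapped.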

6. Make it Better

After playing around with Haml and other formatting, you’ll quickly see some quirks – some room for improvement. Go for it, and let me know when you get something interesting.

[1] Remember to vendor your gems.

How To: Build a Wiki with Ruby on Rails – Part 1

Here’s how to build a very simple Ruby on Rails-based wiki engine [1].

1. Create the Rails App

rails wiki
and cd wiki to get inside the app.
Remember: create your database (e.g. wiki_development) and update config/database.yml appropriately.

2. Generate the Scaffolding

There are 3 nouns in this wiki system: People, Pages, Revisions. Let’s create the models for each of them.
run script/generate scaffold Person
Now open up wiki/db/migrate/*_create_people.rb and add:
t.column "name", :string inside the create_table block to store the Person’s name.

Add Person.create :name => "First Person"
after the create_table block to give us an initial Person.

run script/generate scaffold Page
Now open up wiki/db/migrate/*_create_pages.rb and add:
t.column "title", :string inside the create_table to store the Page’s title and add
Page.create :title => "Home_Page" after the create_table block to give us an initial Page.

run script/generate scaffold Revision
Now open up wiki/db/migrate/*_create_revisions.rb and add:

# Reference to the Person that created the Revision
t.column "person_id", :integer

# The contents of the Revision
t.column "contents", :text

# Reference to the Page this Revision belongs to
t.column "page_id", :integer

3. Run the database migrations

rake db:migrate

4. Describe the relationships between the Models

People should be able to create many Revisions.
Open /app/models/person.rb and add – after the first line:
has_many :revisions

Pages should have many Revisions
Open /app/models/page.rb and add – after the first line:
has_many :revisions

Revisions should reflect they belong to both Pages and People
Open /app/models/revision.rb and add – after the first line:

belongs_to :page
belongs_to :person

5. Draw up the Routes

Open up config/routes.rb and update them to handle wiki-fied words. Copy the following after the map.resources block

map.root :controller => 'pages', :action => 'show', :id => 1

map.connect 'pages.:format', :controller => 'pages', :action => 'index'
map.connect 'revisions.:format', :controller => 'revisions', :action => 'index'

map.page_base ':title', :controller => 'pages', :action => 'show'
map.connect ':title.:format', :controller => 'pages', :action => 'show'

map.page_history ':title/history', :controller => 'pages', :action => 'history'
map.connect ':title/history.:format', :controller => 'pages', :action => 'history'

map.page_edit ':title/edit', :controller => 'pages', :action => 'edit'

Now run script/server and load up a browser.
If things are working correctly, you should see simply ‘Edit | Back’ as links

6. Create the Views

Open /app/views/revisions/new.html.erb and add

<%= f.text_area :contents %>

<% fields_for :person do |p| %>
  <%= p.text_field :name %>
<% end %>

right after <%= f.error_messages %> for the Revision’s content, the corresponding Page, and the author’s name.
Loading up http://0.0.0.0:3000/revisions/new should give you the form with the text area, a text field, and a submit button you just created.

Try entering something into the form and click ‘Create’. You should see ‘Revision was successfully created.’ Now, if you pop into your database, you should see the new row in the Revisions table with the contents you entered, but NULL values for person_id and page_id.
Let’s remedy that.

Open /app/controllers/revisions_controller.rb and add this line right after the @revision declaration in both def new and def create

@revision.update_attribute('person_id', Person.find_or_create_by_name(params[:person][:name]).id)

Reload http://0.0.0.0:3000/revisions/new and re-fill out the form. After hitting ‘Create’ you should have another Revision record in your database with some digit in the person_id field

Now, let’s tie the Revision form we created to a Page.
Open up /app/views/revisions/new.html.erb and change the form_for to:
<% form_for @revision, :url => new_revision_path, :html => {:method => :post} do |f| %>

Next, right after <%= f.text_area :contents %> add:
<%= f.hidden_field :page_id, :value => @page.id %>

Now, rename /app/views/revisions/new.html.erb to /app/views/revisions/_new.html.erb. This turns the form into a partial, which is fine, because Revisions only exist within Pages, never on their own.
Next, replace everything in /app/views/pages/edit.html.erb with

<h1>Update "<%= @page.title.gsub('_', ' ') %>"</h1>
<%= render :partial => 'revisions/new' %>

If you load up 0.0.0.0:3000/pages/1/edit right now, you’ll get a ‘Called id for nil’ error. That’s because the Revision partial is referencing @revision, which doesn’t exist in /pages. Yet.
Open up /app/controllers/pages_controller.rb and scroll to ‘def edit’.
Add @revision = @page.revisions.last || Revision.new right after the @page declaration

Refreshing 0.0.0.0:3000/pages/1/edit should now correctly render the form.
Fill it out, hit ‘Create’ and check the database. You should have a new record with a number in the page_id column.

But we can’t see it because 0.0.0.0:3000/pages/1 doesn’t show anything.

Open up /app/views/pages/show.html.erb and add:

<h2><%= @page.title.gsub('_', ' ') %></h2>
<p><%= @page.revisions.last.contents %></p>
<p>last edited by <%= @page.revisions.last.person.name %> <%= time_ago_in_words(@page.revisions.last.created_at) %> ago</p>

to the first line and save.

Refreshing 0.0.0.0:3000/pages/1 should show the text you entered.

7. Update the Model, Controller, Views, and Routes to Acknowledge Wiki-fied Words (separated with an underscore ‘_’)

Open up /app/models/revision.rb and add:

def parsed_contents
  contents = self.contents.gsub(/(\w*_\w*)/) {|match| "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"}
end
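To see what that substitution does without a database, here is a rough standalone sketch that uses the same pattern on a plain string. The method name `link_wiki_words` is made up for this example.

```ruby
# Hypothetical standalone version of parsed_contents:
# turns Wiki_Words (words joined by underscores) into links.
def link_wiki_words(text)
  text.gsub(/(\w*_\w*)/) do |match|
    "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"
  end
end

puts link_wiki_words("See Home_Page now")
```

The link target is the lowercased word, and the link text swaps underscores for spaces, so Home_Page links to home_page and reads "Home Page".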

Open up /app/controllers/pages_controller.rb, find def show and change:
@page = Page.find(params[:id])
to

@page = Page.find_or_create_by_title(params[:title])
return redirect_to page_edit_url(:title => @page.title) if @page.revisions.empty?

Open up /app/controllers/revisions_controller.rb, find def create and change each
format.html... line to:
format.html { redirect_to page_base_url(:title => @revision.page.title) }

Next, back in /app/controllers/pages_controller.rb, scroll down to def edit and change:
@page = Page.find(params[:id])
to
@page = Page.find_by_title(params[:title])

Now, in /app/views/pages/show.html.erb change the Edit link to build the new route:
<%= link_to 'Edit', page_edit_url(:title => @page.title) %> |

Lastly, in config/routes.rb, change
map.root :controller => 'pages', :action => 'show', :id => 1
to
map.root :controller => 'pages', :action => 'show', :title => 'Home_Page'

Restart your script/server and you should have a very basic wiki.

In Part 2, we’ll add formatting.

Want to see it in action? Play around with Wiki.Cullect.com

[1] There are leaner Ruby frameworks to build wikis in (Camping and Merb come to mind). I picked Rails for 2 reasons: my servers and deployment process were already set up for Rails, and I wanted to loosely integrate this wiki into another existing Rails-based application (Cullect.com).

Workaround for IE Overly Accepting in Rails’ respond_to format

Looks like Microsoft’s Internet Explorer will accept any format a web server is willing to give it.

This doesn’t play nicely with the respond_to feature in Rails 2.0+, a slick little bit of code that asks the browser what it wants and replies accordingly.

Here’s a conversation between Rails & Firefox

Firefox: “Hey Rails, I want this url”
Rails: “No problem, which format would you like it in?”
Firefox: “HTML, please.”
Rails: “Here you go.”

Here’s the same conversation with Internet Explorer

IE: “Hey Rails, I want this url”
Rails: “No problem, which format would you like it in?”
IE: “Whatcha got?”
Rails: “I’ve got Atom, and…”
IE: (interrupting) “OK THANKS!”
Rails: “…um, what? I wasn’t finished, really? ok, here you go.”

I had ordered my code alphabetically, so ‘atom‘ came before ‘html‘, like this:

respond_to do |format|
  format.atom
  format.html
end

Because IE is so, um, accepting, I’ve needed to put ‘html‘ first:

respond_to do |format|
  format.html
  format.atom
end
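Here’s a toy model of why the order matters. It is a deliberately simplified sketch, not Rails’ actual negotiation code: the `MIME_TYPES` table and the `pick_format` method are made up for this example, and it assumes the browser’s Accept header boils down to the wildcard `*/*`.

```ruby
# Hypothetical lookup table for the two formats in this post.
MIME_TYPES = { html: 'text/html', atom: 'application/atom+xml' }

# Simplified model: walk the Accept header in order. A wildcard takes
# whatever format was declared first; otherwise take the first declared
# format whose MIME type matches.
def pick_format(accept_header, declared_formats)
  accepted = accept_header.split(',').map { |part| part.split(';').first.strip }
  accepted.each do |type|
    return declared_formats.first if type == '*/*'
    match = declared_formats.find { |format| MIME_TYPES[format] == type }
    return match if match
  end
  nil
end

puts pick_format('*/*', [:atom, :html])                            # wildcard, atom declared first
puts pick_format('*/*', [:html, :atom])                            # wildcard, html declared first
puts pick_format('text/html,application/xhtml+xml', [:atom, :html]) # explicit request for HTML
```

With a wildcard, the first declared format wins, so declaring html first gives the accepting browser the page it actually wanted. A browser that asks for text/html explicitly gets HTML either way.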


Ruby on Rails Snippet for Changing Relative Paths to Absolute

If you have a bunch of text containing relative path hyperlinks, and you’d like to change them to absolute paths, you might find this snippet helpful.


content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
link = "http://somedomain.com/"
content.gsub(/=('|")\//, '=\1*/').gsub(/\*\//, link.match(/(http|https):\/\/[\w.]+\//)[0])

The asterisk ‘*’ is a hacky placeholder for the actual link swapped in with the second gsub. If you know of a way to pass in the link without needing the placeholder, awesome – paste it in the comments. Thanks.
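One possible way around the placeholder is the block form of gsub, which can read the captured quote character and build the replacement in a single pass. This is a sketch rather than a drop-in fix: the method name `absolutize_links` is made up, and it assumes `link` starts with a scheme and host followed by a slash.

```ruby
# Hypothetical helper: rewrites href='/...' and href="/..." to absolute URLs.
def absolutize_links(content, link)
  # Keep only the scheme and host, e.g. "http://somedomain.com/"
  base = link[%r{\Ahttps?://[\w.]+/}]
  # $1 is the quote character captured by the pattern
  content.gsub(%r{=('|")/}) { "=#{$1}#{base}" }
end

content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
puts absolutize_links(content, "http://somedomain.com/")
```

Because the block runs once per match, the base URL is interpolated directly and no second gsub is needed.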

Parsing Arbitrary XML Namespaces in Ruby with Hpricot

(This post inspired by a Ruby.MN conversation)

I’ve burned through at least 3 different XML parsing libraries in building Cullect. I started with the built-in REXML (pro: built-in, con: heavy and slow), then moved to FeedTools (pro: easily parses the most common feed formats, con: no longer in active development). I have a lot of love for FeedTools, but in the end, it didn’t want to parse XML it didn’t already know about. Then, after a series of far less memorable libraries, I found _why’s Hpricot (pro: written by _why, con: written by _why).

Hpricot is fast and lean, and it doesn’t care about expected tags, namespaces, or valid XML; it just makes it easy to get the data and attributes out of the XML.

Here’s an example of how I’m grabbing a feed item’s permalink using Hpricot

doc = Hpricot.XML(feed_contents)
items = (doc/:item)
items.each do |raw_item|
  link = raw_item.%('pheedo:origLink') || raw_item.%('feedburner:origLink') || raw_item.%('link')
end

Notice how Hpricot doesn’t require anything special to grab pheedo namespace links or feedburner namespace links compared with standard links. Just tell it which tag you’re looking for. Fast, easy, scalable.

Optimization Tips: Ruby on Rails and MySQL

Stop.

I’ve uncovered these tips after (at least) the 3rd refactoring effort of some fairly simple, straight-forward Rails code. Rails is great for getting ideas prototyped super fast. These tips will slow down development and make apps less portable. Continue reading only if you’re running a live app in production and not happy with how resource-intensive it is.

My approach to this round of optimization was to watch the Load calculations in the development log and optimize transactions with a Load greater than 0.0009.
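If eyeballing the log gets tedious, a few lines of Ruby can do the filtering. This is a rough sketch under an assumption about the log format: it expects lines shaped like `Person Load (0.000312)   SELECT ...`, with the time in seconds inside the parentheses. The method name `slow_loads` is made up for this example, and the pattern will need adjusting if your log lines look different.

```ruby
# Hypothetical helper: pulls "Model Load (seconds)  SQL" entries out of log
# lines and keeps the ones slower than the threshold.
def slow_loads(log_lines, threshold = 0.0009)
  entries = log_lines.map do |line|
    m = line.match(/(\w+ Load) \((\d+\.\d+)\)\s+(.*)/)
    m && [m[1], m[2].to_f, m[3]]
  end
  entries.compact.select { |_, seconds, _| seconds > threshold }
end

sample = [
  "  Person Load (0.000312)   SELECT * FROM people",
  "  Item Load (0.004100)   SELECT * FROM items"
]
slow_loads(sample).each { |name, seconds, sql| puts "#{name} #{seconds} #{sql}" }
```

Lines that don’t match the pattern are skipped, so the whole development log can be fed in as-is, for example with File.readlines.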

  • First, a rule of thumb: Development boxes are faster than production boxes. If it’s acceptably fast locally, then it’ll probably be a turtle in production.
  • Use Model.find_by_sql or Model.count_by_sql whenever possible.
    Slow: Person.find_by_name('JoeyJoeJoe')
    Fast: Person.find_by_sql("SELECT people.* FROM people WHERE people.name = 'JoeyJoeJoe'")
  • Don’t put keys, IDs, or other numbers within quotes in your Finds
    Slow: WHERE id = "1234"
    Fast: WHERE id = 1234
  • Only request the specific database column/model attribute you want
    Slow: Person.find_by_name('JoeyJoeJoe').height
    Fast: Person.find_by_sql("SELECT people.height FROM people WHERE people.name = 'JoeyJoeJoe'"). Put indexes on all these columns/attributes.
  • Use connection.insert, connection.update, connection.delete for database transactions performed on an array of models or transactions that don’t need the overhead of a model.
  • Use explicit JOINs instead of joining tables in the WHERE clause
    Slow: ...WHERE table_1.id = table_2.table1_id...
    Fast: ...table_1 JOIN table_2 ON table_1.id = table_2.table1_id...
  • Many tiny database transactions are faster than 1 big one
    Slow: stuff = Stuff.find_by_sql("SELECT everything.* FROM everything JOIN (box_1, box_2) ON (everything.id = box_1.everything_id AND box_2.box1_id = box_1.id)")
    Fast: boxes = Boxes.find_by_sql("SELECT box_1.everything_id FROM box_1 JOIN box_2 ON box_2.box1_id = box_1.id")
    followed by
    boxes.each do |box_1|
      stuff = Stuff.find_by_sql("SELECT everything.* FROM everything WHERE id = #{box_1.everything_id}")
    end
  • If you’re running the InnoDB datastore (vs MyISAM) try cranking up your innodb_buffer_pool_size. I say start by doubling it. Seriously.
  • Again, if you’re running InnoDB and doing anything more involved than the simplest ORDER BY, try cranking up your sort_buffer_size and read_rnd_buffer_size to something in the double-digit M range.
  • Also, if you’re comparing datastore engines, MyISAM has full text search, InnoDB doesn’t. So you’ll need a clever workaround. There are a number of them.