9 Things Cullect Taught Me About Software

  • Forcing people to create an account to use your software is a bug.
  • If you’re not scared to deploy, you’ve stopped caring.
  • Murphy is alive and well.
  • Google and a bookshelf of technical books can be equally useless.
  • Good software is like an iceberg.
  • If you ask for money, people will give it to you.
  • Software is as hard as you make it. Don’t.
  • Most features can be removed and no one will notice.
  • A great project will eat everything in its path.

How To Cache Highly Dynamic Data in Rails with Memcache – Part 1

There are a number of ways to increase Ruby on Rails performance through caching. Caching works because things don’t change, or at least don’t change frequently.

In Cullect, almost everything is dynamic. Even Cullect’s HTML presentation format has 3 different states depending on access privileges, and there are 8 other presentation formats available.

The standard page, action, and fragment caching make less sense when the ‘heaviest’ data are also the most dynamic – the feed items.

For the feed items, I’m building a custom memcached name-value pair: the key name holds 10 attributes describing the request, and the key value holds the items themselves.

From the ‘show’ action in my controller:

key = "
#{item_count}:
#{attribute_1}:
#{attribute_2}:
#{attribute_3}:
#{attribute_4}:
#{etc...}"

Notice the first attribute in the key is the total number of items within a specific Cullect Reading List – which will change when a feed updates or an item is hidden – automatically expiring stale caches.

Then, I check the cache for the key and pull the items from the cache if it exists.

if Cache.get(key)
@allitems = Cache.get(key)

If there’s no key in the cache, I do the query and put the retrieved items into the cache.

else
@allitems = get_items(attribute_1, attribute_2, attribute_3, attribute_4, etc)
Cache.put key, @allitems
end
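Put together, the pattern is a plain read-through cache. Here is a minimal, self-contained sketch of it. `MemoryCache`, `cache_key`, and `fetch_items` are hypothetical stand-ins (a Hash-backed substitute for the memcached client, and a block in place of `get_items`), not Cullect's actual code.

```ruby
# Hash-backed stand-in for a memcached client (assumption: the real
# client exposes similar get/put calls, as in the snippets above).
class MemoryCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def put(key, value)
    @store[key] = value
  end
end

# Build the key from the item count plus the request attributes.
# Because the count comes first, any change in the number of items
# yields a new key, so stale entries are simply never read again.
def cache_key(item_count, *attributes)
  ([item_count] + attributes).join(':')
end

# Read-through lookup: return the cached value, or run the block
# (the expensive query) once and store its result.
def fetch_items(cache, key)
  cached = cache.get(key)
  return cached unless cached.nil?

  items = yield
  cache.put(key, items)
  items
end
```

Compared with calling `Cache.get(key)` once in the condition and again in the assignment, this version makes a single cache round trip on a hit.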

While this speeds up subsequent requests, there’s still the question of speeding up the initial request. I’ll save that for part 2.

How To: Build a Wiki with Ruby on Rails – Part 2

Back in Part 1 of Building a Wiki with Ruby on Rails, we built the core wiki engine.

Time to add some syntax formatting.

These days, I’m pretty enamored of Haml. It’s more like HTML (unlike many other formatting engines) and it’s fast to write (unlike many other formatting engines).

1. Install Haml

Haml installs as a gem…¹
gem install --no-ri haml
and a plugin…
haml --rails path/to/your/wiki

2. Create Your Haml View

Open up app/views/show.html.erb
and convert it to Haml, replacing its contents with:

.content
  %h1= @page.title.gsub('_', ' ')
  = link_to('Edit', page_edit_path(:title => @page.title))
  = link_to('History', page_history_path(:title => @page.title))
  = "Last Updated #{time_ago_in_words(@page.revisions.last.created_at)} ago by #{Person.find(@page.revisions.last.person_id).name}"
  %p= auto_link(@page.revisions.last.parsed_contents)

Now, your wiki should look exactly like it did before we plugged in Haml.

3. Add Haml Support for Revisions

Open up app/models/revision.rb and update def parsed_contents with:

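def parsed_contents
  # Use self.contents: a bare `contents` on the right-hand side would refer to the new (nil) local variable
  contents = self.contents.gsub(/(\w*_\w*)/) {|match| "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"}
  h = Haml::Engine.new(contents)
  h.to_html
end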

4. Try It Out

Load up a page in your wiki, click ‘edit’, and add in some simple Haml, something like:

%strong this should be bold

5. Add in Some More (non-Haml) Formatting

Once you’re comfortable with Haml, try adding some other simple formatting rules to revision.rb:

# Auto-renders any image URL
contents = self.contents.gsub(/\s(.*\.(jpg|gif|png))\s/, '<img src="\1"/>')

# Bolds text inside asterisks
contents = contents.gsub(/\*(.*)\*/, '<strong>\1</strong>')

# Wraps <code> around text inside '@'
contents = contents.gsub(/@(.*)@/, '<code>\1</code>')

# Blockquotes text inside double-quotes
contents = contents.gsub(/\s"(.*)"\s/, '<blockquote>"\1"</blockquote>')

# Italicizes text inside underscores
contents = contents.gsub(/\s_(.*)_\s/, '<em>\1</em>')
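To see how rules like these interact, here is a small standalone sketch that applies three of them in sequence to a sample string. The `format_inline` helper name is hypothetical. I've also swapped the greedy `(.*)` for the non-greedy `(.*?)` so that two marked-up phrases on the same line are not merged into one, and I keep the spaces around italic text; both changes are my assumptions, not part of the rules above.

```ruby
# Hypothetical helper: applies a few inline formatting rules in order.
def format_inline(text)
  # *bold* becomes <strong>bold</strong>
  text = text.gsub(/\*(.*?)\*/, '<strong>\1</strong>')
  # @code@ becomes <code>code</code>
  text = text.gsub(/@(.*?)@/, '<code>\1</code>')
  # _italic_ (surrounded by whitespace) becomes <em>italic</em>
  text.gsub(/\s_(.*?)_\s/, ' <em>\1</em> ')
end

puts format_inline("Use *bold* and *strong* with @puts@ or _emphasis_ here")
# prints: Use <strong>bold</strong> and <strong>strong</strong> with <code>puts</code> or <em>emphasis</em> here
```

With the greedy version, the first rule would match from the first asterisk to the last one on the line, which is one of the quirks you'll run into quickly.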

6. Make it Better

After playing around with Haml and other formatting, you’ll quickly see some quirks – some room for improvement. Go for it, and let me know when you get something interesting.

1. Remember to vendor your gems.

How To: Build a Wiki with Ruby on Rails – Part 1

Here’s how to build a very simple Ruby on Rails-based wiki engine¹.

1. Create the Rails App

rails wiki
and cd wiki to get inside the app.
Remember: create your database (e.g. wiki_development) and update config/database.yml appropriately.

2. Generate the Scaffolding

There are 3 nouns in this wiki system: People, Pages, Revisions. Let’s create the models for each of them.
run script/generate scaffold Person
Now open up wiki/db/migrate/*_create_people.rb and add:
t.column "name", :string inside the create_table block to store the Person’s name.

Add Person.create :name => "First Person"
after the create_table block to give us an initial Person.

run script/generate scaffold Page
Now open up wiki/db/migrate/*_create_pages.rb and add:
t.column "title", :string inside the create_table to store the Page’s title and add
Page.create :title => "Home_Page" after the create_table block to give us an initial Page.

run script/generate scaffold Revision
Now open up wiki/db/migrate/*_create_revisions.rb and add:

# Reference to the Person that created the Revision
t.column "person_id", :integer

# The contents of the Revision
t.column "contents", :text

# Reference to the Page this Revision belongs to
t.column "page_id", :integer

3. Run the database migrations

rake db:migrate

4. Describe the relationships between the Models

People should be able to create many Revisions.
Open /app/models/person.rb and add – after the first line:
has_many :revisions

Pages should have many Revisions
Open /app/models/page.rb and add – after the first line:
has_many :revisions

Revisions should reflect they belong to both Pages and People
Open /app/models/revision.rb and add – after the first line:

belongs_to :page
belongs_to :person

5. Draw up the Routes

Open up config/routes.rb and update them to handle wiki-fied words. Copy the following after the map.resources block

map.root :controller => 'pages', :action => 'show', :id => 1

map.connect 'pages.:format', :controller => 'pages', :action => 'index'
map.connect 'revisions.:format', :controller => 'revisions', :action => 'index'

map.page_base ':title', :controller => 'pages', :action => 'show'
map.connect ':title.:format', :controller => 'pages', :action => 'show'

map.page_history ':title/history', :controller => 'pages', :action => 'history'
map.connect ':title/history.:format', :controller => 'pages', :action => 'history'

map.page_edit ':title/edit', :controller => 'pages', :action => 'edit'

Now run script/server and load up a browser.
If things are working correctly, you should see simply ‘Edit | Back’ as links

6. Create the Views

Open /app/views/revisions/new.html.erb and add

<%= f.text_area :contents %>

<% fields_for :person do |p| %>

<%= p.text_field :name %>

<% end %>

right after <%= f.error_messages %>, for the Revision’s content, the corresponding Page, and the author’s name.
Loading up http://0.0.0.0:3000/revisions/new should give you the form with the text area, a text field, and a submit button you just created.

Try entering something into the form and click ‘Create’. You should see ‘Revision was successfully created.’ Now, if you pop into your database, you should see the new row in the Revisions table with the contents you entered, but NULL values for person_id and page_id.
Let’s remedy that.

Open /app/controllers/revisions_controller.rb and add this line right after the @revision declaration in both def new and def create

@revision.update_attribute('person_id', Person.find_or_create_by_name(params[:person][:name]).id)

Reload http://0.0.0.0:3000/revisions/new and re-fill out the form. After hitting ‘Create’ you should have another Revision record in your database with some digit in the person_id field

Now, let’s tie the Revision form we created to a Page.
Open up /app/views/revisions/new.html.erb and change the form_for to:
<% form_for @revision, :url => new_revision_path, :html => {:method => :post} do |f| %>

Next, right after <%= f.text_area :contents %> add:
<%= f.hidden_field :page_id, :value => @page.id %>

Now, rename /app/views/revisions/new.html.erb to /app/views/revisions/_new.html.erb. This turns the form into a partial, which is fine, because Revisions only exist within Pages, never on their own.
Next, replace everything in /app/views/pages/edit.html.erb with

<h1>Update "<%= @page.title.gsub('_', ' ') %>"</h1>
<%= render :partial => 'revisions/new' %>

If you load up 0.0.0.0:3000/pages/1/edit right now, you’ll get a ‘Called id for nil’ error. That’s because the Revision partial is referencing @revision, which doesn’t exist in /pages. Yet.
Open up /app/controllers/pages_controller.rb and scroll to ‘def edit’.
Add @revision = @page.revisions.last || Revision.new right after the @page declaration

Refreshing 0.0.0.0:3000/pages/1/edit should now correctly render the form.
Fill it out, hit ‘Create’ and check the database. You should have a new record with a number in the page_id column.

But we can’t see it because 0.0.0.0:3000/pages/1 doesn’t show anything.

Open up /app/views/pages/show.html.erb and add:

<h2><%= @page.title.gsub('_', ' ') %></h2>
<p><%= @page.revisions.last.contents %></p>
<p>last edited by <%= @page.revisions.last.person.name %> <%= time_ago_in_words(@page.revisions.last.created_at) %> ago</p>

to the first line and save.

Refreshing 0.0.0.0:3000/pages/1 should show the text you entered.

7. Update the Model, Controller, Views, and Routes to Acknowledge Wiki-fied Words (separated with an underscore ‘_’)

Open up /app/models/revision.rb and add:

def parsed_contents
contents = self.contents.gsub(/(\w*_\w*)/) {|match| "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"}
end
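
As a quick check of what that substitution does, here is the same `gsub` run on a sample string outside of Rails. The `wikify` helper and the sample text are hypothetical, for illustration only.

```ruby
# Hypothetical standalone version of the link substitution.
# Any run of word characters containing an underscore becomes a link
# whose href is the lowercased word and whose label swaps '_' for ' '.
def wikify(text)
  text.gsub(/(\w*_\w*)/) do |match|
    "<a href='#{match.downcase}'>#{match.gsub('_', ' ')}</a>"
  end
end

puts wikify("Start at the Home_Page today")
# prints: Start at the <a href='home_page'>Home Page</a> today
```

Text without an underscore passes through untouched, so ordinary prose is not linked by accident.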

Open up /app/controllers/pages_controller.rb, find def show and change:
@page = Page.find(params[:id])
to

@page = Page.find_or_create_by_title(params[:title])
return redirect_to page_edit_url(:title => @page.title) if @page.revisions.empty?

Open up /app/controllers/revisions_controller.rb, find def create and change each
format.html... line to
format.html { redirect_to page_base_url(:title => @revision.page.title) }

Next, back in pages_controller.rb, scroll down to def edit and change:
@page = Page.find(params[:id])
to
@page = Page.find_by_title(params[:title])

Now, in /app/views/pages/show.html.erb change the Edit link to build the new route:
<%= link_to 'Edit', page_edit_url(:title => @page.title) %> |

Lastly, in config/routes.rb, change
map.root :controller => 'pages', :action => 'show', :id => 1
to
map.root :controller => 'pages', :action => 'show', :title => 'Home_Page'

Restart your script/server and you should have a very basic wiki.

In Part 2, we’ll add formatting.

Want to see it in action? Play around with Wiki.Cullect.com

1. There are leaner Ruby frameworks to build wikis in (Camping or Merb come to mind). I picked Rails for 2 reasons: my servers and deployment process were already set up for Rails, and I wanted to loosely integrate this wiki into another existing Rails-based application (Cullect.com).

Snippet: Copy MySQL Databases Over SSH

I needed to copy a database and the idea of backing it up just to re-import¹ seemed like double the work. Here’s a snippet to pipe a mysqldump into a remote database. Keep an eye on the user names and passwords. You’ll need 3 sets: one for the database you’re copying, one to get into your remote server, and one for the remote, target database.

mysqldump -v -uUSER -pPASSWORD --opt --compress DATABASE_NAME | ssh REMOTE_SERVER_USER@REMOTE_HOSTNAME mysql -uREMOTE_MYSQL_USER -pREMOTE_MYSQL_PASSWORD REMOTE_DATABASE_NAME

1. Backing up is a good thing. Why aren’t you doing it? Here’s a script for that.
mysqldump -h HOSTNAME DATABASE_NAME | gzip -9 > BACKUP_DIR/DATABASE_NAME.sql.gz

Ruby on Rails Snippet for Changing Relative Paths to Absolute

If you have a bunch of text containing relative path hyperlinks, and you’d like to change them to absolute paths, you might find this snippet helpful.


content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
link = "http://somedomain.com/"
content.gsub(/=('|")\//, '=\1*/').gsub(/\*\//, link.match(/(http|https):\/\/[\w.]+\//)[0])

The asterisk ‘*’ is a hacky placeholder for the actual link, which is swapped in by the second gsub. If you know of a way to pass in the link without needing the placeholder, awesome – paste it in the comments. Thanks.
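
One possible placeholder-free approach, offered as a sketch rather than the definitive answer: pull the base URL out of `link` first, then use a single `gsub` with a block so the captured quote character and the base can be interpolated together. The `absolutize` helper name is hypothetical, and the `(?!/)` lookahead (which skips protocol-relative `//host/...` URLs) is my own addition.

```ruby
# Hypothetical helper: rewrites root-relative attribute values to absolute ones.
def absolutize(content, link)
  # Scheme plus host with a trailing slash, e.g. "http://somedomain.com/"
  base = link[%r{\Ahttps?://[\w.-]+/}]
  return content if base.nil?

  # ='/ or ="/ becomes ='http://host/ ; no placeholder is needed because
  # the block can reuse the captured quote character directly.
  content.gsub(%r{=('|")/(?!/)}) { "=#{$1}#{base}" }
end

content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
puts absolutize(content, "http://somedomain.com/")
# prints: some text with a <a href='http://somedomain.com/path/to/relative_link/'>relative link</a>
```

The block form matters here: a replacement string only understands backreferences like `\1`, while a block can mix the capture with any Ruby value.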

Parsing Arbitrary XML Namespaces in Ruby with Hpricot

(This post inspired by a Ruby.MN conversation)

I’ve burned through at least 3 different XML parsing libraries in building Cullect. I started with the built-in REXML (pro: built-in, con: heavy and slow) then moved to FeedTools (pro: easily parses the most common feed formats, con: no longer in active development). I have a lot of love for FeedTools, but in the end, it didn’t want to parse XML it didn’t already know about. Then, after a series of far less memorable libraries, I found _why’s Hpricot (pro: written by _why, con: written by _why).

Hpricot is fast and lean, and it doesn’t care about expected tags, namespaces, or valid XML. It just makes it easy to get the data and attributes out of the XML.

Here’s an example of how I’m grabbing a feed item’s permalink using Hpricot

doc = Hpricot.XML(feed_contents)
items = (doc/:item)
items.each do |raw_item|
link = raw_item.%('pheedo:origLink') || raw_item.%('feedburner:origLink') || raw_item.%('link')
end

Notice how Hpricot doesn’t require anything special to grab pheedo namespace links or feedburner namespace links compared to standard links. Just tell it which tag you’re looking for. Fast, easy, scalable.

Optimization Tips: Ruby on Rails and MySQL

Stop.

I’ve uncovered these tips after (at least) the 3rd refactoring effort of some fairly simple, straight-forward Rails code. Rails is great for getting ideas prototyped super fast. These tips will slow down development and make apps less portable. Continue reading only if you’re running a live app in production and not happy with how resource-intensive it is.

My approach to this round of optimization was to watch the Load calculations in the development log and optimize transactions with a Load greater than 0.0009.

  • First, a rule of thumb: Development boxes are faster than production boxes. If it’s acceptably fast locally, then it’ll probably be a turtle in production.
  • Use Model.find_by_sql or Model.count_by_sql whenever possible.
    Slow: Person.find_by_name('JoeyJoeJoe')
    Fast: Person.find_by_sql("SELECT people.* FROM people WHERE people.name = 'JoeyJoeJoe'")
  • Don’t put keys, IDs, or other numbers within quotes in your Finds
    Slow: WHERE id = "1234"
    Fast: WHERE id = 1234
  • Only request the specific database column/model attribute you want
    Slow: Person.find_by_name('JoeyJoeJoe').height
    Fast: Person.find_by_sql("SELECT people.height FROM people WHERE people.name = 'JoeyJoeJoe'"). Put indexes on all these columns/attributes.
  • Use connection.insert, connection.update, connection.delete for database transactions performed on an array of models or transactions that don’t need the overhead of a model.
  • Use explicit JOINs instead of joining tables in the WHERE clause
    Slow: ...WHERE table_1.id = table_2.table1_id...
    Fast: ...table_1 JOIN table_2 ON table_1.id = table_2.table1_id....
  • Many tiny database transactions are faster than 1 big one
    Slow: stuff = Stuff.find_by_sql("SELECT everything.* FROM everything JOIN (box_1, box_2) ON (everything.id = box_1.everything_id AND box_2.box1_id = box_1.id)")
    Fast: boxes = Boxes.find_by_sql("SELECT box_1.everything_id FROM box_1 JOIN box_2 ON box_2.box1_id = box_1.id")
    followed by
    boxes.each do |box_1|
    stuff = Stuff.find_by_sql("SELECT everything.* FROM everything WHERE id = #{box_1.everything_id}")
    end
  • If you’re running the InnoDB datastore (vs MyISAM) try cranking up your innodb_buffer_pool_size. I say start by doubling it. Seriously.
  • Again, if you’re running InnoDB and doing anything more involved than the simplest ORDER BY, try cranking up your sort_buffer_size and read_rnd_buffer_size to something in the double-digit M range.
  • Also, if you’re comparing datastore engines, MyISAM has full text search, InnoDB doesn’t. So you’ll need a clever work around. There are a number of them.

Restarting Rails with ‘Address Already in Use’

iTerm crashed on me this morning, taking down my development Rails process. Now, I’m busy tailing cullect.com’s production.log and listening to some podcasts, so I didn’t want to restart¹.

A little Googling found John Nunemaker’s year-old, “Oops I did it again” post.

His tips worked. And I’m rewriting the following for Future Reference²:

$ ps aux | grep script/server

To get the process id you need to kill

$ kill [PID #]

1. My standard way of restarting Rails in this situation.
2. If memory serves, this was the name of the first in-house blog I started.