Tuesday, 22 July 2008

Are Some RSS Formats More Reliable/Faster than Others?

Via Twitter, I was asked the above question.

It’s a good question, cutting to the core of my ambivalence over the religious wars between RSS, Atom, etc.

The flavor of XML a feed is published in shouldn’t matter.

Neither to the publisher nor the receiver.

Any parser able to handle multiple flavors should be able to parse all flavors equally fast. Some parsing engines are built for one flavor of XML or another – rather than abstracted to parse XML in general. Then again, it’s trivial to spit out one XML format as another, so, maybe format is a conversation between the user agent and the server.

Eh. (Get a smarter parser, jeeez.)

From my studying of both RSS and Atom, comparing them is like comparing the UIs of Windows and Macintosh. They do feel different. One puts window buttons over here, one puts them over there. One is this color, one is that color. One prefers the Control key, the other prefers the Command key. Some people prefer this one, others prefer that one. One says ‘potahtoe’. One, ‘potaytoe’.

From my understanding, Atom was developed due to perceived deficiencies and ambiguity in the RSS 2.0 spec. Perhaps RSS 2.0 is guilty of being open to interpretation. I don’t know. I’ve found it to have logical places for everything I want to publish. Same for Atom.

Last I checked, Cullect was parsing somewhere north of 8100 feeds. Cullect doesn’t and shouldn’t care if a feed is Atom or RSS or RDF or filled with crazy namespaces. Cullect has 2 jobs when it comes to feeds: parse XML tags in a smart way, and publish out useful feeds in whatever flavor the user agent requests.
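To make the “shouldn’t care” idea concrete, here is a minimal sketch of reducing RSS 2.0 and Atom to the same shape. This is not Cullect’s code. The helper names (children, text_of, feed_items) are made up for illustration, it assumes REXML is available, and it only handles the two most common layouts (no RDF, no namespaced extensions).

```ruby
require 'rexml/document'

# Hypothetical helper: direct child elements with the given local name.
def children(element, name)
  element.children.grep(REXML::Element).select { |child| child.name == name }
end

# Hypothetical helper: text of the first matching child, or nil.
def text_of(element, name)
  child = children(element, name).first
  child && child.text
end

# Hypothetical helper: return [{ title:, link: }, ...] for RSS 2.0 or Atom,
# so the caller never needs to know which flavor it was handed.
def feed_items(xml)
  root = REXML::Document.new(xml).root
  case root.name
  when 'rss' # RSS 2.0: <rss><channel><item>, link is element text
    items = children(root, 'channel').flat_map { |channel| children(channel, 'item') }
    items.map { |item| { title: text_of(item, 'title'), link: text_of(item, 'link') } }
  when 'feed' # Atom: <feed><entry>, link is an href attribute
    children(root, 'entry').map do |entry|
      link = children(entry, 'link').first
      { title: text_of(entry, 'title'), link: link && link.attributes['href'] }
    end
  else
    raise ArgumentError, "unrecognized feed root: #{root.name}"
  end
end
```

Once both flavors come out as the same list of hashes, everything downstream is format-blind, which is the whole point.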

The biggest issue I’ve found in parsing thousands of XML feeds is badly published XML. Feeds using the tags in bizarre ways. Feeds just not conforming to any spec. Feeds published in a way that just makes parsing hard.

Both RSS and Atom publishers are equally guilty. My Wacky-Feeds-That-Won’t-Parse list contains just as many RSS feeds as Atom feeds.

A year ago, I wrote up my thoughts on publishing RSS 2.0 for easy publishing.

Wednesday, 16 July 2008

Monday, 14 July 2008

Snippet: Copy MySQL Databases Over SSH

I needed to copy a database and the idea of backing it up just to re-import¹ seemed like double the work. Here’s a snippet to pipe a mysqldump into a remote database. Keep an eye on the user names and passwords – you’ll need 3 sets: one for the database you’re copying, one to get into your remote server, and one for the remote, target database.

mysqldump -v -uUSER -pPASSWORD --opt --compress DATABASE_NAME | ssh REMOTE_SERVER_USER@REMOTE_HOSTNAME mysql -uREMOTE_MYSQL_USER -pREMOTE_MYSQL_PASSWORD REMOTE_DATABASE_NAME

1. Backing up is a good thing. Why aren’t you doing it? Here’s a script for that.
mysqldump -h HOSTNAME DATABASE_NAME | gzip -9 > BACKUP_DIR/DATABASE_NAME.sql.gz

Saturday, 12 July 2008

Summer Camp Hookie

On this beautiful Saturday, I missed PublicRadioCamp. Unfortunate for many reasons, including – many of my favorite people in town were there. Hell, some of my favorite people in town organized it.

Instead, I had one of the best days ever with my family.

Our Day

  1. Sleep until after 7am
  2. Pick up a few things from the Minneapolis Farmers’ Market
  3. Grab a Coffee
  4. Take a Leisurely Stroll through the Sculpture Garden
  5. Home in time for lunch
  6. “Nap” – I finished my book club book during nap time, while the kids didn’t sleep soundly in their beds 😉
  7. A walk to Audubon Park for a dip, some swings, and an afternoon coffee.
  8. An amazing Pea-Mint-Leek soup dinner
  9. Mixing up bread dough with the boy before bedtime.
  10. A quiet Netflix and wine with Jen after bedtime.

After all that, I’m catching up on what I missed with Bob Collins’ Off to Camp post. Good stuff.

Tuesday, 8 July 2008

Ruby on Rails Snippet for Changing Relative Paths to Absolute

If you have a bunch of text containing relative path hyperlinks, and you’d like to change them to absolute paths, you might find this snippet helpful.


content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
link = "http://somedomain.com/"
content.gsub(/=('|")\//, '=\1*/').gsub(/\*\//, link.match(/(http|https):\/\/[\w.]+\//)[0])

The asterisk ‘*’ is a hacky placeholder for the actual link, which gets swapped in by the second gsub. If you know of a way to pass in the link without needing the placeholder, awesome – paste it in the comments. Thanks.
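One possible way to drop the placeholder, offered as a sketch rather than the definitive answer: pass gsub a block, so the captured quote and the base URL can be interpolated in a single pass. The absolutize name is made up for illustration. It assumes the link has no port, and unlike the snippet above it deliberately leaves protocol-relative paths (starting with ‘//’) alone.

```ruby
require 'uri'

# Hypothetical helper: prefix root-relative attribute values (="/..." or ='/...')
# with the scheme and host taken from `link`.
def absolutize(content, link)
  uri = URI.parse(link)
  base = "#{uri.scheme}://#{uri.host}" # assumption: no port in the link
  # Match =' or =" followed by exactly one slash; (?!/) skips "//cdn..." style paths.
  content.gsub(%r{=(['"])/(?!/)}) { "=#{Regexp.last_match(1)}#{base}/" }
end

content = "some text with a <a href='/path/to/relative_link/'>relative link</a>"
puts absolutize(content, "http://somedomain.com/")
# prints: some text with a <a href='http://somedomain.com/path/to/relative_link/'>relative link</a>
```

The block form also means nothing in the URL can be misread as a backreference like \1, which is a risk when splicing a link into a replacement string.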

Monday, 7 July 2008