<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	>
<channel>
	<title>Comments on: Parsing Arbitrary XML Namespaces in Ruby with Hpricot</title>
	<atom:link href="http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/feed" rel="self" type="application/rss+xml" />
	<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot</link>
	<description>User Experience Strategy, Ruby and Rails Web App Development</description>
	<lastBuildDate>Tue, 09 Mar 2010 17:56:38 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Mark Turner</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56825</link>
		<dc:creator>Mark Turner</dc:creator>
		<pubDate>Thu, 28 Aug 2008 14:30:45 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56825</guid>
		<description>I&#039;m having some problems parsing those namespace tags with Hpricot. When I try to get the contents (e.g., date = entry.at(&#039;gd:when&#039;).attributes[&#039;startTime&#039;]), I get nil returned.

Anyone know why this is?</description>
		<content:encoded><![CDATA[<p>I&#8217;m having some problems parsing those namespace tags with Hpricot. When I try to get the contents (e.g., date = entry.at('gd:when').attributes['startTime']), I get nil returned.</p>
<p>Anyone know why this is?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Garrick Van Buren</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56382</link>
		<dc:creator>Garrick Van Buren</dc:creator>
		<pubDate>Fri, 30 May 2008 04:21:02 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56382</guid>
		<description>Bob - thanks. I&#039;m sorry posts in Cullect are being assigned to someone other than yourself. It&#039;s a known bug that will be remedied in an upcoming release.

&lt;strong&gt;Update - a couple hours later:&lt;/strong&gt;
Bob, the entries should now be assigned to their rightful authors. Thanks again - for FeedTools, for stopping by and leaving comments, and for encouraging me to resolve this bug sooner rather than later. </description>
		<content:encoded><![CDATA[<p>Bob &#8211; thanks. I&#8217;m sorry posts in Cullect are being assigned to someone other than yourself. It&#8217;s a known bug that will be remedied in an upcoming release.</p>
<p><strong>Update &#8211; a couple hours later:</strong><br />
Bob, the entries should now be assigned to their rightful authors. Thanks again &#8211; for FeedTools, for stopping by and leaving comments, and for encouraging me to resolve this bug sooner rather than later.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Aman</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56381</link>
		<dc:creator>Bob Aman</dc:creator>
		<pubDate>Fri, 30 May 2008 04:09:05 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56381</guid>
		<description>Oh, and the rest of my content that wasn&#039;t usurped by the splog?  That&#039;s showing up as belonging to several different Planet sites.  All of whom correctly republished my content with the same id.  But it&#039;s still MY content, not theirs.</description>
		<content:encoded><![CDATA[<p>Oh, and the rest of my content that wasn&#8217;t usurped by the splog?  That&#8217;s showing up as belonging to several different Planet sites.  All of whom correctly republished my content with the same id.  But it&#8217;s still MY content, not theirs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Aman</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56380</link>
		<dc:creator>Bob Aman</dc:creator>
		<pubDate>Fri, 30 May 2008 04:06:09 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56380</guid>
		<description>Garrick:

I plugged my own blog into Cullect.  Several of the entries had been usurped by a Splog that had republished my content, but had used the same entry ids as the original posts.  The splog&#039;s content entered the database before the legitimate entries, so the legitimate content was ignored.

If I wanted to be a total dick, I could write a script that would hammer your feed, and when a new entry was detected, it&#039;d parse out the entry id, generate a new feed containing the same entry id and, say, a goatse image for content, and then immediately subscribe to it in Cullect.  Because the initial parse will happen before the next polling of your feed, you never get to see your blog again.  Just goatse images, forever and ever.  Also, your bandwidth charges would be insane.

All in all, it&#039;s a really, really bad idea.  Not to mention the fact that duplicate ids?  Yeah, they&#039;re super-common.  Sometimes you&#039;ll even get duplicate ids within the same feed.  The correct primary key is simply a bog-standard auto-incrementing integer, and then just make sure the guid field has an index on it.</description>
		<content:encoded><![CDATA[<p>Garrick:</p>
<p>I plugged my own blog into Cullect.  Several of the entries had been usurped by a Splog that had republished my content, but had used the same entry ids as the original posts.  The splog&#8217;s content entered the database before the legitimate entries, so the legitimate content was ignored.</p>
<p>If I wanted to be a total dick, I could write a script that would hammer your feed, and when a new entry was detected, it&#8217;d parse out the entry id, generate a new feed containing the same entry id and, say, a goatse image for content, and then immediately subscribe to it in Cullect.  Because the initial parse will happen before the next polling of your feed, you never get to see your blog again.  Just goatse images, forever and ever.  Also, your bandwidth charges would be insane.</p>
<p>All in all, it&#8217;s a really, really bad idea.  Not to mention the fact that duplicate ids?  Yeah, they&#8217;re super-common.  Sometimes you&#8217;ll even get duplicate ids within the same feed.  The correct primary key is simply a bog-standard auto-incrementing integer, and then just make sure the guid field has an index on it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Garrick Van Buren</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56377</link>
		<dc:creator>Garrick Van Buren</dc:creator>
		<pubDate>Thu, 29 May 2008 14:21:47 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56377</guid>
		<description>Bob, thanks. You bring up an interesting point. Could you expand on what you mean by &#039;legitimate&#039; entry?</description>
		<content:encoded><![CDATA[<p>Bob, thanks. You bring up an interesting point. Could you expand on what you mean by &#8216;legitimate&#8217; entry?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Aman</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56376</link>
		<dc:creator>Bob Aman</dc:creator>
		<pubDate>Thu, 29 May 2008 14:07:35 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56376</guid>
		<description>Garrick:

Two things.

One, Cullect&#039;s handling of feed entry ids is broken.  Incorrect behavior: Entry stored in database uniquely identified by its id element.  Correct behavior: Entries stored in database as children of the parent feed, with read status uniquely identified by the entry&#039;s id element.

The reason what you&#039;ve done is incorrect behavior is simple:  I can hijack anyone else&#039;s content trivially.  I just have to write some other content with the same entry id, and get it plugged into the database before you poll the legitimate entry.

And second, because I ran afoul of the entry id issue, I was unable to complete my run of the Atom XML namespace conformance tests.  Once you fix the first bug, I guess you&#039;re welcome to run them yourself.

http://www.intertwingly.net/wiki/pie/XmlNamespaceConformanceTests</description>
		<content:encoded><![CDATA[<p>Garrick:</p>
<p>Two things.</p>
<p>One, Cullect&#8217;s handling of feed entry ids is broken.  Incorrect behavior: Entry stored in database uniquely identified by its id element.  Correct behavior: Entries stored in database as children of the parent feed, with read status uniquely identified by the entry&#8217;s id element.</p>
<p>The reason what you&#8217;ve done is incorrect behavior is simple:  I can hijack anyone else&#8217;s content trivially.  I just have to write some other content with the same entry id, and get it plugged into the database before you poll the legitimate entry.</p>
<p>And second, because I ran afoul of the entry id issue, I was unable to complete my run of the Atom XML namespace conformance tests.  Once you fix the first bug, I guess you&#8217;re welcome to run them yourself.</p>
<p><a href="http://www.intertwingly.net/wiki/pie/XmlNamespaceConformanceTests" rel="nofollow">http://www.intertwingly.net/wiki/pie/XmlNamespaceConformanceTests</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Enfranchised Mind &#187; The Status of Ruby&#8217;s libxml</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56369</link>
		<dc:creator>Enfranchised Mind &#187; The Status of Ruby&#8217;s libxml</dc:creator>
		<pubDate>Tue, 27 May 2008 11:40:03 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56369</guid>
		<description>[...] Hpricot really will handle namespaces as nicely as Garrick says. Otherwise, I&#8217;m going to have to rewrite my app in Groovy/Grails and install a bunch of new [...]</description>
		<content:encoded><![CDATA[<p>[...] Hpricot really will handle namespaces as nicely as Garrick says. Otherwise, I&#8217;m going to have to rewrite my app in Groovy/Grails and install a bunch of new [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Fischer</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56352</link>
		<dc:creator>Robert Fischer</dc:creator>
		<pubDate>Wed, 21 May 2008 12:03:34 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56352</guid>
		<description>Sweet.  You made me a pretty happy camper!</description>
		<content:encoded><![CDATA[<p>Sweet.  You made me a pretty happy camper!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Garrick Van Buren</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56349</link>
		<dc:creator>Garrick Van Buren</dc:creator>
		<pubDate>Wed, 21 May 2008 02:14:46 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56349</guid>
		<description>@Robert, 

1. Is this what you&#039;re looking for?
&lt;code&gt;
require &quot;hpricot&quot;
test = &quot;&lt;a&gt;&lt;b&gt;&lt;c&gt;hello&lt;/c&gt;&lt;/b&gt;&lt;/a&gt;&quot;
doc = Hpricot.XML(test)
t = (doc/:a)
t.inner_html
=&gt; &quot;&lt;b&gt;&lt;c&gt;hello&lt;/c&gt;&lt;/b&gt;&quot;
&lt;/code&gt;

2. Yes, in addition to RSS, Cullect both parses and generates Atom feeds (it also generates YML, M3U, PLS, and JSON feeds).</description>
		<content:encoded><![CDATA[<p>@Robert, </p>
<p>1. Is this what you&#8217;re looking for?<br />
<code><br />
require "hpricot"<br />
test = "&lt;a&gt;&lt;b&gt;&lt;c&gt;hello&lt;/c&gt;&lt;/b&gt;&lt;/a&gt;"<br />
doc = Hpricot.XML(test)<br />
t = (doc/:a)<br />
t.inner_html<br />
=&gt; "&lt;b&gt;&lt;c&gt;hello&lt;/c&gt;&lt;/b&gt;"<br />
</code></p>
<p>2. Yes, in addition to RSS, Cullect both parses and generates Atom feeds (it also generates YML, M3U, PLS, and JSON feeds).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Fischer</title>
		<link>http://garrickvanburen.com/archive/parsing-arbitrary-xml-namespaces-in-ruby-with-hpricot/comment-page-1#comment-56348</link>
		<dc:creator>Robert Fischer</dc:creator>
		<pubDate>Wed, 21 May 2008 02:00:04 +0000</pubDate>
		<guid isPermaLink="false">http://garrickvanburen.com/?p=1443#comment-56348</guid>
		<description>Is there a way to print out something like the verbatim XML from &quot;raw_item&quot;?

And, on a totally different note, does Cullect support Atom in addition to RSS, or just RSS?</description>
		<content:encoded><![CDATA[<p>Is there a way to print out something like the verbatim XML from &#8220;raw_item&#8221;?</p>
<p>And, on a totally different note, does Cullect support Atom in addition to RSS, or just RSS?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
