Hacking RSS: Filtering & Processing Obscene Amounts of Information

Dawn Foster, MeeGo Community Manager, Intel
Presentation Slides and Videos

There were 295 exabytes of data in 2007, and the amount doubles every 3 years and 4 months. That puts the total at more than 600 exabytes now. You want to find the needle in all of this data.
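As a rough check on those numbers, here is a minimal sketch of the doubling arithmetic. It assumes smooth exponential growth from the 2007 figure; the function name and parameters are made up for illustration and are not from the talk.

```python
def exabytes_after(years, start=295.0, doubling_years=10 / 3):
    """Estimate total data (in exabytes) `years` after 2007.

    Assumes the 2007 figure of 295 exabytes and a doubling period of
    3 years 4 months (10/3 years), with smooth exponential growth.
    """
    return start * 2 ** (years / doubling_years)


if __name__ == "__main__":
    # One full doubling period after 2007 gives twice the starting figure.
    print(exabytes_after(10 / 3))
```

Under these assumptions the total passes 600 exabytes shortly after the first doubling period ends.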

RSS alone is a start. You can follow the sources you want, but…

  • Do you care about everything in each feed?
  • What about feeds you aren’t subscribed to?
  • Can you keep up with what you have?

Prioritize Your Reader (Google Reader)

  • Put things you care about at the top (Yahoo Pipes, things you really, really like)
  • Categorize
  • Don’t try to read everything. Get to what you can.

Outsource and Crowd-source New Sources

The Real Magic is in Filtering RSS

In Google Reader: a Yahoo Pipe of analyst research blogs mentioning Online Community, and a Yahoo Pipe of analyst research blogs mentioning MeeGo.
You need to filter out the things you don’t care about.
Another Yahoo Pipe pulls in favorite blogs and uses PostRank to find only the posts with a lot of comments or social mentions.

RSS Filtering Tools

  • Yahoo Pipes
    You can filter any data found in any field of the RSS feed.
  • FeedRinse
  • FeedDemon
  • Code your own
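For the "code your own" option, a keyword filter over every field of each item might look like the sketch below. It uses only the Python standard library and assumes a plain RSS 2.0 feed; the sample feed and the function name `filter_items` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed, standing in for a real analyst-blog feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example analyst blogs</title>
    <item><title>Building an online community</title><description>Notes on moderation.</description></item>
    <item><title>Quarterly earnings</title><description>Nothing about communities here.</description></item>
    <item><title>Mobile roundup</title><description>MeeGo handset news.</description></item>
  </channel>
</rss>"""


def filter_items(feed_xml, keywords):
    """Return titles of items where any field contains one of the keywords."""
    root = ET.fromstring(feed_xml)
    wanted = [k.lower() for k in keywords]
    matches = []
    for item in root.iter("item"):
        # Join the text of every direct child (title, description, ...).
        text = " ".join((child.text or "") for child in item).lower()
        if any(k in text for k in wanted):
            matches.append(item.findtext("title"))
    return matches


if __name__ == "__main__":
    print(filter_items(SAMPLE_FEED, ["online community", "meego"]))
```

Matching is a case-insensitive substring test, which is the simplest choice; a real filter would probably want word boundaries or per-field rules.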


PostRank

  • Takes the best posts in a feed
  • Ranks them on engagement (links/sharing/comments/etc.)
  • You can get the output as an RSS feed
  • Feed includes postrank number in a field which you can filter against.
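Filtering against the score field could be sketched as follows. The element name `postrank` is an assumption made for this example, as are the sample feed and the function name; the real feed may use a different or namespaced element.

```python
import xml.etree.ElementTree as ET

# Made-up feed; the <postrank> element name is hypothetical.
SCORED_FEED = """<rss version="2.0">
  <channel>
    <title>Example scored feed</title>
    <item><title>Great post</title><postrank>8.7</postrank></item>
    <item><title>Quiet post</title><postrank>3.2</postrank></item>
    <item><title>Unscored post</title></item>
  </channel>
</rss>"""


def items_above(feed_xml, min_score, score_tag="postrank"):
    """Return (title, score) pairs for items scoring at least min_score."""
    root = ET.fromstring(feed_xml)
    kept = []
    for item in root.iter("item"):
        raw = item.findtext(score_tag)
        try:
            score = float(raw)
        except (TypeError, ValueError):
            # Skip items with a missing or non-numeric score.
            continue
        if score >= min_score:
            kept.append((item.findtext("title"), score))
    return kept


if __name__ == "__main__":
    print(items_above(SCORED_FEED, 7.0))
```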


  • Data about links on Twitter
  • Finds links regardless of shortening service
  • No RSS feeds (they are no longer available)
  • But… you can use the API + Yahoo Pipes to build one!
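The last bullet can also be done in code instead of Yahoo Pipes. The sketch below turns API-style JSON into a small RSS 2.0 document. The JSON shape (a list of records with `text` and `url` keys) is purely hypothetical and does not describe any real service's response; the function name and sample data are also invented.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical API response: a list of records with "text" and "url" keys.
SAMPLE_JSON = json.dumps([
    {"text": "Nice write-up on community metrics", "url": "http://example.com/a"},
    {"text": "MeeGo conference notes", "url": "http://example.com/b"},
])


def links_to_rss(api_json, feed_title, feed_link):
    """Build an RSS 2.0 string with one item per record in api_json."""
    records = json.loads(api_json)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Links collected from an API"
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["text"]
        ET.SubElement(item, "link").text = rec["url"]
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(links_to_rss(SAMPLE_JSON, "Shared links", "http://example.com/"))
```

The resulting string can be saved to a file or served over HTTP, then subscribed to in a reader like any other feed.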
