Rafael Sanches

October 18, 2009

RSS parsing optimization for bandwidth and processing time with SAX and httpclient – polling scripts

Filed under: android, maintainability, performance, programming | mufumbo @ 3:55 pm

My server was receiving a constant 1.7mb/s of incoming traffic for a service that downloads RSS feeds from the internet and processes them. Basically, it needs to return the latest updates from multiple RSS feeds. It’s a very basic polling system, but it was downloading too much data for just 15,000 active users. That kind of growth didn’t look sustainable.

I was using the ROME Java library to parse the XML. That worked, but the problem was that it downloads the whole feed and processes all of it. For my application I don’t need to download the whole RSS feed, just the new entries that I haven’t downloaded yet.

The solution was to use a custom SAX RSS parser that loops through the item tags and identifies each item’s publication date. This way I can parse item by item and detect when the current item is not new, so I can abort the HTTP connection and stop downloading the rest of the feed. I wish ROME had an option for that, like “stop processing when ‘publishedDate’ is earlier than..”.
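To make the early-stop idea concrete, here is a minimal sketch. It is not the class from this post. It assumes an RSS 2.0 feed whose entries are `item` elements carrying a `pubDate` in RFC 1123 format, and it assumes the feed lists the newest entries first. The names `EarlyStopRssParser`, `StopParsing` and `parseNewTitles` are made up for the illustration, and it collects only titles instead of building ROME `SyndEntry` objects.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EarlyStopRssParser {

    /** Hypothetical marker exception: thrown by the handler to abort parsing. */
    static class StopParsing extends SAXException {
        StopParsing() { super("reached an item that is not newer than the cutoff"); }
    }

    static class Handler extends DefaultHandler {
        final Instant cutoff;
        final List<String> newTitles = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        boolean inItem;
        String title;

        Handler(Instant cutoff) { this.cutoff = cutoff; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0);                       // start collecting text for this element
            if ("item".equals(qName)) { inItem = true; title = null; }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (!inItem) return;                     // ignore channel-level elements
            if ("title".equals(qName)) {
                title = text.toString().trim();
            } else if ("pubDate".equals(qName)) {
                Instant published = ZonedDateTime
                        .parse(text.toString().trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                        .toInstant();
                // Assumption: newest first, so everything after this item is old too.
                if (!published.isAfter(cutoff)) throw new StopParsing();
            } else if ("item".equals(qName)) {
                inItem = false;
                newTitles.add(title);                // only reached for items that were not rejected
            }
        }
    }

    /** Returns the titles of items published after the cutoff, reading no further than needed. */
    static List<String> parseNewTitles(InputStream in, Instant cutoff) throws Exception {
        Handler handler = new Handler(cutoff);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (StopParsing expected) {
            // Parsing stopped on purpose; the caller can now abort the HTTP request.
        }
        return handler.newTitles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss><channel><title>Feed</title>"
                + "<item><title>Newest post</title><pubDate>Sun, 18 Oct 2009 15:55:00 GMT</pubDate></item>"
                + "<item><title>Old post</title><pubDate>Fri, 16 Oct 2009 10:00:00 GMT</pubDate></item>"
                + "</channel></rss>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseNewTitles(in, Instant.parse("2009-10-17T00:00:00Z")));
    }
}
```

In a real poller the `InputStream` would come from the HTTP response body, and once `parseNewTitles` returns the caller would abort or close the request instead of draining the rest of the feed. That is where the bandwidth saving comes from. If a feed is not sorted newest first, stopping at the first old item could miss entries, so that assumption needs checking per feed.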

The impact on bandwidth usage and processing time was impressive.

If anyone is interested, I can post and explain the Java class. It’s compatible with com.sun.syndication.feed.synd and uses the SyndEntry and SyndFeed interfaces.

3 Comments »

  1. Cool. It’s lovely to see the change in the performance!
    It would be great to have a proxy service for that, so I can ask to the proxy for an RSS with a date and it will give me back just the latest posts. Can you build it?! 🙂

    Comment by Roberto — October 18, 2009 @ 11:57 pm

  2. Hello, Rafael
    I liked your blog post.
    Recently I’m trying to parse the RSS to get the new items. I’m seeing the excessive bandwidth usage while doing it. If you can post the sample code or a link to project that would be great.
    Thanks.

    Comment by Anil Madamala — January 8, 2012 @ 3:45 am

    • Hi Anil,

      What kind of project do you need to do? Explain it in more detail so I can tell you more.

      thanks
      rafa

      Comment by mufumbo — January 8, 2012 @ 5:28 pm

