Rafael Sanches

October 18, 2009

RSS parsing optimization for bandwidth and processing time with SAX and httpclient – polling scripts

Filed under: android, maintainability, performance, programming. Posted by mufumbo @ 3:55 pm

My server was receiving a constant 1.7mb/s of incoming traffic for a service that downloads RSS feeds from the net and processes them in order to return the latest updates from multiple feeds. It’s a very basic polling system, but it was downloading too much data for just 5000 active users, and that kind of growth didn’t look sustainable.

I was using the ROME Java library to parse the XML. So far so good; the problem was that it downloads the whole feed and processes all of it. Within the scope of my application I don’t need the whole RSS feed, just the new entries that I haven’t downloaded yet.

The solution was to use a custom RSS parser that loops through the “<item>” tags and reads each “<pubDate>”. This way I can parse item by item and detect when the current item is not newer than what I already have, at which point I abort the HTTP connection and stop downloading the rest of the feed. I wish ROME had an option for that, something like “stop processing when the date is older than...”.
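To make the idea concrete, here is a minimal sketch of a SAX handler that stops at the first item that is not newer than a cutoff date. It is not the class described in this post: the names `IncrementalRssReader`, `StopParsing`, and `readNewTitles` are made up for illustration, it collects only titles instead of building ROME `SyndEntry` objects, and it assumes the feed lists its newest items first and uses RFC 822 style dates in `<pubDate>`. The demo reads from an in-memory string; with a real HTTP response you would abort the request at the point where the stream is closed.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not the author's class): collects the titles of
// <item> elements newer than a cutoff and stops at the first older one.
class IncrementalRssReader {

    // Thrown on purpose to break out of the SAX parse loop.
    static class StopParsing extends SAXException {
        StopParsing() { super("reached an item that is not newer than the cutoff"); }
    }

    static class Handler extends DefaultHandler {
        final Instant cutoff;
        final List<String> newTitles = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inItem;
        private String title;

        Handler(Instant cutoff) { this.cutoff = cutoff; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
            if (qName.equals("item")) { inItem = true; title = null; }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (!inItem) return; // ignore channel-level elements
            if (qName.equals("title")) {
                title = text.toString().trim();
            } else if (qName.equals("pubDate")) {
                Instant published = ZonedDateTime
                    .parse(text.toString().trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
                // Assumption: newest items come first, so everything after this is old too.
                if (!published.isAfter(cutoff)) throw new StopParsing();
            } else if (qName.equals("item")) {
                inItem = false;
                newTitles.add(title);
            }
        }
    }

    static List<String> readNewTitles(InputStream in, Instant cutoff) throws Exception {
        Handler handler = new Handler(cutoff);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (StopParsing stop) {
            // Expected: the rest of the stream is deliberately left unread.
        } finally {
            in.close(); // with an HTTP body, this is where the request would be aborted
        }
        return handler.newTitles;
    }

    public static void main(String[] args) throws Exception {
        String feed = "<rss><channel><title>demo</title>"
            + "<item><title>new</title><pubDate>Sun, 18 Oct 2009 12:00:00 GMT</pubDate></item>"
            + "<item><title>old</title><pubDate>Sat, 17 Oct 2009 12:00:00 GMT</pubDate></item>"
            + "</channel></rss>";
        InputStream in = new ByteArrayInputStream(feed.getBytes(StandardCharsets.UTF_8));
        System.out.println(readNewTitles(in, Instant.parse("2009-10-17T12:00:00Z")));
    }
}
```

Throwing an exception from the handler is the usual way to leave a SAX parse early, since the parser itself drives the loop. An item without a `<pubDate>` is treated as new in this sketch, and a date in any other format would make the parse fail, so a real implementation would need to decide how to handle both cases.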

The impact on bandwidth usage and processing time was impressive.

If anyone is interested, I can post and explain the Java class. It’s compatible with com.sun.syndication.feed.synd and uses the SyndEntry and SyndFeed interfaces.

May 10, 2008

simple script to merge commits from a bugzilla id

Filed under: maintainability, programming. Posted by mufumbo @ 9:15 pm

Today I wrote my first Perl script!

For me it is very painful when the time comes to merge all the commits I have made on the “trunk” into another branch. I searched a little and did not find anything that could magically solve all my problems. I know it’s better to create a separate branch when there are lots of commits, but there are cases where a super-simple feature explodes into a big ball of mud.

In practice, the script merges all the commits for a Bugzilla ID into another branch. If someone knows a standard way to do this, please tell me!

The script takes three inputs:

  1. The starting revision ID to filter the search.
  2. The SVN address of the source.
  3. The search string to filter the results. Here you put your bugzilla bug id.

How to run the script, and the commands it executes:

  1. Go to the directory of the destination branch.
  2. Run the script:
     svn_search_merge.pl 0 https://svn.example.com/main/trunk/ "1: "
     Here "1: " is the search string containing the Bugzilla bug ID.
  3. The script then fetches the log of every commit from revision 0 to HEAD:
     svn log -r 0:HEAD https://svn.example.com/main/trunk/
  4. For each log entry that contains the string "1: ", it executes:
     svn merge -r (ACTUAL_REVISION-1):ACTUAL_REVISION https://svn.example.com/main/trunk/

Source code of the script:

#!/usr/bin/perl

# Simple script to merge commits from a source branch to the current destination directory.
# http://mufumbo.wordpress.com/2008/05/10/simple-script-to-merge-commits-from-a-bugzilla-id/
#
# Example:
# $ cd my-branch-destination/
# $ svn_search_merge.pl 3000 https://svn.example.com/main/trunk/ "bug 673"
# Where 3000 is the starting revision and "bug 673" is the string to match in the comments.
#
use strict;
use warnings;

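my $prev_revision = shift;
my $svnHost = shift;
my $searchStr = shift;

# All three arguments are required; stop early with a usage hint otherwise.
die "Usage: $0 <start-revision> <svn-url> <search-string>\n"
	unless defined $prev_revision && defined $svnHost && defined $searchStr;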

print "Starting Revision: $prev_revision\n";
print "SVN addr: $svnHost\n";
print "Search pattern: $searchStr\n";

my $buffer;
$buffer = `svn log -r $prev_revision:HEAD $svnHost`;
my $shouldContinue = "y";
LOGS: foreach my $changelog_entry (split(/----+/m, $buffer)) {
	if($changelog_entry =~ m/($searchStr)/) {
	        #my (undef, $info, undef, $comment) = split(/\n/, $changelog_entry);
	        #next unless $info =~ m/^r/;

		print "\n--------------------------------------------------";
		print $changelog_entry;
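		# The header line of each entry looks like "r1234 | user | date | N lines".
		# Capture the digits after the leading "r" so any revision length works.
		my ($revisionId) = $changelog_entry =~ m/^r(\d+)\s*\|/m;
		next LOGS unless defined $revisionId;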

		if ($shouldContinue ne 'a') {
			PROMPT: while(1) {
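				print "\nShould continue with merge of revision '$revisionId'? (y=Yes, a=Always, s=Skip, e=Exit): ";
				$shouldContinue = <STDIN>;
				# Stop instead of looping forever when input ends (e.g. Ctrl-D).
				die("No more input; stopping.\n") unless defined $shouldContinue;
				chomp($shouldContinue);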

				last PROMPT if $shouldContinue eq 'y';
				last PROMPT if $shouldContinue eq 'a';
				next LOGS if $shouldContinue eq 's';
				die("User requested to stop.") if $shouldContinue eq 'e';
			}
		}
		else {
			print "\nAuto merging '$revisionId'\n";
		}

		my $pRevisionId = $revisionId-1;
		my $mergeBuffer = `svn merge -r $pRevisionId:$revisionId $svnHost`;
		print $mergeBuffer;
	}
}
