When it comes to parsing RSS feeds, most Java developers reach for the org.xml.sax library. Android provides a nice wrapper, the android.sax package, that simplifies the code without sacrificing performance. This wrapper has been available since API level 1 and is an easy way to write a simple parser for any XML document.

In this tutorial, I’ll show you how to parse an RSS feed using the android.sax package. I’ll use Geek Garage’s feed. As with any XML parsing, the android.sax package lets you ignore the elements you don’t need, so I’ll only collect a few of them.

This is not an app development tutorial; I’ll focus only on the parsing. If you include this code in your project, remember to check network availability, keep your app responsive (download and parse off the UI thread) and handle errors.

Anatomy of an RSS feed

The root element of an RSS 2.0 feed is rss, which contains a single channel element. The channel contains several elements, three of which are mandatory:

  • title: the title of the feed. Remember that a single website can host several feeds.
  • description: the description of the feed.
  • link: the link to the website hosting the feed. If you subscribe to the tutorial feed on Geek Garage, the link element will still point to Geek Garage’s homepage.

The channel then contains a list of item elements. For each item, I’ll only look at these child elements:

  • title: the title of the post or article.
  • description: a short description of the post; this can be the first sentences of the article.
  • link: the link to the article, if you want to offer your users a way to open it.

Yes, I’m collecting the same kind of data from the channel and from each item. Keep in mind that they describe different things: the channel-level elements describe the feed, while the item-level ones describe a single article.
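
To make the structure above concrete, here is a minimal sketch of a feed held in a Java string, which is also handy as test input for a parser. The feed content (titles, descriptions, example.com URLs) and the SampleFeed class are placeholders made up for illustration, not Geek Garage’s real feed. The countItems helper uses the standard DOM parser only as a quick well-formedness check.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SampleFeed {
    // Hypothetical feed: one channel with its three mandatory elements and two items.
    static final String XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<rss version=\"2.0\">\n"
        + "  <channel>\n"
        + "    <title>Example feed</title>\n"
        + "    <description>Posts from an example site</description>\n"
        + "    <link>https://example.com/</link>\n"
        + "    <item>\n"
        + "      <title>First post</title>\n"
        + "      <description>Summary of the first post</description>\n"
        + "      <link>https://example.com/first</link>\n"
        + "    </item>\n"
        + "    <item>\n"
        + "      <title>Second post</title>\n"
        + "      <description>Summary of the second post</description>\n"
        + "      <link>https://example.com/second</link>\n"
        + "    </item>\n"
        + "  </channel>\n"
        + "</rss>\n";

    // Parses the sample with the JDK's DOM parser and counts the item elements.
    static int countItems() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return doc.getElementsByTagName("item").getLength();
    }
}
```

Note that title, description and link appear both directly under channel and under each item, which is why the parser has to know which parent it is in.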

The data object

As usual, we need to store the parsed data, so we define the classic Feed and Item classes. This is a simple Item class:

package net.labasland.example.feed;

public class Item {
    private String title;
    private String description;
    private String link;

    public Item(String title, String description, String link) {
        this.title = title;
        this.description = description;
        this.link = link;
    }

    public String getTitle() {
        return this.title;
    }

    public String getDescription() {
        return this.description;
    }

    public String getLink() {
        return this.link;
    }
}

This is a very basic data container object declaration. You should of course adapt it to your app.
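
The feed-level container used later by the parse method is not shown in this tutorial. As a rough sketch, it could look like the class below. The setters and the add(Item) method match the calls made in the parsing code; the getters and the read-only getItems() view are my own assumptions. A compact Item is repeated only so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Same shape as the Item class above, repeated so this sketch is self-contained.
final class Item {
    private final String title;
    private final String description;
    private final String link;

    Item(String title, String description, String link) {
        this.title = title;
        this.description = description;
        this.link = link;
    }

    public String getTitle() { return title; }
    public String getDescription() { return description; }
    public String getLink() { return link; }
}

// Hypothetical RssFeed POJO: channel-level data plus the collected items.
final class RssFeed {
    private String title;
    private String description;
    private String link;
    private final List<Item> items = new ArrayList<Item>();

    public void setTitle(String title) { this.title = title; }
    public void setDescription(String description) { this.description = description; }
    public void setLink(String link) { this.link = link; }

    public String getTitle() { return title; }
    public String getDescription() { return description; }
    public String getLink() { return link; }

    // Called by the parser each time an item element ends.
    public void add(Item item) { items.add(item); }

    // Read-only view (an assumption) so callers cannot modify the parsed list.
    public List<Item> getItems() { return Collections.unmodifiableList(items); }
}
```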

Parsing the feed

All the parsing is done by the parse method of the android.util.Xml utility class. You pass it the XML content to parse, as a String (raw XML), an InputStream (together with its encoding) or a Reader, and a ContentHandler. The ContentHandler receives the document-related events, so we have to define how it reacts to them.

We just have to define a listener for every element we want to process and ignore every element we don’t need. To add the content to our object, we define an EndTextElementListener for each Element we want to process. For the feed title, description and link, I add the content to an RssFeed object (not shown here; it is a simple POJO with an add(Item item) method that adds the Item object to a collection). All the child elements of an item are stored in a HashMap while the item is being parsed. Once the end of the item is reached, a new Item object is created from the HashMap entries, this Item object is then added to the RssFeed object, and the HashMap is cleared.

Once the behavior is defined, the ContentHandler can be used in the Xml.parse call.

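The following code focuses on the parse method; the rest of the class is not shown. It relies on android.sax.Element, android.sax.RootElement, android.sax.EndElementListener, android.sax.EndTextElementListener, android.util.Xml, org.xml.sax.SAXException, java.io.IOException and java.util.HashMap.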

public RssFeed parse() throws IOException {

    // final so that the anonymous listeners below can capture it
    final RssFeed feed = new RssFeed();

    final HashMap<String, String> message = new HashMap<String, String>();
    RootElement root = new RootElement("rss");
    Element channel = root.getChild("channel");

    channel.getChild("title").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            feed.setTitle(body);
        }
    });

    channel.getChild("description").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            feed.setDescription(body);
        }
    });

    channel.getChild("link").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            feed.setLink(body);
        }
    });

    Element item = channel.getChild("item");

    item.setEndElementListener(new EndElementListener(){
        public void end() {
            feed.add(new Item(message.get("title"), message.get("description"), message.get("link")));
            message.clear();
        }
    });

    item.getChild("title").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            message.put("title", body);
        }
    });

    item.getChild("description").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            message.put("description", body);
        }
    });

    item.getChild("link").setEndTextElementListener(new EndTextElementListener(){
        public void end(String body) {
            message.put("link", body);
        }
    });

    try {
        Xml.parse(this.getInputStream(), Xml.Encoding.UTF_8, root.getContentHandler());
    } catch (SAXException e) {
        // keep the original exception as the cause so the parse error is not lost
        throw new RuntimeException(e);
    }

    return feed;
}
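
The parse method calls this.getInputStream(), which belongs to the part of the class that is not shown. One possible shape, assuming the class keeps the feed address in a field (the FeedSource class name, the feedUrl field and the timeout values are all hypothetical), is sketched below using only java.net classes:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical holder for the feed address; not the author's actual class.
final class FeedSource {
    private final URL feedUrl;

    FeedSource(URL feedUrl) {
        this.feedUrl = feedUrl;
    }

    // Opens a fresh stream on the feed; the caller is responsible for closing it.
    InputStream getInputStream() throws IOException {
        URLConnection connection = feedUrl.openConnection();
        // Arbitrary timeouts (in milliseconds) so a dead server cannot block forever.
        connection.setConnectTimeout(10000);
        connection.setReadTimeout(10000);
        return connection.getInputStream();
    }
}
```

On Android this has to run off the main thread, and the returned stream should be closed once Xml.parse has returned.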

Pros and cons of the android.sax library

As you can see, with the android.sax library we still define how the parser behaves when it reaches the start and end of an element, but the library is organized around Element objects. Once we have picked the element we want to process, we declare what to do when entering it and when reaching its end. For readability and maintenance, this is far better than chaining conditions on the tag name inside the startElement and endElement methods of a DefaultHandler subclass, which is the usual approach with plain SAX.

We also don’t have to deal with the text data within an element ourselves. With a DefaultHandler, we would have to accumulate the characters in a StringBuffer and process the result in the endElement method, depending on the closing tag. The EndTextElementListener lets us get rid of that cumbersome code.
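
For comparison, here is a minimal sketch of the plain SAX approach described above, collecting only the item titles. The TitleHandler class and its parse helper are names I made up for the illustration, and I use a StringBuilder rather than a StringBuffer since no synchronization is needed. It runs on a standard JVM with the JDK’s own SAX parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the title of every item, ignoring the channel title.
final class TitleHandler extends DefaultHandler {
    private final List<String> itemTitles = new ArrayList<String>();
    private final StringBuilder text = new StringBuilder();
    private boolean insideItem;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("item".equals(qName)) {
            insideItem = true;
        }
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, hence the buffer.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("item".equals(qName)) {
            insideItem = false;
        } else if (insideItem && "title".equals(qName)) {
            itemTitles.add(text.toString().trim());
        }
    }

    List<String> getItemTitles() {
        return itemTitles;
    }

    // Convenience entry point: parses raw XML and returns the item titles.
    static List<String> parse(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getItemTitles();
    }
}
```

Even for a single field, the handler needs a text buffer, a state flag and tag-name conditions in two methods. Each additional field adds another branch, which is exactly the bookkeeping the android.sax listeners hide.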

The element-based approach also gives us more flexibility. I could use the same parse method to process the comments feed. For the comments feed, though, I don’t want the feed description, so I would just have to wrap the channel.getChild("description") call in the appropriate condition.

So the android.sax library lets us write simpler, more readable, more extensible and more maintainable code.

Of course, this library does not give access to all the methods of the ContentHandler. If you need more specific processing, the android.sax library may be too restrictive. These classes are just wrappers, and you may extend RootElement, but you have to consider whether the benefit is worth it.

So, if you have to collect some simple data from XML files, you should consider the android.sax library, available since API level 1. It will make your code simple, readable and maintainable, and since it is only a thin layer over the SAX parser, it should not cost you noticeable performance.

About Darko Stankovski

Darko Stankovski is the founder and editor of Dad 3.0. You can find more about him through the following links.