Really Simple Syndication
Really Simple Syndication (RSS) is the most widely used Web syndication format. Since RSS is an XML application, it can be extended through XML namespaces. Beyond its conventional use of representing news and press releases, RSS also has special applications such as providing up-to-date exchange rates for banks [1].
The typical file extensions for RSS are .rss and .xml. The Internet media type associated with RSS is application/rss+xml, which is not standardized yet [2].
RSS describes lightweight syndication channels with the properties title, link, and description, as well as the classes channel, item, and image.
RSS has the following major versions: RSS 0.90, RSS 0.91, RSS 0.92, RSS 1.0, and RSS 2.0. The last of these has several sub-versions (RSS 2.0.1 with 6 revisions, as well as versions 2.0.8, 2.0.9, 2.0.10, and 2.0.11) [3]. In 2000, the name RDF Site Summary was in use, which referred to the extensibility with RDF-based modularization [4]. Version 0.91 was called Rich Site Summary; it dropped the RDF structure and imported elements from the scriptingNews syndication format. Today the acronym stands for Really Simple Syndication. The latest version of the RSS Specification has a permanent URI at the RSS Advisory Board website [5].
The most widely used and most advanced version, RSS 2.0, is discussed in the next sections.
Required channel elements
The required channel elements of RSS 2.0 are title, link, and description.
The title element
The title element represents the name of the channel. It often coincides with the title of the website it is associated with, e.g.,
<title>John Smith Headlines</title>
The link element
The link element is the URL of the website associated with the channel, e.g.,
<link>http://example.com/</link>
The description element
The description element is a sentence or sentence fragment that describes the channel, e.g.,
<description>The latest news about rock star John Smith.</description>
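The three required elements above can be assembled into a minimal channel programmatically. The following is a sketch only, using Python's standard xml.etree.ElementTree module; the function name build_minimal_feed is hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_minimal_feed(title: str, link: str, description: str) -> str:
    """Return an RSS 2.0 document holding only the required channel elements.

    The function name is illustrative and not part of any RSS tooling.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # title, link, and description are the three required sub-elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_minimal_feed(
    "John Smith Headlines",
    "http://example.com/",
    "The latest news about rock star John Smith.",
)
print(feed_xml)
```

A real feed would normally also contain item elements and some of the optional channel elements.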
Optional channel elements
In RSS 2.0 news feeds, the channel element has 16 optional sub-elements: category, cloud, copyright, docs, generator, image, language, lastBuildDate, managingEditor, pubDate, rating, skipDays, skipHours, textInput, ttl, and webMaster.
A common feature of all RSS 2.0 elements providing a URL is that their value should begin with a URI scheme registered with IANA [6], e.g., http://, https://, news://, mailto:, or ftp://. Note that earlier versions of RSS allowed only the http:// and ftp:// schemes.
Namespaces
The core elements of RSS 2.0 are not members of an XML namespace. The namespace http://purl.org/rss/1.0/ is the permanent URL of the RDF Site Summary (RSS) 1.0 namespace, whose specification is located at http://web.resource.org/rss/1.0/. It can be declared with a prefix in the form
<rss version="2.0" xmlns:rss="http://purl.org/rss/1.0/">
Additional data on channel updates can be provided by the Web syndication namespace of RSS (http://purl.org/rss/1.0/modules/syndication/). It extends the RSS channels with three elements:
- The period over which the news channel is updated can be described by the sy:updatePeriod element. Allowed values are hourly, daily, weekly, monthly, and yearly. If omitted, daily is assumed.
- The frequency of updates can be expressed in relation to the update period with the sy:updateFrequency element. Its value is a positive integer.
- To calculate the publishing schedule, a base date can be defined by the sy:updateBase element. It should be a #PCDATA date in one of the W3C date and time formats [7].
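An aggregator could turn the first two elements into a polling interval. The sketch below reads sy:updateFrequency as the number of updates per period and assumes a default of 1 when it is omitted; both are assumptions of this example. The month and year lengths are approximations, and the names PERIOD_LENGTH and update_interval are hypothetical.

```python
from datetime import timedelta

# Length of each allowed sy:updatePeriod value.
PERIOD_LENGTH = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),   # approximation, assumed for this sketch
    "yearly": timedelta(days=365),   # approximation, assumed for this sketch
}


def update_interval(period: str = "daily", frequency: int = 1) -> timedelta:
    """Return the expected time between two updates of the channel.

    frequency is interpreted as "updates per period" (an assumption).
    """
    if frequency < 1:
        raise ValueError("sy:updateFrequency must be a positive integer")
    return PERIOD_LENGTH[period] / frequency


# A channel updated twice daily is expected to change every 12 hours.
print(update_interval("daily", 2))
```

Adding multiples of this interval to the sy:updateBase date would then give the expected publishing schedule.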
By default, news feed entries are plain text. However, news aggregators often support (X)HTML markup, which cannot appear unescaped in the character data of an XML element. Entity-encoded or CDATA-escaped content can be provided with the content:encoded element defined by the http://purl.org/rss/1.0/modules/content/ namespace. The content:encoded element is especially useful if the hyperlink provided by the link element is not enough and additional hyperlinks are needed (in the news item content). Although text formatting and other markup can also be written this way, it is ignored by many RSS readers.
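The two escaping styles are equivalent from a parser's point of view. The following sketch, which uses a hypothetical item fragment and Python's standard library, shows that entity-encoded and CDATA-escaped content yield the same string once parsed.

```python
import xml.etree.ElementTree as ET
from html import escape

CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
# Hypothetical item content containing an additional hyperlink.
markup = 'Read the <a href="http://www.example.com/">full story</a>.'

# Variant 1: markup characters replaced by entities (&lt;, &gt;, &quot;).
entity_encoded = (
    f'<item xmlns:content="{CONTENT_NS}">'
    f"<content:encoded>{escape(markup)}</content:encoded></item>"
)
# Variant 2: markup wrapped in a CDATA section and left as-is.
cdata_escaped = (
    f'<item xmlns:content="{CONTENT_NS}">'
    f"<content:encoded><![CDATA[{markup}]]></content:encoded></item>"
)


def encoded_text(item_xml: str) -> str:
    """Return the text of content:encoded after XML parsing."""
    item = ET.fromstring(item_xml)
    return item.findtext(f"{{{CONTENT_NS}}}encoded")


# Both variants decode to the original markup string.
print(encoded_text(entity_encoded) == encoded_text(cdata_escaped) == markup)
```

CDATA sections are usually easier to read in the feed source, while entity encoding also works for content that itself contains the sequence ]]>.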
There is an Atom element, atom:link, that can be used to provide the self-link of the news feed channel. To apply this element, the Atom namespace http://www.w3.org/2005/Atom should be declared.
Advanced news feeds typically contain at least the following namespace declarations:
<rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:atom="http://www.w3.org/2005/Atom"
>
With these declarations in place, elements from these namespaces can be used in the channel as
<dc:creator>Dr. Leslie Sikos</dc:creator>
<sy:updatePeriod>daily</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<sy:updateBase>2011-01-01T12:00+00:00</sy:updateBase>
<atom:link href="http://www.lesliesikos.com/sikos.xml" rel="self" type="application/rss+xml" />
or in item elements such as
<content:encoded><![CDATA[ An escaped RSS item can contain markup elements such as <a href="http://www.example.com/">hyperlinks</a> that work in all major news feed readers. ]]></content:encoded>
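On the consuming side, a feed reader has to map the same namespace URIs to look up such elements. The sketch below parses a hypothetical sample feed with Python's xml.etree.ElementTree; the feed content, the creator name, and the URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping used for lookups; the prefixes are local to this
# script and need not match those used inside the feed.
NAMESPACES = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "sy": "http://purl.org/rss/1.0/modules/syndication/",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical feed used only for this example.
SAMPLE_FEED = """<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>A hypothetical channel.</description>
    <dc:creator>Jane Doe</dc:creator>
    <sy:updatePeriod>daily</sy:updatePeriod>
    <atom:link href="http://example.com/feed.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>First item</title>
      <content:encoded><![CDATA[See <a href="http://www.example.com/">this link</a>.]]></content:encoded>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(SAMPLE_FEED).find("channel")
# Core RSS 2.0 elements are looked up without a prefix, extension
# elements with the prefix of their namespace.
creator = channel.findtext("dc:creator", namespaces=NAMESPACES)
self_link = channel.find("atom:link[@rel='self']", NAMESPACES).get("href")
body = channel.findtext("item/content:encoded", namespaces=NAMESPACES)

print(creator)
print(self_link)
print(body)
```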
Styling RSS feeds
The browsers that support news feeds usually provide a basic styling or no styling at all (rendering a tree structure instead). Developers who are not satisfied with that or want to ensure an advanced look (which is also similar in all browsers) can format RSS channels using CSS or XSLT.
In the first case, a CSS reference is required in the form
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/css" href="css/feed.css" ?>
<rss version="2.0">
Writing the CSS rules is straightforward. For example, the font size of the main title can be increased by
channel title {
font-size: 1.4em;
}
The font of the document can be set as
rss {
font-family: Verdana, Helvetica, sans-serif;
}
and so on. Information that is not relevant to readers can be hidden, for example:
channel link, channel language, channel copyright, channel managingEditor, channel webMaster, channel docs, channel lastBuildDate {
display: none;
}
The second approach applies XSL Transformations (XSLT), which provide more control. For example, hyperlinks can be made clickable and the node order can be changed. The XSL file can be linked as follows:
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="css/feed.xsl" ?>
<rss version="2.0">
Note that additional functionality, such as searching or category listings provided by the built-in RSS reader of certain browsers, is not available when a custom style sheet is applied to a news feed.
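When feeds are generated by a script, the style sheet reference shown above can be added programmatically. The following is a minimal sketch using Python's xml.dom.minidom; the function name attach_stylesheet is hypothetical, and the style sheet path is taken from the examples above.

```python
from xml.dom import minidom


def attach_stylesheet(feed_xml: str, href: str, mime_type: str = "text/css") -> str:
    """Insert an xml-stylesheet processing instruction before the root element.

    Use mime_type="text/xsl" to reference an XSL file instead of CSS.
    """
    doc = minidom.parseString(feed_xml)
    instruction = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="{mime_type}" href="{href}"'
    )
    # The processing instruction has to precede the rss root element.
    doc.insertBefore(instruction, doc.documentElement)
    return doc.toxml()


styled = attach_stylesheet('<rss version="2.0"><channel/></rss>', "css/feed.css")
print(styled)
```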
Not the only syndication format
Both RSS and Atom are widely supported in all major consumer feed readers, although RSS seems to be more popular than Atom. From the standards point of view, however, the RSS 2.0 specification is copyrighted by Harvard University and is considered finalized. Significant changes cannot be expected, although the specification has been released under a Creative Commons license. In contrast, Atom 1.0 is a more feature-rich syndication format by default and can be extended.
The Internet media type application/rss+xml is unregistered while application/atom+xml is registered by IANA.
In contrast to RSS 2.0, which supports the RSS document format only, the Atom Entry documents of the Atom news feeds can apply any network protocol. As a result, the aggregation and extraction of Atom news feeds have more possibilities.
Although the namespace of RSS 2.0 is not an XML namespace, it can optionally contain elements from external XML namespaces (as discussed earlier). The namespace of Atom 1.0 is an XML namespace itself and might also have elements and attributes from other XML namespaces. The implementation of these external elements and attributes is clearly defined by specification guidelines. It can be concluded that Atom is more extensible than RSS.
RSS does not support relative URIs, while Atom reuses the xml:base attribute, which allows relative references.
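The effect of xml:base can be illustrated with a short sketch. It assumes a hypothetical Atom feed with xml:base declared on the root element only; a full implementation would also honor xml:base attributes on nested elements. Note that ElementTree does not resolve relative references by itself, so the example does it with urljoin.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical Atom feed with a relative link in its entry.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.com/news/">
  <entry><link href="2011/first-post.html" /></entry>
</feed>"""

feed = ET.fromstring(SAMPLE)
base = feed.get(XML_BASE)
href = feed.find(f"{ATOM}entry/{ATOM}link").get("href")

# Resolve the relative reference against the declared base URI.
absolute = urljoin(base, href)
print(absolute)  # http://example.com/news/2011/first-post.html
```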
There is no schema defined in RSS 2.0. The Atom 1.0 specification includes a non-normative RELAX NG schema; RELAX NG itself is standardized as ISO/IEC 19757-2:2008 [8]. The schema can be used to validate the data provided in the Atom news feed. Optionally, further schemas can be generated from RELAX NG.
Practically, both formats are supported by most news feed readers. Correctly written RSS and Atom files are well-formed XML files that can be processed in many ways and can be extended using the namespace mechanism. Users usually do not notice the difference between the two formats when using a feed reader application.
You can read more about RSS in the book “Web Standards – Mastering HTML5, CSS3, and XML”.
References
- [1] Asman P, Cannon S, Sommo C (2010) Extending RSS to Meet Central Bank Needs. In: Proceedings of the International Conference on Dublin Core and Metadata Applications. Dublin Core Metadata Initiative, Pittsburgh
- [2] Cadenhead R, Smith G, Hanna J, Kearney B (2006) The application/rss+xml Media Type. The Internet Society. http://www.rssboard.org/rss-mime-type-application.txt. Accessed 22 November 2010
- [3] RAB (2010) Specification History. RSS Advisory Board. http://www.rssboard.org/rss-history. Accessed 23 November 2010
- [4] Beged-Dov G, Brickley D, Dornfest R, Davis I, Dodds L, Eisenzopf J, Galbraith D, Guha RV, MacLeod K, Miller E, Swartz A, van der Vlist E, et al (2008) RDF Site Summary (RSS) 1.0. RSS-DEV Working Group. http://web.resource.org/rss/1.0/spec. Accessed 23 November 2010
- [5] RAB (2008) The current version of the RSS Specification. RSS Advisory Board. http://www.rssboard.org/rss-specification. Accessed 23 November 2010
- [6] The Internet Corporation for Assigned Names and Numbers (2010) Permanent URI Schemes. Internet Assigned Numbers Authority. http://www.iana.org/assignments/uri-schemes.html. Accessed 26 November 2010
- [7] Wolf M, Wicksteed C (1997) Date and Time Formats. World Wide Web Consortium. http://www.w3.org/TR/NOTE-datetime. Accessed 27 November 2010
- [8] ISO (2008) ISO/IEC 19757-2:2008. Information technology – Document Schema Definition Language (DSDL) – Part 2: Regular-grammar-based validation – RELAX NG. International Organization for Standardization. www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=52348. Accessed 07 December 2010