
HTML5 Microdata is an HTML5 module defined in a separate specification, extending the HTML5 core vocabulary with attributes for representing structured data. Other machine-readable annotation formats for adding structured data to markup are RDFa (RDF in Attributes) and JSON-LD. Microformats, an older and more limited format, predates all of these.
Global HTML5 Microdata Attributes
HTML5 Microdata represents structured data as a group of name-value pairs. The groups are called items, and each name-value pair is a property. Items and properties are represented by regular elements. To create an item, the itemscope attribute is used. To add a property to an item, the itemprop attribute is used on a descendant of the item (an element nested inside the container element), as shown in Listing 1.
Listing 1. A Person’s Description in HTML5 Microdata
<div itemscope="itemscope" itemtype="http://schema.org/Person">
<span itemprop="name">John Smith</span>
<img src="johnsmith.jpg" alt="John Smith" itemprop="image" />
John’s web site:
<a href="http://www.johnsmithexample.com" itemprop="url">johnsmithexample.com</a>
</div>
Property values are usually strings (sequences of characters) but can also be web addresses, such as the value of the href attribute on the a element, the value of the src attribute on the img element, or the corresponding attribute of other elements that link to or embed external resources. In Listing 1, for example, the value of the image item property is the value of the src attribute on the img element, which is johnsmith.jpg. Similarly, the value of the url item property is not the content of the a element, johnsmithexample.com, but the value of the href attribute on the a element, which is http://www.johnsmithexample.com. For all other elements, the value of the property is by default the text content of the element, such as the value of the name item property in this example: John Smith (delimited by the <span> and </span> tag pair).
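The value rules described above can be sketched in a few lines of Python. This is only an illustration, not the extraction algorithm of the specification: it assumes that the markup is well-formed XML (as the XHTML-style listings here are), it covers only the a and img cases discussed above, it does not resolve relative web addresses, and the function names property_value and flat_properties are hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements whose property value is taken from an attribute instead of the text.
# Only the two cases discussed in the text are listed; this is not the full set.
URL_ATTRIBUTE = {"a": "href", "img": "src"}


def property_value(element):
    """Return the value of one element that carries an itemprop attribute."""
    attribute = URL_ATTRIBUTE.get(element.tag)
    if attribute is not None:
        # Web address taken as written (relative addresses are not resolved).
        return element.get(attribute, "")
    # Default case: the text content of the element.
    return "".join(element.itertext()).strip()


def flat_properties(markup):
    """Collect (name, value) pairs of a single item without nested items."""
    root = ET.fromstring(markup)
    return [
        (element.get("itemprop"), property_value(element))
        for element in root.iter()
        if element.get("itemprop") is not None
    ]


LISTING_1 = """
<div itemscope="itemscope" itemtype="http://schema.org/Person">
<span itemprop="name">John Smith</span>
<img src="johnsmith.jpg" alt="John Smith" itemprop="image" />
John’s web site:
<a href="http://www.johnsmithexample.com" itemprop="url">johnsmithexample.com</a>
</div>
"""

for name, value in flat_properties(LISTING_1):
    print(name, "=", value)
```

Applied to Listing 1, the sketch reports John Smith for name, johnsmith.jpg for image, and http://www.johnsmithexample.com for url, matching the values described above.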
The type of an item is expressed using the itemtype attribute, which declares the web address of the external vocabulary that defines the item type and its properties. In our example, we used the Person vocabulary from http://schema.org, which defines properties of a person, such as familyName, givenName, birthDate, birthPlace, gender, nationality, and so on. The full list of properties is defined at http://schema.org/Person, which is the value of the itemtype attribute. In the example, we declared the name of the person with the name property, the depiction of the person with the image property, and his web site address with the url property. The allowed values and expected format of these properties are available at http://schema.org/name, http://schema.org/image, and http://schema.org/url, respectively.
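The relationship between the item type and the property addresses mentioned above can be illustrated with a short Python sketch. It relies on a convention of schema.org, where types and properties share one namespace, and should not be read as a general rule of Microdata; other vocabularies may organize their terms differently. The function name property_url is hypothetical.

```python
from urllib.parse import urljoin


def property_url(itemtype, name):
    """Resolve a schema.org property name against the address of its item type."""
    # urljoin replaces the last path segment (Person) with the property name.
    return urljoin(itemtype, name)


for name in ("name", "image", "url"):
    print(property_url("http://schema.org/Person", name))
```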
The item type is different for each knowledge domain. If you want to annotate the description of a book rather than a person, the value of the itemtype attribute will be http://schema.org/Book, where the properties of books are collected and defined, such as bookFormat, bookEdition, numberOfPages, author, publisher, etc. If the item has a global identifier (such as the unique ISBN of a book), it can be annotated using the itemid attribute, as shown in Listing 2.
Listing 2. The Description of a Book in HTML5 Microdata
<div itemscope="itemscope" itemtype="http://schema.org/Book" itemid="urn:isbn:978-1-484208-84-7">
<img itemprop="image" src="http://www.masteringhtml5css3.com/img/webstandardsbook.jpg" alt="Web Standards" />
<span itemprop="name">Web Standards: Mastering HTML5, CSS3, and XML</span> by <a itemprop="author" href="http://www.lesliesikos.com">Leslie Sikos</a>
</div>
Although HTML5 Microdata is primarily used for semantic descriptions of people, organizations, events, products, reviews, and links, you can annotate any other knowledge domain using a suitable external vocabulary. Groups of name-value pairs can be nested in a Microdata property by declaring the itemscope attribute on the element that declares the property (see Listing 3).
Listing 3. Nesting a Group of Name-Value Pairs
<div itemscope="itemscope">
<p>Name: <span itemprop="name">Herbie Hancock</span></p>
<p>Band: <span itemprop="band" itemscope="itemscope">
<span itemprop="name">The Headhunters</span>
(<span itemprop="size">7</span> members)
</span>
</p>
</div>
In the preceding example, the outer item (top-level Microdata item) annotates a person, and the inner one represents a jazz band.
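To make the nesting concrete, the following Python sketch turns the markup of Listing 3 into nested dictionaries, where the inner item becomes the value of the band property. It is a simplified illustration under stated assumptions: the markup must be well-formed XML, only text-valued properties are handled, and the function names collect and parse_item are hypothetical.

```python
import xml.etree.ElementTree as ET


def collect(element, item):
    """Add the properties found below `element` to the dictionary `item`."""
    for child in element:
        name = child.get("itemprop")
        if child.get("itemscope") is not None:
            # The child starts a new item; its contents belong to that item.
            nested = {}
            collect(child, nested)
            if name is not None:
                item.setdefault(name, []).append(nested)
        else:
            if name is not None:
                text = "".join(child.itertext()).strip()
                item.setdefault(name, []).append(text)
            # Keep looking for properties deeper in the tree.
            collect(child, item)


def parse_item(markup):
    """Parse markup whose root element is the top-level item."""
    item = {}
    collect(ET.fromstring(markup), item)
    return item


LISTING_3 = """
<div itemscope="itemscope">
<p>Name: <span itemprop="name">Herbie Hancock</span></p>
<p>Band: <span itemprop="band" itemscope="itemscope">
<span itemprop="name">The Headhunters</span>
(<span itemprop="size">7</span> members)
</span>
</p>
</div>
"""

print(parse_item(LISTING_3))
```

Each property maps to a list of values because a property name may occur more than once in an item.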
An optional attribute of elements with an itemscope attribute is itemref, which gives a list of additional elements to crawl to find the name-value pairs of the item. In other words, properties that are not descendants of the element with the itemscope attribute can be associated with the item using the itemref attribute, which provides a space-separated list of identifiers of elements elsewhere in the document that contain additional properties (see Listing 4). The itemref attribute is not part of the HTML5 Microdata data model; it is only a syntactic construct for attaching such properties to an item.
Listing 4. Using the itemref Attribute
<div itemscope="itemscope" id="herbie" itemref="a b"></div>
<p id="a">Name: <span itemprop="name">Herbie Hancock</span></p>
<div id="b" itemprop="band" itemscope="itemscope" itemref="c"></div>
<div id="c">
<p>Band: <span itemprop="name">The Headhunters</span></p>
<p>Size: <span itemprop="size">7</span> members</p>
</div>
The first item has two properties: name, which declares the name of jazz keyboardist Herbie Hancock, and band, whose value is a separate item that annotates his jazz band. This second item has two further properties: name, which gives the name of the band as The Headhunters, and size, which sets the number of members to 7.
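How a consumer might follow itemref when crawling for properties can be sketched in Python as follows. This is a simplified illustration, not the crawling algorithm of the specification: it assumes well-formed XML markup, handles only text-valued properties, wraps the fragment in a body element so that it has a single root, and uses the hypothetical function names build_item, visit, and top_level_item.

```python
import xml.etree.ElementTree as ET


def build_item(scope, by_id, active=()):
    """Build the property dictionary of the item rooted at `scope`."""
    if scope in active:  # itemref loop: stop instead of recursing forever
        return {}
    active = active + (scope,)
    item = {}
    # Crawl the children of the item, then every element named in itemref.
    pending = list(scope)
    pending += [by_id[ref] for ref in scope.get("itemref", "").split() if ref in by_id]
    for element in pending:
        visit(element, item, by_id, active)
    return item


def visit(element, item, by_id, active):
    """Record `element` as a property of `item` if it has one, then descend."""
    name = element.get("itemprop")
    if element.get("itemscope") is not None:
        if name is not None:
            item.setdefault(name, []).append(build_item(element, by_id, active))
        return  # the contents of a nested item belong to that item
    if name is not None:
        item.setdefault(name, []).append("".join(element.itertext()).strip())
    for child in element:
        visit(child, item, by_id, active)


def top_level_item(markup, item_id):
    """Parse a fragment and build the item whose element has the given id."""
    root = ET.fromstring("<body>" + markup + "</body>")
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    return build_item(by_id[item_id], by_id)


LISTING_4 = """
<div itemscope="itemscope" id="herbie" itemref="a b"></div>
<p id="a">Name: <span itemprop="name">Herbie Hancock</span></p>
<div id="b" itemprop="band" itemscope="itemscope" itemref="c"></div>
<div id="c">
<p>Band: <span itemprop="name">The Headhunters</span></p>
<p>Size: <span itemprop="size">7</span> members</p>
</div>
"""

print(top_level_item(LISTING_4, "herbie"))
```

The resulting structure is the same as for the nested markup of Listing 3, which reflects the point that itemref changes only where the properties are written, not the data they express.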
HTML5 Microdata DOM API
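HTML5 Microdata was originally specified with a DOM API that allowed web developers to programmatically access the structured data represented in HTML5 Microdata. This API has since been removed from the specification and from browsers, so scripts that need the data typically read the itemscope, itemtype, and itemprop attributes using the regular DOM methods.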
You can read more on HTML5 Microdata in the book Mastering Structured Data on the Semantic Web.