
One of the most important concepts of HTML5 is the unified parsing model. In earlier versions of the HTML specification, the implementer had a lot of freedom. Basically one could choose how to handle certain scenarios. The resulting divergence was among the most important reasons for different interpretations of many websites across multiple browsers. With HTML5 the error handling has been specified in great detail, leaving no room for interpretations.
In this first article of the “HTML5 Mastery” series, we will have a look at the error handling algorithms specified in the standard. We will see that some of these potential errors are actually accepted behavior, making them usable in popular websites. The main part of this article will discuss the scoping rules, which cover most of the experienced behavior as they trigger certain algorithms. We start with a practical example that shows implied closing tags.
Real-World Examples
Two of the most used abilities of a standardized HTML5 parser are the automatic construction of a valid document structure and the insertion of implied end tags. We start with the latter. A really nice example to illustrate this feature is the construction of a simple list. The following code snippet shows an unsorted list with three items.
<ul> <li>First item <li>Second item <li>Third item </ul>
Even though we omit the closing tag (``), the page displays the right thing. The correct rendering is only possible if the DOM tree has been constructed correctly from the given source. The DOM tree is the representation of the DOM nodes in a tree hierarchy. A DOM node can be an element, a comment, text or some other construct, which is introduced in other parts of this series.
The DOM tree hierarchy starts at the so-called root and shows the root’s child nodes. A root is the first node of a tree. It has no parent element. Some nodes, such as elements, can have child nodes on their own. The tree gives us information about the actual constructed DOM, whereas the source just proposes a construction.
The HTML5 parser ensures that the omitted end tags are inserted before new list items are added. This fits with our intuition. Naturally we feel that the given list has to have three items. Before HTML5 we could actually end up with a single item that contained text and another item, which again contained text and an item with text.
In the browser we see the following rendering (left side). We can also inspect the DOM tree to check if the parser handled the scenario correctly (right side).

Even though Opera (here in version 31) has been used to make the screenshots above, the behavior is browser independent as it was specified in the HTML5 standard.
By looking at the source code of some very popular sites we will possibly see some oddities. For instance the error page displayed by Google contains the following markup.
<!DOCTYPE html> <html lang=en> <meta charset=utf-8> <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width"> <title>Error 404 (Not Found)!!1</title> <style>/* ... */</style> <a href=//www.google.com/><span id=logo aria-label=Google></span></a> <p><b>404.</b> <ins>That’s an error.</ins> <p>The requested URL <code>/error</code> was not found on this server. <ins>That’s all we know.</ins>
Even without specifying a or
element, the elements are constructed. Also the paragraphs have implied closing tags. Apparently paragraphs are treated similar to list items. They do not occur nested—at least when constructed from source by an HTML5 parser.
Therefore the browser generates the tree on the right, which will render the layout on the left.

One of the most important lines in the preceding code is the declaration of the correct HTML5 document type. Otherwise we might end up in quirks mode, which is essentially equivalent to undefined behavior in terms of cross-browser development.
Implied End Tags
The behavior we’ve seen in the previous section relies on the generation of implied end tags. The mechanism for generating implied end tags works by closing the current node while the current node is one of the following elements:
-
<dd>
or `</dd>
Subscribe below and we’ll send you a weekly email summary of all new Code tutorials. Never miss out on learning about the next big thing.
Update me weeklyEnvato Tuts+ tutorials are translated into other languages by our community members—you can be involved too!
Translate this post