[Image: biscuits (cookies or bread) on a plate]

Who wants a biscuit?

Up until recently, the way a lot of people learned HTML was by checking the source code of a page and seeing how it was written. While this is an excellent way to “get under the hood” and really get your hands dirty in a learning experience, there are some drawbacks.

For example, as with learning the meanings of unfamiliar words from context, there’s always the chance that the true meaning of the word, or, in our case, the actual use of an element, will be misunderstood. That’s especially true if the element wasn’t used correctly in the first place. Many a web page has been written with poor semantics, either out of ignorance of correct tag usage or because of the that’s-the-way-we’ve-always-done-it attitude, and learning how to write HTML from those pages just perpetuates poor semantics.

Why should we care about semantics and write semantically correct HTML? Good question. Let’s start with what semantics means. Semantics refers to meaning. If you are talking about what you had for a snack and tell someone you had a biscuit, you could be talking about a cookie (if you were British) or a type of bread (if you were American). In order to understand the meaning of the word “biscuit” you must know the context, in this case whether the speaker is British or American.

In the realm of HTML, there is only one context (conveyance of information) and everyone who writes HTML understands it. This is, of course, where my analogy starts to break down. So, if everyone understands the context of HTML, why are good semantics important? To understand that, we need to go back a bit in time and look at how HTML was first used.

The basic elements of HTML (<p>, <h1>, etc.) each come with a default appearance in the browser. By default, the <p> element starts its text on a new line and is set off from the surrounding content by a blank line. The <h1> element is displayed in large, bold text, also on its own line. People started using the tags not as ways of identifying the different parts of information, but as formatting elements to get the text to look the way they wanted. Before CSS was widely supported, designers even used the <table> element to format a page in a grid, with each cell of the table containing a sliced-up piece of the page. With this use of tags, the actual meaning of the element was lost. In other words, semantics be damned.

So back to our question: why are good semantics important? There are several reasons, ranging from accessibility to cost reduction. We’ll explore a few of them here, along with some specific examples of HTML tags and their proper use.

First, accessibility. Enter the screen reader. This is software that allows people with limited sight to browse the web. It doesn’t simply read the text of the page as we see it; it uses the tags in the source code to get oriented and then reads the text those tags define. As I understand it, a screen reader announces headings as headings and lets the user jump between them, so the <h1> is typically taken to be the title of the content. If what is defined by the <h1> tag is not actually the title of the content, but instead something trivial that the page designer wanted to be larger and set apart from the rest of the content, then you can imagine the annoyance and frustration of users at not being able to parse the information on the page and find what they are looking for.
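To make the heading point concrete, here is a minimal sketch of a page whose headings describe its structure. The page text and the “promo” class name are made up for illustration and are not taken from any real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantics in HTML</title>
  <style>
    /* Hypothetical class: makes the promo large without calling it a heading */
    .promo { font-size: 2em; font-weight: bold; }
  </style>
</head>
<body>
  <!-- Misuse would be: <h1>Free shipping this week!</h1> -->
  <p class="promo">Free shipping this week!</p>

  <!-- The real title of the content gets the <h1> -->
  <h1>Semantics in HTML</h1>
  <p>Semantics refers to meaning.</p>

  <!-- Subsections use lower-level headings -->
  <h2>Blockquote</h2>
  <p>The blockquote element defines a block of quoted text.</p>
</body>
</html>
```

The promo line still looks big and bold on screen, but a user navigating by headings lands on the actual title first.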

Let’s look at three examples of HTML tags and how they’re used with proper semantics.

Blockquote

The <blockquote> element defines a block of quoted text. This is useful to a user with limited sight because the screen reader can identify the content as a quotation. As for formatting, browsers by default start the <blockquote> on a new line and indent its content. The tag allows attributes to further identify the content, namely “cite” (the URL of the quotation’s source) and “title” (a general-purpose attribute available on any element), which make the content that much more relevant to the user. In proper use, the code looks like this:

<blockquote cite="http://www.famousquotesandauthors.com/authors/f__d__roosevelt_quotes.html" title="Franklin D. Roosevelt">“Freedom of conscience, of education, of speech, of assembly are among the very fundamentals of democracy and all of them would be nullified should freedom of the press ever be successfully challenged.”</blockquote>

and the output looks like this (note that the quote itself is indented and that the assigned “title” attribute appears when the mouse hovers over the quote):

[Image: example of blockquote tag output]

Code & Preformatted Text

The <code> element defines computer code. Browsers display the content of the <code> tag in the default monospace typeface, which is useful for writing tutorials and the like. The tag does not stop the browser from rendering any HTML inside it, though, so to show markup as written, the angle brackets must be escaped as the character entities &lt; and &gt;. One caution, however: make sure the line length isn’t so long that the line gets cut off or causes a horizontal scroll bar. (The blockquote example isn’t contained in a <code> set because it got cut off.) In proper use it looks like this:

<code>
&lt;p&gt;This is an example of the paragraph tag and it contains an example of the &lt;em&gt;emphasis&lt;/em&gt; tag, which displays what’s inside the container in italics.&lt;/p&gt;
</code>

and the output looks like this:

[Image: example of code tag output]

To display computer code with its formatting intact, use the preformatted text tag. The <code> tag set is nested inside the <pre> tag set, and the <pre> tag forces the browser to retain the spaces and returns written in the source code. Proper use looks like this:

<pre>
   <code>
      h2
         { font-size: 2em;
           font-family: arial;
           color: orange;
           border: 3px solid red;
         }
   </code>
</pre>

and the output looks like this:

[Image: example of pre tag output]

The designer doesn’t have to deal with any other formatting and the content is properly defined. You’ll notice there’s a difference in how the two examples are displayed. This is due to how WordPress CSS handles the different tags. Both, however, are displayed in fonts other than that of the body copy.

Back to Good Semantics

In the past, these tags have been used to display text as indented, monospaced, or with the spacing in the source code, not necessarily to define the actual content of the page. Using HTML elements for design defeats the purpose of the tags and renders them useless for content management. In addition, misusing the tags and ignoring good semantic practice can lead to code bloat and cost increases.
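As a rough illustration of the difference, the sketch below indents a line of ordinary text with CSS and reserves <blockquote> for an actual quotation. The “indented” class name, the 40px value, and the store-hours text are assumptions chosen for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Indenting without blockquote</title>
  <style>
    /* Hypothetical class: the indent is a presentation choice, so it lives in CSS */
    .indented { margin-left: 40px; }
  </style>
</head>
<body>
  <!-- Misuse would be: <blockquote>Store hours: 9 to 5, Monday through Friday.</blockquote> -->
  <p class="indented">Store hours: 9 to 5, Monday through Friday.</p>

  <!-- Proper use: the content really is a quotation -->
  <blockquote cite="http://www.famousquotesandauthors.com/authors/f__d__roosevelt_quotes.html">
    <p>“Freedom of conscience, of education, of speech, of assembly are among the very fundamentals of democracy and all of them would be nullified should freedom of the press ever be successfully challenged.”</p>
  </blockquote>
</body>
</html>
```

Both blocks may look similar on screen, but only the second one tells a screen reader, or any other program reading the markup, that it contains quoted text.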

Modern practice in creating web pages uses HTML to define the content and CSS (cascading style sheets) to format the content. With this ideal, code can be simpler, cleaner, and easier to manage. If HTML is used to handle the display of the content, then every time a change needs to be made to the display someone has to go through the code line by line and make the changes. This can take hours and be very costly.

If the formatting is handled externally by CSS, then it’s an easy change across the whole site. Only one rule needs to be changed in order to make all blocks of quoted text render in a blue serif font, as long as the HTML tags have been used properly. Applying that change is as simple as defining <blockquote> in the CSS to be blue and serif.
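Here is a minimal sketch of that single rule. On a real site it would normally sit in an external stylesheet linked from every page; it is placed in a <style> element here only to keep the example self-contained, and the specific font names are assumptions.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Styling every blockquote at once</title>
  <style>
    /* One rule: every <blockquote> on the page is blue and set in a serif font */
    blockquote {
      color: blue;
      font-family: Georgia, "Times New Roman", serif;
    }
  </style>
</head>
<body>
  <blockquote><p>First quotation.</p></blockquote>
  <p>Ordinary body copy is unaffected.</p>
  <blockquote><p>Second quotation, styled by the same rule.</p></blockquote>
</body>
</html>
```

Changing the color later means editing one declaration, no matter how many quotations the site contains.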

So, you can see how practicing good semantics makes the web a friendlier place, both for users and authors. Now, who wants a biscuit?
