Scatter/Gather thoughts

by Johan Petersson

A short note on HTML and XML terminology

While reading about the recent rel="nofollow" hoopla I was reminded of something annoying I've observed many times before. When discussing HTML and XML markup, please use the standard terminology! There are a few basic terms I think anyone working with these markup languages should know:

Tags are the names enclosed in angle brackets:

<blockquote>

Documents are built from elements, usually specified with a start tag and an end tag, with content in between:

<p>content</p>

Some elements have no content and need only one tag:

<hr/>

Elements may also need attributes, which have values. Here's an img element where src is an attribute and example.gif is the value of that attribute:

<img src="example.gif"/>

That wasn't too difficult, was it? And yet, lots of people who really ought to know better get this wrong. Announcements by Google and Yahoo refer to rel="nofollow" as an attribute. Perhaps close enough for government work, but not quite right: rel is the attribute and nofollow is its value. Jeremy Zawodny (who wrote the Yahoo search blog entry) calls it the "nofollow attribute" in his own blog; sloppy terminology shared with Search Engine Watch, among others.
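If you'd rather not take my word for it, ask a parser. Here is a minimal sketch using Python's standard html.parser module, which reports a document in roughly these terms. The TermLogger class, the describe function and the example.com link are my own hypothetical illustrations, not anything taken from the announcements.

```python
from html.parser import HTMLParser


class TermLogger(HTMLParser):
    """Hypothetical helper: records what the parser sees, using the standard terms."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute name, attribute value) pairs.
        self.events.append(("start tag", tag))
        for name, value in attrs:
            self.events.append(("attribute", name, value))

    def handle_endtag(self, tag):
        self.events.append(("end tag", tag))

    def handle_data(self, data):
        self.events.append(("content", data))


def describe(markup):
    """Parse a markup fragment and return the recorded events."""
    parser = TermLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Assumed example link; the URL is a placeholder.
link_events = describe('<a href="http://example.com/" rel="nofollow">a link</a>')
# The img element from earlier in this note.
img_events = describe('<img src="example.gif"/>')

for event in link_events + img_events:
    print(event)
```

For the link, the parser reports one a element with two attributes, href and rel, and the value of rel is nofollow. Nowhere does it report a nofollow attribute or a nofollow tag.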

Calling rel="nofollow" or nofollow a tag appears to be the most common error. The examples are numerous, but worst of all may be the official MSN Search blog entry with its <rel="nofollow"> tag. An IE-specific feature?

OK, you and I can still understand what they are trying to say. That's not the point. Professionals in any given field should consistently use proper terminology, primarily to avoid confusion but also because doing so allows for clear and efficient communication. When was the last time you heard a carpenter ask for a chisel when he meant a screwdriver?

11 February, 2005