Using Web Standards to Create HTML Files
More than writers in most industries, Help authors and technical writers know
about standards. We design them for our companies. We ask others to follow the
standards we've designed. We implement style guides, formatting guides, standard
procedures and techniques, and more.
We create well structured documents that use consecutive heading levels to
identify information hierarchies. We format documents using styles and we cringe
when we have to edit documents that aren't well structured or well formatted.
We've already learned the importance of structure and formatting. We know that
it costs money and time to fix incorrect documents. In this difficult economy,
when all companies are looking for solid returns on investments, it is important
to do what we can to save both time and money.
I've been working with a client to create HTML documentation generated from
Word documents, using Quadralay's WebWorks Publisher WordHelp. My team and I
reformatted and restructured 1200 Word documents and built six WordHelp projects.
Unfortunately, the previous consultant had implemented a hierarchy that sometimes
used Heading 2 followed by Heading 4 and sometimes Heading 2 followed by Heading
5, instead of a consistent heading structure (for example, Heading 1, Heading
2, and Heading 3). WordHelp relies on the structure of the Word documents to
build the table of contents. If the structure of the Word documents is inconsistent,
the online table of contents doesn't display correctly. The previous consultant
never taught the users how to restyle a paragraph, so we had to verify the formatting
of every paragraph in every document. Otherwise, the mappings in WordHelp would
cause the content to display in a completely different font or color or size.
It took us 600 hours to complete the six projects. Had the 1200 documents been
correctly structured and formatted, it would have taken a tenth of the time
(60 hours) to build the six projects.
Standards are important.
In this article, I'll discuss the latest standards from the World Wide Web
Consortium (W3C), what you need to know about them to create HTML pages, and
what these standards mean to us as Help authors.
Why Do Web Standards Matter?
When you use Web standards to create and design your HTML pages, they are future-proof,
backward compatible, and cross-browser capable (the appearance may be different,
but the content will be displayed reliably). Pages that are designed to Web
standards work in new browsers without modification and work with any Internet
device, while still allowing users with older browsers to see the content. Using
standards doesn't cost more time than working without them (actually, the reverse
is usually true). And coding to standards means that your pages are easier to
In contrast, sites designed for specific browser functionality have to be redesigned
each time a new standard is released. For example, when I originally designed
my site at helpstuff.com and used rollover tabs, I targeted specific browser
functionality, because my rollover tabs weren't designed to work with one of
the latest standards. This was after I had already spent hours getting the tabs
to work in Netscape 4. Since I had to rework the site anyway, I changed some
of the code to work with specific functionality, not specific browsers. Now
that the code works with the latest standards, I don't have to worry about new
browsers any longer. As long as browsers support the standards, my pages will
be displayed correctly.
According to the W3C, which is responsible for designing and maintaining these
standards, HTML documents are supposed to be structured around headings, paragraphs,
lists, and other paragraph items. Unfortunately, many times an HTML page is
designed to mimic print by using FONT tags to control character formatting,
BR tags to control line breaks, and non-breaking spaces to control layout. While
pages designed this way will often be displayed correctly, maintenance is harder
and changes can use a large part of the budget. For example, if all headings
have been manually formatted (using FONT tags) as bold navy sans serif, a change
requires manually changing every heading to the new style characteristics. When
headings are formatted with a Cascading Style Sheet (CSS) definition and tagged
appropriately with <h1>, a change requires a modification to the CSS.
Not only is this quicker and easier, but it's also much cheaper.
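As a minimal sketch of that difference (the heading text and font choices are illustrative assumptions, not taken from a real project), the first heading below carries its own FONT markup, while the second is tagged as a heading and styled by one CSS rule:

```html
<!-- Print-style formatting: every heading repeats its own FONT markup,
     so a design change means editing every heading by hand -->
<p><font face="Arial, Helvetica, sans-serif" color="navy" size="5"><b>Installing the Software</b></font></p>

<!-- Structural markup: this rule normally lives in the head of the page
     or in an external style sheet, and is written only once -->
<style type="text/css">
  h1 {
    font-family: Arial, Helvetica, sans-serif;
    font-weight: bold;
    color: navy;
  }
</style>

<!-- Each heading is simply tagged as a heading -->
<h1>Installing the Software</h1>
```

Changing every heading from navy to another color then means editing the single "color" declaration rather than each heading.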
What Are Web Standards?
The W3C follows a process called the "Recommendation track" to build consensus
for a Web technology. The steps in the Recommendation track are:
- Working Draft
- Candidate Recommendation
- Proposed Recommendation
- W3C Recommendation
A Recommendation is the end result of the process, considered appropriate for
widespread deployment by the W3C. In other words, a Recommendation is a Web
standard.
You can find the complete list of Recommendations at http://www.w3.org/TR/#Recommendations,
as well as details on any Recommendation. Start at the list of Recommendations
and follow the links. Later Recommendations usually build on earlier ones. You
can also see the list of Proposed Recommendations, Candidate Recommendations,
Working Drafts, and Notes and learn about the process that the W3C follows when
considering a new standard.
While all Recommendations are important, this article focuses on the standards
for XHTML and Cascading Style Sheets (CSS).
XHTML became the standard in 2000 and several new Recommendations have been
released since then. The advantage of XHTML is its portability: valid XHTML
files can be viewed on any Internet device. This means that, unlike the days
of Windows CE Help, we don't need a special compiler for our information to
display on a selected device.
XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition, 26 January
2000; revised 1 August 2002)
From the Abstract at http://www.w3.org/TR/xhtml1/, XHTML 1.0 is the "reformulation
of HTML 4 as an XML 1.0 application, and three DTDs corresponding to the
ones defined by HTML 4." And, XHTML 1.0 is "a reformulation of the three HTML 4
document types as applications of XML 1.0 [XML]." According to Jeffrey
Zeldman (A List Apart), "XHTML is XML that browsers think is HTML."
XHTML introduces the concept of "well-formedness." With HTML 4.0, authors could
pretty much do whatever they wanted: tags could be uppercase, lowercase, or a
mixture of both; attribute values could be quoted or not; empty elements could
be left unclosed; and more. XHTML 1.0 is stricter: it requires lowercase tags,
quoted attribute values, closed empty elements, and more.
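The fragment below is a small illustration of those rules. The file names (setup.htm, logo.gif) are hypothetical. The first version is the kind of markup HTML 4 tolerates; the second expresses the same content as well-formed XHTML 1.0:

```html
<!-- Tolerated in HTML 4: mixed-case tags, unquoted attribute values,
     and elements that are never closed -->
<P>Click <A HREF=setup.htm>Setup</a> to begin.
<BR>
<IMG SRC=logo.gif ALT=Logo>

<!-- The same content as XHTML 1.0: lowercase tags, quoted attribute
     values, and every element closed (empty elements end with " />") -->
<p>Click <a href="setup.htm">Setup</a> to begin.
<br />
<img src="logo.gif" alt="Logo" /></p>
```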
XHTML 1.0 includes three Document Type Definitions (DTD):
- Strict: You cannot use presentational markup, such as the "align"
attribute of the image tag.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- Transitional: You may use some presentational markup. Transitional
is a bit easier for those already familiar with HTML and CSS, as it allows for
some flexibility when creating HTML pages.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- Frameset: Use with frames.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
XHTML 1.0 Transitional is the most commonly used DTD. See Creating Valid HTML/XHTML
for more information.
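For orientation, here is a minimal sketch of a complete XHTML 1.0 Transitional page. The title, text, and image file name are placeholders invented for this example. The DOCTYPE and namespace are the ones published in the XHTML 1.0 Recommendation:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Sample Help Topic</title>
</head>
<body>
  <h1>Sample Help Topic</h1>
  <!-- The presentational "align" attribute is accepted by the
       Transitional DTD but not by the Strict DTD -->
  <p><img src="logo.gif" alt="Company logo" align="right" />
  Topic text goes here.</p>
</body>
</html>
```

To move the same page to Strict, you would swap the DOCTYPE and replace the "align" attribute with a style sheet rule.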
XHTML™ Basic (19 December 2000)
From the Abstract at http://www.w3.org/TR/xhtml-basic/, "The XHTML Basic document
type includes the minimal set of modules.... It is designed for Web clients that
do not support the full set of XHTML features; for example, Web clients such
as mobile phones, PDAs, pagers, and settop boxes. The document type is rich
enough for content authoring."
XHTML Basic does not support the style, script, and noscript elements or frames,
as these items may not be applicable to small Internet devices.
XHTML™ 1.1 - Module-based XHTML (31 May 2001)
From the Abstract at http://www.w3.org/TR/xhtml11/, "The purpose of this document
type is to serve as the basis for future extended XHTML 'family' document types,
and to provide a consistent, forward-looking document type cleanly separated
from the deprecated, legacy functionality of HTML 4 [HTML4]
that was brought forward into the XHTML 1.0 [XHTML1] document types."
XHTML 1.1 is portable to different client devices, such as PDAs and cell phones,
as well as browsers. Many deprecated elements and attributes have been removed
from XHTML 1.1, because it relies on style sheets for presentation.
Cascading Style Sheets Standards
Cascading Style Sheets have been a Recommendation since 1996, although browser
support has been slow to catch up. (See Working with Various Browsers.) By using
CSS, you separate content and presentation, which allows users to control how
Web pages appear and also makes updating the presentation much easier. CSS3,
currently a Working Draft, separates the various aspects of CSS into separate
modules, which should make it easier for browsers and developers to support
Cascading Style Sheets Level 1 (17 December 1996, revised 11 January 1999)
From the Abstract at http://www.w3.org/TR/REC-CSS1, "This document specifies
level 1 of the Cascading Style Sheet mechanism (CSS1). CSS1 is a simple style
sheet mechanism that allows authors and readers to attach style (e.g., fonts,
colors and spacing) to HTML documents. The CSS1 language is human readable and
writable, and expresses style in common desktop publishing terminology."
The CSS1 Recommendation included the "cascade" feature and defined how conflicts
between various style sheets were resolved.
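The page below is a simplified sketch of how the cascade settles conflicts within a single author style sheet. The class name "note" and the colors are assumptions made up for the example, and it leaves out user style sheets and "!important" declarations:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Cascade example</title>
  <style type="text/css">
    p { color: black; }
    p { color: navy; }        /* same selector, declared later: wins over black */
    p.note { color: maroon; } /* more specific selector: wins over both "p" rules */
  </style>
</head>
<body>
  <p>Navy: the later of two equally specific rules applies.</p>
  <p class="note">Maroon: the more specific selector applies.</p>
  <!-- An inline style declaration overrides the selector-based rules above -->
  <p class="note" style="color: green;">Green: the inline style applies.</p>
</body>
</html>
```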
Cascading Style Sheets Level 2 (12 May 1998)
From the Abstract at http://www.w3.org/TR/REC-CSS2/, "This specification defines
Cascading Style Sheets, level 2 (CSS2). CSS2 is a style sheet language that
allows authors and users to attach style (e.g., fonts, spacing, and aural cues)
to structured documents (e.g., HTML documents and XML applications). By separating
the presentation style of documents from the content of documents, CSS2 simplifies
Web authoring and site maintenance."
CSS2 includes all the functionality of CSS1. It also supports media-specific
style sheets (which allow authors to customize the presentation for various
devices), paged media, relative and absolute positioning, and more.
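As a rough sketch of these CSS2 features, the page below uses one style sheet for the screen and another for print. The "nav" and "content" ids, the fonts, and the measurements are hypothetical, and support for these properties varies by browser:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Media-specific style sheets</title>
  <!-- On screen: navigation is absolutely positioned in the left margin -->
  <style type="text/css" media="screen">
    body { font-family: Verdana, sans-serif; }
    #nav { position: absolute; top: 0; left: 0; width: 12em; }
    #content { margin-left: 13em; }
  </style>
  <!-- In print: navigation is hidden and each h1 starts a new page -->
  <style type="text/css" media="print">
    body { font-family: Georgia, serif; }
    #nav { display: none; }
    h1 { page-break-before: always; }
  </style>
</head>
<body>
  <div id="nav"><a href="index.htm">Contents</a></div>
  <div id="content">
    <h1>Sample Help Topic</h1>
    <p>Topic text goes here.</p>
  </div>
</body>
</html>
```

The content is written once; only the presentation changes with the output device.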
Standards are great, but they aren't equally supported in the various browsers.
Check the table below to see how Cascading Style Sheets are supported in each
browser. For more information, see CSS2 Tests,
which compares various CSS2 properties against:
- Netscape 4
- Internet Explorer 4 and 5 (Windows and Mac)
- Internet Explorer 5.5 and 6 (Windows)
- Opera 5, 6, and 7
Konqueror lists its CSS2 support at http://www.konqueror.org/content/khtml_css2.html.
So what do you do?
First, decide which browsers you want to support. According to Vincent Flanders
in Chapter 4 of Son of Web Pages That Suck, "The correct answer is that your
site should look good in all browsers. The second-best correct answer is that
you should support the browsers your visitors use."
The hardest part for Help authors to understand is that "look good" doesn't mean "look identical." We very carefully check our HTML output in as many browsers as possible, tweaking to make sure that it looks the same. I've been guilty of furthering this by presenting sessions at WinWriters Annual Conferences that talk about how to accommodate the browser differences. However, instead of spending hours tweaking the code, we can now spend our time creating content.
If a target browser has partial or buggy support for one of the current CSS
standards, it doesn't mean that CSS doesn't work at all; it means
that some parts of the specification don't work. Even Netscape 4 gets a few
Internet Explorer tends not to be a good test browser, as it doesn't demand
compliance; it displays content that won't display in a compliant
browser. For example, if you forget the closing table tag, the page appears
in Internet Explorer as if the close tag was present. When testing your pages,
use a compliant browser, such as Mozilla, Opera, or Netscape 6.
IE4 (partial), 5 (partial and buggy), 5.5 (buggy), 6
N4 (buggy), 6, 7
Opera 4 (partial), 5 (Mac), 6 (Windows), 7 (still in Beta)
Mac 4.x (buggy)
Mac IE 5
IE 5.5 (weak), 6
N6 (partial, Mac and Windows), 7
Opera 4 (partial), 5 (Mac), 6 (Windows), 7 (still in Beta)
Mozilla 1 (limited)
Mac IE 5 (partial)