Introduction to XHTML: Differences with HTML 4.
February 2, 2000
Mandatory tags
- The <head> and <body> elements cannot be omitted.
Tag and attribute names must be written in lower-case.
Since XML is case-sensitive,
XHTML element and attribute names must be written in lowercase.
You can no longer get away with what many people used to do to improve
readability of code: typing the element and attribute names in
uppercase and the values in lowercase.
Attribute values can be any case you want.
For example, the "#ffcc33" value below can also be written as "#FFCC33".
| HTML: | XHTML: |
| <TD BGCOLOR="#ffcc33"> | <td bgcolor="#ffcc33"> |
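The case-sensitivity rule can be observed with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for the XML parser; the helper name parses_as_xml is hypothetical and not part of the original article.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# To an XML parser, <TD> and <td> are two different element names,
# so a start tag and an end tag that differ only in case do not match.
mixed_case = '<TD bgcolor="#ffcc33">cell</td>'
lower_case = '<td bgcolor="#ffcc33">cell</td>'

print(parses_as_xml(mixed_case))
print(parses_as_xml(lower_case))

# Attribute values keep whatever case they were written in.
cell = ET.fromstring('<td bgcolor="#FFCC33">cell</td>')
print(cell.tag, cell.get("bgcolor"))
```

Note that a well-formedness check alone would accept consistently uppercase names such as <TD>...</TD>; it is the XHTML DTD, which declares every name in lowercase, that rules them out.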
Elements must nest, no overlapping
Most browsers don't care if you overlap elements.
For example, if you have a bold tag at the end of a paragraph,
it rarely matters whether you close the </b> first or the </p> first.
With XML and XHTML,
you need to close the tags in the reverse of the order they were opened,
i.e. last opened, first closed.
| HTML: | XHTML: |
| <p>Be <b>bold!</p></b> | <p>Be <b>bold!</b></p> |
Although overlapping is also illegal in HTML,
it was widely tolerated in existing browsers.
An XHTML document must be well-formed XML.
It must conform to basic XML syntax.
If it does not, the XML parser does not have an obligation to continue
processing the document.
Unlike current HTML parsers,
an XML parser will not try to recover and "guess" what you meant if the
syntax is wrong.
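To see how strict an XML parser is about nesting, the two variants of the bold paragraph can be fed to one directly. This is only a sketch, assuming Python's xml.etree.ElementTree as the parser; well_formedness_error is a hypothetical helper name introduced for illustration.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(markup):
    """Return None if the markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None

overlapping = "<p>Be <b>bold!</p></b>"
nested = "<p>Be <b>bold!</b></p>"

# The overlapping version stops the parser with an error message;
# the properly nested version parses and yields None.
print(well_formedness_error(overlapping))
print(well_formedness_error(nested))
```

The parser reports the first problem and stops; it makes no attempt to repair the markup the way an HTML browser would.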
All non-empty elements must be closed
Every non-empty element needs an explicit end tag;
an XML parser will not infer a missing one.
Many people used the <p> tag on its own to separate paragraphs,
but the <p> element is designed to mark the beginning and end of a paragraph.
That makes it a "non-empty" element, since it contains the paragraph text.
| HTML: | XHTML: |
| First paragraph<p> | <p>First paragraph</p> |
| Second paragraph<p> | <p>Second paragraph</p> |
Affected Elements:
<body>, <colgroup>, <dd>, <dt>, <head>, <html>,
<li>, <option>, <p>, <tbody>/<thead>/<tfoot>, <th>/<td>,
<tr>.
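One way to find end tags that were left out of existing HTML is to track which start tags never get closed. The sketch below assumes Python's tolerant html.parser module is used to read the old markup; the class and function names are hypothetical, and the approach is deliberately simple rather than a full validator.

```python
from html.parser import HTMLParser

class UnclosedTagFinder(HTMLParser):
    """Track start tags that never receive a matching end tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Remove the most recently opened element with this name, if any.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index] == tag:
                del self.open_tags[index]
                break

def unclosed_tags(markup):
    """Return the names of elements that were opened but never closed."""
    finder = UnclosedTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.open_tags

print(unclosed_tags("First paragraph<p>\nSecond paragraph<p>"))
print(unclosed_tags("<p>First paragraph</p>\n<p>Second paragraph</p>"))
```

Any name left in the returned list marks an element that still needs an end tag (or, for an empty element, the trailing slash) before the document can be XHTML.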
Empty elements must be terminated
Empty elements (i.e. those without a closing tag) have no content.
So while a <p> tag contains a paragraph,
and a <b> tag contains text to be bolded,
a <br> tag is "empty" because it never contains anything.
Other tags like this are <hr> and <img src="valid.gif">.
All empty elements must use the XML "empty tag" syntax,
with a trailing forward slash
("/") before the closing bracket (e.g. <br> becomes <br />).
Note the space between the element name and the />.
This is for compatibility with current browsers.
| HTML: | XHTML: |
| <hr> | <hr /> |
| <br> | <br /> |
| <input ... > | <input ... /> |
| <param ... > | <param ... /> |
| <img src="valid.gif"> | <img src="valid.gif" /> |
Affected Elements:
<area>, <base>, <basefont>, <br>, <col>, <frame>, <hr>, <img>,
<input>, <isindex>, <link>, <meta>, <param>.
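Converting existing pages by hand is tedious, so the rewrite can be roughly automated. The following is a naive sketch in Python, assuming a regular expression is good enough for simple markup; the names EMPTY_ELEMENTS and terminate_empty_elements are hypothetical, and the pattern will misfire on attribute values that contain a ">" character.

```python
import re

# Elements that HTML 4 declares as empty (assumed list for this sketch).
EMPTY_ELEMENTS = ("area", "base", "basefont", "br", "col", "frame", "hr",
                  "img", "input", "isindex", "link", "meta", "param")

# Match the start tag of an empty element, with or without a trailing slash.
_EMPTY_TAG = re.compile(
    r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(EMPTY_ELEMENTS),
    re.IGNORECASE,
)

def terminate_empty_elements(markup):
    """Rewrite <br> or <br/> as <br />, keeping any attributes."""
    return _EMPTY_TAG.sub(r"<\1\2 />", markup)

print(terminate_empty_elements('<img src="valid.gif">'))
print(terminate_empty_elements("Line one<br>Line two<hr />"))
```

Tags that are already terminated pass through unchanged, so the function can safely be run more than once over the same text.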
Attribute values must be quoted
No more <img ... border=0>.
You now need to quote every attribute value, even if it's numeric:
| HTML: | XHTML: |
| <img ... border=0> | <img ... border="0" /> |
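The quoting rule is enforced by the XML parser itself, not just by convention. A short sketch, again assuming Python's xml.etree.ElementTree as a stand-in parser and using variable names invented for this illustration:

```python
import xml.etree.ElementTree as ET

unquoted = '<img src="valid.gif" border=0 />'
quoted = '<img src="valid.gif" border="0" />'

# An unquoted value is a syntax error for an XML parser.
try:
    ET.fromstring(unquoted)
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

# Once quoted, the value is read back as a string, numeric or not.
image = ET.fromstring(quoted)
border = image.get("border")

print(unquoted_accepted)
print(repr(border))
```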
Attribute-value pairs cannot be minimized
An attribute is minimized when it is written as a single word rather than
a name="value" pair, which HTML allows where there is only one possible value.
XML does not allow attribute minimization.
Stand-alone attributes must be expanded
(e.g. <td nowrap>text</td> becomes
<td nowrap="nowrap">text</td>).
| HTML: | XHTML: |
| <dl compact> | <dl compact="compact"> |
| <ul compact> | <ul compact="compact"> |
| <option ... selected> | <option ... selected="selected"> |
| <td nowrap> text </td> | <td nowrap="nowrap"> text </td> |
| <input type="radio" ... checked> | <input type="radio" ... checked="checked" /> |
| <input type="checkbox" ... checked> | <input type="checkbox" ... checked="checked" /> |
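The expansion rule is mechanical: a stand-alone attribute takes its own name as its value. Here is a minimal sketch of that rule, assuming Python's html.parser module, which reports a minimized attribute with the value None; StartTagExpander and expand_start_tags are hypothetical names.

```python
from html import escape
from html.parser import HTMLParser

class StartTagExpander(HTMLParser):
    """Collect start tags with minimized attributes written out in full."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None:
                # A minimized attribute: reuse its name as the value.
                value = name
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.start_tags.append("<%s>" % " ".join(parts))

def expand_start_tags(markup):
    """Return every start tag in the markup with its attributes expanded."""
    expander = StartTagExpander()
    expander.feed(markup)
    expander.close()
    return expander.start_tags

print(expand_start_tags("<td nowrap>text</td>"))
print(expand_start_tags('<option value="1" selected>One</option>'))
```

The sketch only rebuilds start tags and always quotes the values; it does not add the trailing slash that empty elements such as <input> also need.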
<script> and <style> elements
In XHTML,
the script and style elements are declared as having #PCDATA content.
As a result, < and & will be treated as the start of markup,
and entities such as &lt; and &amp;
will be recognized as entity references by the XML processor and expanded to <
and & respectively.
Wrapping the content of the script or style element within a CDATA
marked section avoids the expansion of these entities.
The only delimiter that is recognized in a CDATA section is the "]]>"
string, which ends the CDATA section.
| XHTML: |
<script language="JavaScript" type="text/javascript">
<![CDATA[
document.write("<b>Hello World!</b>");
]]>
</script>
|
In a browser that recognises the CDATA section,
this script writes "Hello World!" in bold into the page.
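The difference between bare, escaped, and CDATA-wrapped script content can be checked with an XML parser. This is a sketch under the assumption that Python's xml.etree.ElementTree behaves like the XML processor described here; script_text is a hypothetical helper, and the script body is an invented example.

```python
import xml.etree.ElementTree as ET

bare = "<script>if (a < b && b < c) { go(); }</script>"
escaped = "<script>if (a &lt; b &amp;&amp; b &lt; c) { go(); }</script>"
wrapped = "<script><![CDATA[if (a < b && b < c) { go(); }]]></script>"

def script_text(markup):
    """Return the element's text, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None

# Bare < and & are read as markup, so the first version is rejected.
print(script_text(bare))
# Entity references are expanded back to < and & by the parser.
print(script_text(escaped))
# Inside a CDATA section the characters are taken literally.
print(script_text(wrapped))
```

Both the escaped and the wrapped form hand the same script text to the application; CDATA simply lets the source stay readable.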
You can avoid using CDATA by using
<script language="JavaScript" type="text/javascript" src="myscript.js"></script>
to read your script code off the server,
or by using <link rel="stylesheet" type="text/css" href="mystylesheet.css" />
to load an external CSS file.
CDATA is only needed when you put the code inside the script or style element itself.