HTML
December 6, 1999
Reprinted from
What is a Markup Language - March 8, 1999
HTML, as its name implies, is a markup language. As such, it
is used to mark up text. But what exactly does it mean to
mark up text?
Abstractly, marking up text is a methodology for
encoding data with information about itself. Examples of
markups (encoded data) are ubiquitous in the real world.
For example, back when you were slogging through high school,
you probably used to use a bright yellow highlighter pen to
highlight sentences in your schoolbooks (or at least you knew
someone who did!). You did so because you thought that the
highlighted sentences would be useful to review around exam
time and you wanted a quick way to skim through the important
points. Just like you, thousands of kids around the world
did the exact same thing for the exact same reason.
By highlighting certain bits of text, you were effectively
"marking-up" the data. Essentially, you specified that
certain sentences (data) were important by marking them in
yellow. These sentences became encoded with the
fact that they were important.
And what's more, since everyone followed the same
standard of marking up, you could easily pick up a used
textbook and get a good idea of its core points just from
reading the highlighted sections.
There are two crucial points to take away from this example.
For markups to transmit useful information about data to a
pool of users...
- a standard must be in place to define what a valid markup is -
In the example above, markup is defined as a bit of yellow
ink atop text. In HTML a markup is a tag.
- a standard must be in place to define what markup means -
In the example above, a yellow highlight means the highlighted
text represents an important point. In HTML each tag
communicates its own layout or formatting meaning.
Markups are also ubiquitous in the world of computers. They
are used by word processors to specify formatting and layout,
by communications programs to express the meaning of data
sent over the wires, by database applications that must
associate meaning and relationships with the data they serve,
and by multimedia processing programs which must express
meta-data about images or sound.
As data is sent through dumb computers and programs, it is
essential that the data carries with it information necessary
to communicate what the data means and/or what the receiver
should do with that data.
Data with no context is meaningless just as an unhighlighted
book is bad news around exam time!
HTML is one of the more famous computer markup systems. HTML
defines a set of tags that associate formatting rules with
bits of text. Documents which have been marked up (which
contain plain text as well as the tags that specify the rules
for formatting that text) are read by an HTML processing
application (a web browser, for example) that knows how to
display the text according to the rules.
For example, the <B> tag specifies a rule which
instructs an HTML processing application to bold a specific
bit of text. Similarly, the <CENTER> tag instructs
the HTML processing application to center the text.
Thus <CENTER><B>BOLD</B></CENTER>
would be displayed by an HTML processing application as:
BOLD
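To make the idea of an "HTML processing application" more concrete, here is a minimal sketch that uses the `html.parser` module from Python's standard library as a stand-in for a browser. It only reports the tags and text in the order it meets them; it does not display anything. The class name `EventLogger` is made up for this illustration.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Records each tag and text run in the order the parser meets them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


logger = EventLogger()
logger.feed("<CENTER><B>BOLD</B></CENTER>")
logger.close()

# The parser sees: open CENTER, open B, the text, close B, close CENTER.
for event in logger.events:
    print(event)
```

A real browser goes one step further and applies the formatting rule attached to each tag, but the first step, separating the tags from the text they mark up, looks much like this.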
You might imagine a client contact list which could look like
the following bit of HTML code:
<UL>
<LI>Gunther Birznieks
<UL>
<LI>Client ID: 001
<LI>Company: Bob's Fish Store
<LI>Email: gunther@bobsfishstore.com
<LI>Phone: 662-9999
<LI>Street Address: 1234 4th St.
<LI>City: New York
<LI>State: New York
<LI>Zip: 10024
</UL>
<LI>Susan Czigany
<UL>
<LI>Client ID: 002
<LI>Company: Netscape
<LI>Email: susan@eudora.org
<LI>Phone: 555-1234
<LI>Street Address: 9876 Hazen Blvd.
<LI>City: San Jose
<LI>State: California
<LI>Zip: 90034
</UL>
</UL>
The above HTML-encoded data would be displayed by an HTML
processing application as:
- Gunther Birznieks
  - Client ID: 001
  - Company: Bob's Fish Store
  - Email: gunther@bobsfishstore.com
  - Phone: 662-9999
  - Street Address: 1234 4th St.
  - City: New York
  - State: New York
  - Zip: 10024
- Susan Czigany
  - Client ID: 002
  - Company: Netscape
  - Email: susan@eudora.org
  - Phone: 555-1234
  - Street Address: 9876 Hazen Blvd.
  - City: San Jose
  - State: California
  - Zip: 90034
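The way nested <UL> tags become nested bullets can be sketched in a few lines of Python. This is only an illustration, not how a browser is actually built: it counts how many <UL> tags are currently open and indents each piece of text accordingly. The class name `OutlineBuilder` is hypothetical, and the sample is a shortened version of the contact list.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Turns nested <UL>/<LI> markup into an indented plain-text outline."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many <UL> elements are currently open
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore the line breaks that sit between tags
            indent = "  " * (self.depth - 1)
            self.lines.append(f"{indent}- {text}")


sample = """<UL>
<LI>Gunther Birznieks
<UL>
<LI>Client ID: 001
<LI>Company: Bob's Fish Store
</UL>
<LI>Susan Czigany
<UL>
<LI>Client ID: 002
<LI>Company: Netscape
</UL>
</UL>"""

builder = OutlineBuilder()
builder.feed(sample)
builder.close()
print("\n".join(builder.lines))
```

Each list item inside an inner <UL> comes out one level deeper than the name it belongs to, which is the structure the markup was written to express.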
Is HTML a Programming Language?
Although HTML is often called a programming language, it
really is not. General-purpose programming languages are
'Turing-complete'. That is, they can be used to compute
something, such as the square root of pi or some other such
task. Typically, programming languages use conditional
branches and loops and operate on data contained in abstract
data structures. HTML is much simpler than all of that. HTML
is simply a 'markup language' used to define a logical
structure rather than compute anything. It is sort of a
semantic issue, but it is one which you should be aware of.
The language itself is fairly simple and follows a few
important standards.
Firstly, a document is described with "HTML tags": instructions
enclosed between a less-than sign (<) and a greater-than
sign (>). To begin formatting, you specify a format type
between the < and the >. Most tags in HTML are ended with a
matching tag that has a slash in it, which marks the end of
the formatting. For example, to emphasize some text (most
browsers display emphasized text in italics), you would use
the following HTML code:
this text is not emphasized
<EM>this text is emphasized</EM>
this text is not emphasized
It is important to note that the tag names within an HTML
tag are case-insensitive. Thus, the following two versions
of the emphasis tag would both be understood by a web
browser:
<em>this text is emphasized</em>
this text is not
<EM>this text is emphasized</EM>
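One way to see case-insensitivity in action is to feed differently cased tags to Python's standard `html.parser`, which reports every tag name in lower case. This is a small sketch rather than a description of how any particular browser works, and the class name `TagNames` is made up for the example.

```python
from html.parser import HTMLParser


class TagNames(HTMLParser):
    """Collects the name of every opening tag it meets."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)


collector = TagNames()
collector.feed("<em>one</em> <EM>two</EM> <Em>three</Em>")
collector.close()

# All three spellings are reported as the same tag.
print(collector.names)  # ['em', 'em', 'em']
```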
You can also compound formatting styles together in HTML.
However, you should be very careful to "nest" your code
correctly. For example, the following HTML code shows correct
and incorrect nesting:
<CENTER><EM>this text is emphasized and centered
correctly</EM></CENTER>
<EM><CENTER>this text is emphasized and centered
incorrectly</EM></CENTER>
In the incorrect version, notice that the EM tag was closed
before the CENTER tag, even though the EM tag was opened
first. The general rule is that tags on the inside should
be closed before tags on the outside.
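The nesting rule amounts to "last opened, first closed", which is how a stack behaves. The following Python sketch checks a snippet against that rule. It assumes every tag in the snippet has a closing tag, so it would wrongly complain about tags such as <LI> that are commonly left unclosed; the names `NestingChecker` and `check_nesting` are hypothetical.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Flags closing tags that do not match the most recently opened tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags opened but not yet closed
        self.errors = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if not self.open_tags:
            self.errors.append(f"</{tag}> closed but no tag was open")
        elif self.open_tags[-1] == tag:
            self.open_tags.pop()  # innermost tag closed first: correct
        else:
            inner = self.open_tags[-1]
            self.errors.append(f"</{tag}> closed while <{inner}> was still open")


def check_nesting(markup):
    """Return a list of nesting problems found in the markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


print(check_nesting("<CENTER><EM>text</EM></CENTER>"))
print(check_nesting("<EM><CENTER>text</EM></CENTER>"))
```

The first call reports no problems. The second reports that the EM tag was closed while the CENTER tag inside it was still open.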
Next, HTML tags can do more than define a formatting option;
they can also define attributes for that option.
To do so, you specify an attribute and an attribute value
within the HTML tag. For example, the following tag creates
a heading style aligned to the left:
<H2 ALIGN = "LEFT">this text has a heading
level two style and is
aligned to the left </H2>
There are a few things to note about attributes, however.
First, it is not necessary to enclose an attribute value
within quotes unless it contains white space or other special
characters, although quoting every value is the safer habit.
Secondly, it is not necessary to have a space before or after
the equals sign that matches an attribute to its value.
Thirdly, when you close an HTML tag that has an attribute,
you should not include the attribute information in the
closing tag.
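These points about attributes can be checked with Python's standard `html.parser`, used here as a rough stand-in for a browser. The sketch feeds it the same heading written two ways, once with quotes and spaces around the equals sign and once with neither. The class name `AttributeCollector` is made up for the illustration.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Records each opening tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.seen.append((tag, dict(attrs)))


collector = AttributeCollector()
collector.feed('<H2 ALIGN = "LEFT">quoted, with spaces</H2>')
collector.feed('<H2 ALIGN=LEFT>unquoted, no spaces</H2>')
collector.close()

for tag, attributes in collector.seen:
    print(tag, attributes)
```

Both spellings yield the same tag with the same attribute, and the closing </H2> carries no attribute information at all.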
Finally, you should know that web browsers collapse any run
of white space (spaces, tabs, and line breaks) in your HTML
document into a single space. For example, the following two
bits of HTML will be laid out in exactly the same way:
This is some text that is displayed
as you would expect
This is some text
that is displayed in a way
you
would not expect:
exactly the same as the above
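As a rough model of what the browser does with white space in ordinary text, every run of spaces, tabs, and line breaks can be replaced by a single space. The Python sketch below applies that model to the same sentence typed two different ways; the function name `collapse_whitespace` is hypothetical, and real browsers have exceptions (such as preformatted text) that this ignores.

```python
def collapse_whitespace(text):
    """Replace every run of spaces, tabs and line breaks with one space."""
    return " ".join(text.split())


compact = "This is some text that is displayed as you would expect"
spread_out = """This   is some text
    that is displayed

as you
        would expect"""

print(collapse_whitespace(spread_out))
# Both versions collapse to the same string.
print(collapse_whitespace(compact) == collapse_whitespace(spread_out))  # True
```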
Next month we'll continue our look at the web technologies
that define, describe, or standardize the basic
characteristics of the data.
Additional Resources:
Raw Data
Introduction to the Web Application Development Environment (Tools)