What's It All About, Ælfred?
April 12, 1999
What's it all about? Well, it's all about creating your own
markup language that follows some very basic rules so that
lightweight XML parsers can split the file into tokens
(small pieces) [2]:
- elements (which most of us think of incorrectly as tags)
- attributes (parameters or settings of elements that qualify
them) and the values of the attributes
- processing instructions, and the XML declaration that shares
their syntax, such as:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
- entities
- content (the text or data that is marked up); also called
character data
See Selena's XML tutorial, part 2 for the details about these
pieces and their syntax.
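To make the list of pieces concrete, here is a minimal sketch that uses
Python's standard xml.sax module (not Ælfred) to report each piece of a
tiny document. The sample document, the "render" processing instruction,
and the names PieceCollector and collect_pieces are all invented for this
illustration. Two caveats: the XML declaration on the first line is
consumed by the parser itself and never reaches the handler, and entity
references are expanded before the character data is reported, so neither
shows up as a separate piece here.

```python
import xml.sax

# Hypothetical sample document, invented for this sketch.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?render mode="plain"?>
<poem><line n="1">Come, Madam</line></poem>"""


class PieceCollector(xml.sax.ContentHandler):
    """Records each piece the parser reports, in document order."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def startElement(self, name, attrs):
        # An element, followed by any attributes given on its start tag.
        self.pieces.append(("element", name))
        for attr_name in attrs.getNames():
            self.pieces.append(("attribute", attr_name, attrs.getValue(attr_name)))

    def processingInstruction(self, target, data):
        self.pieces.append(("processing instruction", target, data))

    def characters(self, content):
        # Skip whitespace-only runs so only real content is recorded.
        if content.strip():
            self.pieces.append(("content", content))


def collect_pieces(document):
    """Parse the document and return the list of recorded pieces."""
    handler = PieceCollector()
    xml.sax.parseString(document, handler)
    return handler.pieces
```

Calling collect_pieces(SAMPLE) returns the pieces in the order the parser
met them, beginning with the processing instruction and ending with the
character data inside the line element.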
We've discussed some of the parser issues before, such as
event-based vs. tree-based parsing and validating vs.
non-validating parsers.
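As a quick reminder of the event-based vs. tree-based distinction, the
sketch below counts the line elements in a small document both ways,
using Python's standard library rather than Ælfred. The sample document
and the names LineCounter, count_lines_with_events, and
count_lines_with_tree are assumptions made for this illustration. Both
parsers used here are non-validating, so the sketch says nothing about
the validating case.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
SAMPLE = b"<poem><stanza><line>one</line><line>two</line></stanza></poem>"


class LineCounter(xml.sax.ContentHandler):
    """Event-based: react to each start tag as the parser reports it."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "line":
            self.count += 1


def count_lines_with_events(document):
    """Count <line> elements without keeping the document in memory."""
    handler = LineCounter()
    xml.sax.parseString(document, handler)
    return handler.count


def count_lines_with_tree(document):
    """Tree-based: build the whole document in memory, then query it."""
    tree = minidom.parseString(document)
    return len(tree.getElementsByTagName("line"))
```

Both functions give the same count for the same input. The event-based
version keeps only a counter as the parser works through the document,
while the tree-based version holds every node until the query is done,
which is why lightweight parsers tend to favor the event style.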
One of the oldest XML APIs is SAX, the Simple API for XML.
This free API is the basis for the Ælfred XML parser and
applet.
Follow the first link below if you want to download this
lightweight XML parser. If you just want to try
Ælfred as an applet, follow the second link; please do
this now. In the applet case, you'll be parsing a specific poem
that has an associated poetry DTD (the XML and the DTD are
available from the second link).
| Ælfred XML parser and applet | Small XML parser built on SAX that can be used online with a fixed file, or downloaded for use with other inputs. Useful as a lightweight applet. |
| http://www.microstar.com/aelfred.html | XML parser requires JRE or JDK. |
| http://www.microstar.com/aelfred/browser-test.html (this link will appear in another window) | Try this now! Applet requires Java-capable browser (works with Netscape 4.x and Internet Explorer 4.x and 5.x; should work with Netscape 3.x and IE 3.x with proper Java support). |
Here's the initial output after you press the applet's
"Parse donne.xml" button. While
looking at the input XML document in another window,
see if you can match the entities, elements, character data,
and (further down) an attribute for line one as you examine the
Ælfred output. (If you're not using IE5, you'll be
prompted to download donne.xml as an unknown XML file type;
if you are using IE5, the file will be displayed as a tree
with collapsible nodes.)
Start document
Resolving entity: pubid=null,
sysid=http://www.microstar.com/aelfred/donne.xml
Starting external entity: http://www.microstar.com/aelfred/donne.xml
Resolving entity: pubid=-//Megginson//DTD Simple Poem//EN,
sysid=http://www.microstar.com/aelfred/poem.dtd
Starting external entity: http://www.microstar.com/aelfred/poem.dtd
Ending external entity: http://www.microstar.com/aelfred/poem.dtd
Doctype declaration: poem, pubid=-//Megginson//DTD Simple Poem//EN,
sysid=poem.dtd
Start element: name=poem
Ignorable whitespace: "\n\n"
Start element: name=front
Ignorable whitespace: "\n"
Start element: name=title
Character data: "Elegy XIX: To His Mistress Going to Bed"
End element: title
Ignorable whitespace: "\n"
Start element: name=author
Character data: "John Donne, d.1631"
End element: author
[...output deleted...]
Start element: name=stanza
Ignorable whitespace: "\n"
Attribute: name=n, value=1 (specified)
Start element: name=line
Character data: "Come, Madam, come, all rest my powers defy,"
End element: line
What do you mean you're not impressed? You want to see something
more exciting? You've seen lots of Dynamic HTML flying around
web pages, not to mention clever little Java applets, and you've
become jaded? Well, we can't promise you anything as flashy as
DHTML, but it does get better than just parsing, believe me.
On the other hand, it might be a good idea to keep things in
perspective as you read this article. Just keep reciting this
mantra:
XML is not about presentation. It's about structure!
Corollary mantra (to be chanted when reading Part 2 of this
article next month):
Well, sometimes XML is about presentation when you reference
CSS or XSL.
[2] Of course that's really not all it's about. Parsing is just
the most basic aspect of processing XML. For the importance of
XML, see our earlier article, XML: Structuring Data for the Web:
an Introduction.