Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



How to Create a Vocabulary

July 24, 2000

So, how do you go about defining a new vocabulary? There are actually three ways, but the traditional way is to create a Document Type Definition (DTD). A DTD defines a language by listing the elements and attributes that are permitted and how they may be nested. Every SGML-based markup language, HTML included, has one. Well-written HTML files begin with a declaration that identifies which DTD the document follows. For example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

This is a fixed, public DTD which defines the Transitional variant of HTML 4.0.

XML lets you create your own DTDs, using Declaration Syntax. Here's an example, quoted from Peter Flynn's excellent XML FAQ:

<!ELEMENT List (Item)+>
<!ELEMENT Item (#PCDATA)>

This fragment defines a list as an element type containing one or more items (that's the plus sign), and items as element types containing just text (Parsed Character Data, ie text with no more markup left in it).

The example shows a section of a DTD which defines two elements: <List>, which must contain one or more <Item> elements, and <Item>, which may contain only text. If we call this DTD "example", and store it here at the WDVL (this is only an example, not a real DTD), then an XML document based on this DTD would begin like this:

<?xml version="1.0"?>
<!DOCTYPE List SYSTEM "http://wdvl.com/dtds/example.dtd">

The name after DOCTYPE must match the document's root element, here List. XML names are case-sensitive, so <LIST> would not match the List declared in the DTD. A fragment of an XML document using these elements might look like this:

<List>
<Item>
Apple
</Item>
<Item>
Pear
</Item>
<Item>
Banana
</Item>
</List>
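
For quick experiments, the same declarations can also be placed inside the document itself, in what the XML specification calls the internal subset. The following is only a sketch under that assumption; it reuses the two declarations quoted above and swaps the hypothetical wdvl.com address for a bracketed block inside the DOCTYPE:

```xml
<?xml version="1.0"?>
<!-- Sketch: the DTD is carried in the internal subset instead of an external file -->
<!DOCTYPE List [
  <!-- A List holds one or more Item elements -->
  <!ELEMENT List (Item)+>
  <!-- An Item holds text only -->
  <!ELEMENT Item (#PCDATA)>
]>
<List>
  <Item>Apple</Item>
  <Item>Pear</Item>
  <Item>Banana</Item>
</List>
```

A validating parser should accept this document as it stands, and should reject it if the List were empty or if an Item contained another element.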

Looks like HTML, doesn't it? Note, however, that unlike HTML, XML requires a closing tag for every non-empty element, and an empty element must either be closed as well or written in the self-closing form, such as <Item/>. Also, remember that DTDs are written in Declaration Syntax, while XML documents are written in Instance Syntax. Yes, it does get a bit complicated.

DTDs can be very lengthy, as they must specify every single element that can be used. Knowing that many Web developers are too lazy to write a proper DTD, the XML holy men provided for a way to create XML documents without one.

An XML document with no DTD is referred to as "DTDless". Such a document can still be "well-formed", meaning that it obeys XML's basic syntax rules, while a document that has a DTD and conforms to it is referred to (somewhat confusingly) as "valid". A DTDless document may begin with a Standalone Document Declaration (SDD), which states that the document does not depend on any external markup declarations, for example:

<?xml version="1.0" standalone="yes"?>

When a client encounters a DTDless XML document, it has no declarations to check the markup against. It can only confirm that the document is well-formed, and must infer the structure from the position and nesting of each element. In the above List and Item example, a reasonably intelligent browser should be able to figure out that a List is meant to contain Items, simply from the fact that Items are nested within the List.

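The rules for writing DTDless documents differ somewhat from those for writing documents that do have DTDs. Also, some features of XML are not available with DTDless documents: with no DTD there is nowhere to declare default attribute values or your own entities, for example.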
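
As a rough illustration, here is what a DTDless version of the list might look like. This is a sketch, not a canonical form. Note that only XML's five built-in entities (such as &amp;) can be used, because there is no DTD in which to declare any others:

```xml
<?xml version="1.0" standalone="yes"?>
<!-- No DOCTYPE: a parser can check that this is well-formed, but cannot validate it -->
<List>
  <!-- &amp; is predefined by XML itself, so it needs no declaration -->
  <Item>Apple &amp; Pear</Item>
  <Item>Banana</Item>
</List>
```

A reference to an entity of your own, say a hypothetical &wdvl;, would be an error here, since a document without a DTD has no place to define it.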

As powerful as DTDs are, we endlessly creative (or is it "pain in the neck"?) developers have already come up against some limitations, leading to the development of a more sophisticated tool called a "schema".

As a markup language, XML does not define any data types - anything contained within an element is interpreted as simple text. While a DTD specifies which elements may appear and where, it cannot specify criteria for what is valid content inside them.

For example, consider an email address, which must contain both a "@" (which our German-speaking friends call a "monkey's tail") and a "." In order to minimize errors, a software application (or a Web page with scripting) can include a validation routine which gives the user an error message if she or he enters a string that doesn't contain both of these items. Similar validation routines exist for phone numbers, zip codes, and any other type of data that has a certain required format.

The XML Schema proposal allows data types to be specified for the content of particular elements, so that validation routines can be used to reduce errors. Another advantage of schemas is that they are written in XML Instance Syntax, eliminating the need for Declaration Syntax. As of this writing, schemas are only a "proposal", not a formal "recommendation", and it remains to be seen whether they will replace DTDs or be used as a complement to them.
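
To give a feel for the idea, here is a minimal sketch of how the email check described above might be expressed as a schema. It assumes the syntax and namespace of the later W3C XML Schema Recommendation, which may differ from the draft proposal, and the names Email and EmailAddress are made up for this illustration:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: text containing an "@" with a "." somewhere after it -->
  <xs:simpleType name="EmailAddress">
    <xs:restriction base="xs:string">
      <xs:pattern value="[^@]+@[^@]+\.[^@]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element whose content is checked against the type above -->
  <xs:element name="Email" type="EmailAddress"/>

</xs:schema>
```

The pattern is deliberately loose. It only enforces the "@" and "." rule mentioned earlier, not the full grammar of real email addresses. Notice also that the schema is itself an ordinary XML document, with no Declaration Syntax in sight.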
