XML Software Guide: XML Parsers
July 5th 1998
Last Modified: July 29, 2000
Unlike many XML authoring tools, most XML parsers are free. XML
parsers come in two flavors:
- non-validating: the parser
does not check a document against any DTD (Document Type Definition);
it only checks that the document is well-formed
(that is, properly marked up according to XML syntax rules)
- validating: in addition to checking
well-formedness, the parser verifies that the document conforms to
a specific DTD (either internal or external to the XML file
being parsed).
If you are planning to write your own DTD, a validating parser
is the better choice. Free validating parsers are available
from companies such as IBM, Microsoft, DataChannel, and Textuality.
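To make the distinction concrete, here is a minimal sketch in Python using the standard library's xml.parsers.expat module, a binding to the non-validating expat parser listed later in this guide. The helper name is_well_formed and both sample documents are made up for illustration. The first sample is well-formed but violates its own internal DTD; a non-validating parser accepts it anyway, whereas a validating parser would be expected to report the undeclared element.

```python
import xml.parsers.expat

# Well-formed, but invalid: the DTD allows only <to> inside <note>.
WELL_FORMED_BUT_INVALID = """<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>WDVL</from></note>"""

# Not well-formed: <b> is never closed before </a>.
NOT_WELL_FORMED = "<a><b></a>"


def is_well_formed(text):
    """Return True if expat parses the text without a syntax error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True
```

Checking the first sample against its DTD would need a validating parser. Python's standard library does not include one, so that step is not shown here.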
In the lists of validating and
non-validating parsers below, we
present the parsers written in Java first since they should work on
all Java platforms. Each section also lists several parsers written
in other languages such as C, Perl, and Python.
The final section lists several services and tools for
checking and/or validating your XML.
Java fans should read Java Report's February 1999
article on XML parsers,
which compares seven of the parsers listed in this section.
- Xerces
- The Apache XML Project
maintains XML parsers in Java, C++, and Perl.
[free product from Apache.org; all Java, C++, and Perl platforms]
- IBM's XML Parser for Java
- Also known as XML4J.
Version 1 of IBM's XML Parser for Java was the highest-rated
Java XML parser in Java Report's February 1999 review of XML parsers.
Version 2 adds a configurable, modular architecture; high performance;
revalidation; and XCatalog support.
Support for XML 1.0, DOM 1.0 and SAX 1.0 is also included.
XML4J 3.0.1 is based on the Apache Xerces XML Parser Version 1.0.3.
New features include experimental versions of DOM Level 2, SAX2 (beta 2),
and parts of W3C Schema.
See also the numerous IBM alphaWorks freeware in our
"Specialized XML Software" section.
[free product from IBM; all Java platforms]
- JavaSoft's XML Parser
- See Java Project X in the API section of this article for Sun's entry into the world of XML parsers.
[free product from JavaSoft; all Java platforms]
- Oracle XML Parser
- Oracle released its XML Parser for Java, a standalone XML component that enables parsing
of XML documents through either SAX or DOM interfaces using validating or non-validating
modes. See also the Oracle XML site.
[free product from Oracle; all Java platforms]
- XMLBooster
- XMLBooster generates XML parsers for COBOL, C, Java, etc.
According to the company, XMLBooster can "achieve performance
comparable with message-specific hand-written parsers by skipping the
intermediate step where the message is turned into a generic DOM tree
using a generic parser which must take the entire generality of XML into
account and support every feature, no matter how obscure. The parsers
generated by XMLBooster only recognize the XML features required to
parse the message at hand, and produces directly a parser that
initializes application-level data structures without going through any
time-consuming intermediate representation. Tool features: (1)
Generates parsers, which are between 5 and 45 times faster than generic
parsers (2) Produce parsers in C, COBOL, Delphi and Java (3) Produces
working data structures in the host language, rather than a dynamic and
poorly typed generic tree (4) The XML message to parse can come from a
file, a message, a socket, a data structure, etc. (5) Produce naturally
validating parsers, far beyond the validation possibilities of DTDs."
[commercial product for C, COBOL, Delphi, Java]
- SXP, the Silfide XML Parser
- The Silfide XML Parser (SXP) is a parser and a complete XML API in
Java. It is part of XSilfide, a client/server based environment.
XSilfide includes SIL, the Silfide Interface Language, among other things.
"The SIL DTD is organized using modules, gathering (1) the encoding of
the user workspace (2) the encoding of the user informations (3) the
extended query language and (4) the encoding of the queries result set."
[free product from Silfide; all Java platforms]
- MSXML
- Microsoft's XML parser in Java is included in IE4. The version
presently available predates the final XML 1.0 spec by one month.
"The parser checks for well-formed documents and optionally
permits checking of the documents' validity. Once parsed, the XML
document is exposed as a tree through a simple set of Java methods,
which [Microsoft is] working with the World Wide Web Consortium
(W3C) to standardize. These methods support reading and/or writing
XML structures..." See the sample
parsing of an XML file using JScript. (Microsoft also includes
an XML parser in C++ in IE4, which is "a high-performance, non-validating
parser, [that] supports most of the W3C XML specification".)
[free product from Microsoft; all Java platforms; all IE4 platforms]
- DXP
- Note: DXP is now a team development effort with Microsoft
and will be included in Internet Explorer 5.0.
"The DataChannel XML Parser (DXP) is a validating XML
parser written in Java. DXP is specifically aimed at providing a
utility for server-side applications that need to integrate XML
capabilities into existing systems and for out-of-the-browser
Java-based software. DXP provides the highly sophisticated
error-checking mechanisms required for XML-based data interchange.
DXP has not been architected for usage in an applet context,
downloaded via the Internet." DXP is based on NXP (Norbert's
XML Parser), one of the earliest XML parsers. See also our
DataChannel entry in the "Specialized XML Software"
section of this article. [free? product
from DataChannel and Microsoft; all Java platforms]
- Larval
- Larval is Tim Bray's validating XML processor built on the same
code base as Lark (below).
"Larval is a full validating XML processor; it reports
violations of validity constraints, but does not apply draconian
error handling to them."
[freeware by Tim Bray (Textuality); all
Java platforms; see Lark below]
- Near & Far Designer
- According to Microstar, "Near & Far Designer is the
ideal tool for those who are new to structured information as well
as those who are already achieving the benefits of structured
information. DTDs can be created and modified graphically without
prior knowledge of XML/SGML language syntax. With the intuitive
tree representation a DTD can be created from scratch or imported,
reworked and exported as a revised DTD. Structures can be explored
to any level of detail. The drag and drop interface makes working
with DTDs easy." Read the review from XMLXperts, Ltd. of converting
SGML to XML using Near & Far Designer.
[commercial product from Microstar
Software Ltd.; Windows only]
- XML::Parser
- This Perl-based XML parser is from Larry Wall, the creator of
Perl. Some of the parsing code is based on James Clark's expat
(below). At this time, there is no documentation or description;
the link is for downloading. [freeware
from Larry Wall; Perl]
- xmlproc
- "xmlproc is an XML parser written in Python. It is a
fairly complete validating parser, but does not do everything
required of a validating parser, or even a well-formedness parser.
The average user should not run into any omissions, though. Later
releases will be more complete."
[freeware by Lars Marius Garshol; Python]
- TclXML
- This XML parsing package requires Tcl 8.0b1 or a later version.
Last updated 19th June 1997. It is possible that something newer
will come from Zveno.
[freeware; Tcl]
- Lark
- Lark is a non-validating Java XML processor by Tim Bray, one of
the authors of the W3C XML spec. It implements all of the XML 1.0
Recommendation and reports violations of well-formedness.
[freeware by Tim Bray (Textuality); all
Java platforms; see also Larval above]
- XP
- James Clark's XML Parser in Java, complete with javadoc
documentation. "XP is an XML 1.0 parser written in Java. It
is fully conforming: it detects all non well-formed documents. It is
currently not a validating XML processor. However it can parse all
external entities: external DTD subsets, external parameter entities
and external general entities." XP is a high-performance
parser intended for use with Java applications, rather than applets.
It includes a SAX driver implementation. (In addition to expat [below]
and XP, James Clark has also developed SP, a free, object-oriented
toolkit for SGML parsing and entity management; SP can parse XML and
can convert SGML to XML.) [freeware from
James Clark; all Java application platforms]
- Ælfred
- According to Microstar, Ælfred is "a small, fast,
DTD-aware Java-based XML parser, especially suitable for use in Java
applets. We've designed Ælfred for Java programmers who want to
add XML support to their applets and applications without doubling
their size: Ælfred consists of only two core class files, with
a total size of about 26K, and requires very little memory to run.
There is also a complete SAX (Simple API for XML) driver available
in this distribution for interoperability." Note that
Microstar also has a commercial XML authoring tool called Near & Far
Designer (above). [freeware
from Microstar Software Ltd.; all Java applet platforms]
- HEX
- HEX is the HTML Enabled XML Parser. It is a "simple, 100%
Java, non-validating XML parser with some hooks for more-or-less
correct parsing of most HTML pages. It doesn't understand either
SGML or XML DTD's but the parser API allows the application to
control its operation in ways that facilitate HTML parsing."
HEX includes an implementation of SAX.
HEX also implements the Java binding for the
DOM core level one as per the March 1998 Working Draft.
[freeware by Anders Kristensen, HP Labs;
all Java platforms]
- expat
- Expat, the XML Parser Toolkit, is James Clark's library for XML
parsing in C. Expat (formerly called xmltok) is being used to add
support for XML to Netscape Navigator 5 and Perl. Expat aims to be a
fully conforming XML 1.0 parser.
- HXA (Hubick's XML Analyzer)
- Hubick's XML Analyzer "is a pure Java tool built
upon a low level XML parser (HXP) which breaks an XML file down into
it's constituent productions for analysis. HXA allows one to examine the
production hierarchy for any character in an XML document or document
fragment. For easy reference HXA also provides links from each
production in the analysis to its corresponding section in the XML
specification."
[freeware for all
Java platforms; may require Microsoft Internet Explorer]
- LT XML
- "LT XML is an integrated set of XML tools and a
developers' tool-kit, including a C-based API...The LT XML tool-kit
includes stand-alone tools for a wide range of processing of
well-formed XML documents, including searching and extracting,
down-translation (e.g. report generation, formatting), tokenising
and sorting. Sequences of tool applications can be pipelined together
to achieve complex results.... It also includes a powerful, yet
simple, querying language, which allows the user to quickly and
easily select those parts of an XML document which are of
interest." The parser produces either a textual
view or a tree view of an XML document.
[freeware from the Language Technologies Group; C language; Unix
and Win32 platforms]
- xmllib
- Python 1.5.1 contains this version of xmllib.py by Sjoerd
Mullender. [freeware; all Python
platforms]
- Xparse
- "Xparse is a fully compliant well-formed XML parser written
in less than 5k of JavaScript." The author, Jeremie (no last
name visible), plans to add DOM support when DOM becomes a W3C
Recommendation. There is also a web page for trying the parser
without downloading. See also Sparse,
the XSL companion to Xparse.
[freeware; all JavaScript platforms]
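Several entries above mention SAX and DOM interfaces. As a rough illustration of how the two styles differ, the sketch below uses Python's standard xml.sax and xml.dom.minidom modules rather than any of the products listed here. The sample document, the NameCollector class, and the two helper functions are all hypothetical names chosen for this example.

```python
import xml.sax
from xml.dom import minidom

# Made-up sample document for illustration only.
SAMPLE = "<parsers><parser>Lark</parser><parser>XP</parser></parsers>"


class NameCollector(xml.sax.ContentHandler):
    """SAX style: the parser pushes events and the handler keeps what it needs."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._inside = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "parser":
            self._inside = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it piece by piece.
        if self._inside:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "parser":
            self.names.append("".join(self._buffer))
            self._inside = False


def names_via_sax(text):
    """Collect <parser> text content from a stream of SAX events."""
    handler = NameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def names_via_dom(text):
    """DOM style: build the whole tree in memory, then query it."""
    document = minidom.parseString(text)
    return [node.firstChild.data for node in document.getElementsByTagName("parser")]
```

Both helpers return the same list of names for the sample. The SAX version never holds the full document in memory, which is one reason event-based drivers are often offered alongside tree-based interfaces.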