Jumbo9801a
This document describes the alpha "snapshot" (i.e. release) of JUMBO in
Jan 1998.
Description
JUMBO is an element-oriented system for processing XML documents. It can
read and parse documents (with or without additional parsers, and with or
without the SAX interface). It creates a tree of elements and attributes
with various types of content.
It also supports processing instructions (PIs) in a generic manner. There
is support for namespaces and XSL stylesheets, though JUMBO does not have
sophisticated rendering. It has a browsing model based on a tree/TOC model,
event streams or customised element display. It supports (SIMPLE) XLL navigation,
including NEW and REPLACE, and most XPointer syntax. It extends the latter
to provide sophisticated search and navigation tools for the document.
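The parse-to-tree step described above can be illustrated with a minimal sketch. It assumes the SAX2 API bundled with current JDKs (`org.xml.sax`, `javax.xml.parsers`) rather than the 1998 SAX draft or Java 1.02, and the names `TreeNode`, `TreeBuildingHandler` and `SaxTreeDemo` are hypothetical; none of them are JUMBO classes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tree node (not JUMBO's element class): name, attributes, children, text.
class TreeNode {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<TreeNode> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    TreeNode(String name) {
        this.name = name;
    }
}

// Turns the SAX event stream into a tree by keeping a stack of open elements.
class TreeBuildingHandler extends DefaultHandler {
    private final Deque<TreeNode> open = new ArrayDeque<>();
    TreeNode root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        TreeNode node = new TreeNode(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            node.attributes.put(atts.getQName(i), atts.getValue(i));
        }
        if (open.isEmpty()) {
            root = node;                      // first element seen is the document root
        } else {
            open.peek().children.add(node);   // attach to the innermost open element
        }
        open.push(node);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }
}

class SaxTreeDemo {
    // Parses an XML string with whichever SAX parser the JDK factory supplies.
    static TreeNode parse(String xml) {
        try {
            TreeBuildingHandler handler = new TreeBuildingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        TreeNode root = parse("<doc id=\"d1\"><title>JUMBO</title><p>text</p></doc>");
        System.out.println(root.name + " has " + root.children.size() + " children");
    }
}
```

Because the handler only depends on SAX callbacks, any compliant parser could in principle drive it, which is the property the SAX interface is meant to provide.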
JUMBO also provides authoring and editing facilities, driven by DTD information
where possible. These can be customised to provide novel types of data
input other than text.
JUMBO is designed to be extended, especially through subclassing of
elements, and I hope that a collaborative community (cf. tcl/tk, LaTeX,
Linux) will develop for its future support. Offers are very welcome here.
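As an illustration of extension by subclassing, the following is a minimal sketch only. The class names (`GenericElement`, `FloatElement`, `ElementFactory`) and the factory mechanism are assumptions made for the example and are not taken from the JUMBO class libraries; the FLOAT-with-units element is modelled loosely on the jumbo.tecml description later in this document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical generic element; JUMBO's real base class is not shown in this document.
class GenericElement {
    final String name;
    final String content;

    GenericElement(String name, String content) {
        this.name = name;
        this.content = content;
    }

    // Default rendering: element name followed by raw content.
    String display() {
        return "<" + name + ">" + content;
    }
}

// Hypothetical subclass that gives a FLOAT element typed content and units.
class FloatElement extends GenericElement {
    final double value;
    final String units;

    FloatElement(String content, String units) {
        super("FLOAT", content);
        this.value = Double.parseDouble(content.trim());
        this.units = units;
    }

    @Override
    String display() {
        return value + " " + units;
    }
}

// Hypothetical factory: maps element names to subclasses, falling back to GenericElement.
class ElementFactory {
    private final Map<String, Function<String, GenericElement>> makers = new HashMap<>();

    void register(String elementName, Function<String, GenericElement> maker) {
        makers.put(elementName, maker);
    }

    GenericElement create(String elementName, String content) {
        Function<String, GenericElement> maker = makers.get(elementName);
        return maker != null ? maker.apply(content) : new GenericElement(elementName, content);
    }
}
```

With this arrangement, an application package would register its own subclasses and leave unknown elements to the generic behaviour, so the core never needs to know about application-specific types.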
Main Features
-
JUMBO is 100% pure Java (1.02) and runs as an applet or application.
-
JUMBO does not knowingly deviate from the X*L specs, apart from
known limitations.
-
JUMBO has an elementary XML parser, sufficient for its own configuration
files. It has been developed for use with the SAX API, so that any
SAX-J-compliant parser can be used at runtime (as of 1998-01-28: AElfred,
Lark, MSXML and NXP; XP not yet done). Parsers can be selected
on the command line or through menus.
-
The parse result is treated as a tree and displayed on a tableOfContents
(TOC). This allows access to all main components (elements, attributes,
content, PIs).
-
Components are rendered as: subtrees/TOCs; event streams (text, tagged
text and others); and individual objects. Fonts can be selected.
-
JUMBO menus are driven by (internal) XML documents which include HTML-based
help on a per-item basis.
-
The JUMBO GUI has several components allowing assessment of the document
and its processing including error announcement (Draconian).
-
XPointers (XLL) are implemented for: linking into subcomponents of XML
documents; searching XML documents; internal management of XML documents
(e.g. menus, stylesheets, namespaces).
-
User-based searching is through an interface which allows boolean combination
of strict XPointer addressing. Hits are highlighted in the TOC. X*L-specific
tools (Find IDs, NAMEs) are included.
-
XLL is implemented for SIMPLE. NEW and REPLACE are implemented; EMBED is
handled on a per-application basis. AUTO and USER are implemented. (JUMBO
extensions can link into non-XML documents.)
-
New XML (and non-XML) files can be read into JUMBO under menu control.
-
The current tree (possibly modified) can be saved as XML. Window components
can be saved as GIFs.
-
There are a variety of options for browsing elements, attributes, PCDATA
and whitespace.
-
Two display options (TOC and TOC+object) can be chosen - more will follow.
Objects can be displayed in individual Frames.
-
JUMBO allows import of non-XML documents by setting MIME types and requiring
per-MIME conversion code. The conversion is done on-the-fly.
-
JUMBO supports some non-SAX information on a per-parser basis. This includes
DTD components such as ATTLISTs and ELEMENT contentDeclarations.
-
JUMBO can be used for editing existing documents,
sometimes with primitive DTD or schema-based control. It can also be used
for creating new documents.
-
JUMBO has an experimental approach towards namespaces.
-
Stylesheets: JUMBO is tracking the public XSL spec and can read XSL documents.
-
JUMBO is easily extended to provide support for Java-based applications
on a package/namespace basis. The following are currently available:
-
jumbo.sgml.html (HTML V2.0). This supports well-formed HTML at
about V2.0 level (but no tables or forms). Rendering is readable but not
optimised for performance or beauty. JUMBO-HTML is included in the alpha
distribution.
-
jumbo.tecml (Technical Markup Language). This is aimed at technical
and scientific applications and provides strong data typing (FLOAT, DATE,
etc.) with UNITS and structuring (ARRAY and LIST).
Some commonly used data types are also included: BIB, PERSON, FIGURE,
etc. NOT included in alpha distribution.
-
jumbo.cml (Chemical Markup Language). This provides support for
molecular applications. NOT included in alpha distribution.
-
jumbo.chemime (Chemical MIME). Classes to convert non-XML files
(chemical/x-*) into XML trees on the fly. NOT included in alpha
distribution.
-
jumbo.vhg (Virtual HyperGlossary). Support for XML-based terminology
including ISO12620 terms. NOT included in alpha distribution.
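The import of non-XML documents by MIME type, listed among the features above, can be pictured as a registry of per-MIME converters. The sketch below is an assumption about how such a registry might look, not JUMBO's actual mechanism: `MimeConverter`, `PlainTextConverter` and `MimeImportRegistry` are hypothetical names, and a toy `text/plain` converter stands in for real `chemical/x-*` conversion code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract: each MIME type supplies code that turns raw input into XML.
interface MimeConverter {
    String toXml(String raw);
}

// Toy converter for text/plain: wraps each input line in a <line> element.
class PlainTextConverter implements MimeConverter {
    @Override
    public String toXml(String raw) {
        StringBuilder sb = new StringBuilder("<text>");
        for (String line : raw.split("\n")) {
            sb.append("<line>").append(escape(line)).append("</line>");
        }
        return sb.append("</text>").toString();
    }

    // Escapes the characters that would otherwise break well-formedness.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Hypothetical registry: looks up the converter by MIME type and converts on the fly.
class MimeImportRegistry {
    private final Map<String, MimeConverter> converters = new HashMap<>();

    void register(String mimeType, MimeConverter converter) {
        converters.put(mimeType, converter);
    }

    String importAsXml(String mimeType, String raw) {
        MimeConverter converter = converters.get(mimeType);
        if (converter == null) {
            throw new IllegalArgumentException("No converter registered for " + mimeType);
        }
        return converter.toXml(raw);
    }
}
```

The resulting XML string could then be handed to the same parser and tree display as any native XML document, which is why conversion code is only required once per MIME type.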
Installation
JUMBO9801a is available at http://www.vsms.nottingham.ac.uk/vsms/java/jumbo/jan9801.
Details of installation are available; it is also useful to install
one or more SAX-compliant parsers (http://www.microstar.com/xml/sax).
Copyright, Collaboration, Source, Warranty
JUMBO is copyright Peter Murray-Rust. It is available without fee, but
may not be redistributed or used for commercial purposes or teaching without
permission. It is my intention that JUMBO is freely available for personal
use by individuals and for personal use within organisations at present.
Class libraries will be available on the WWW. I hope to develop a LaTeX/tcl-like
club of collaborators and the precise nature of future copyright will depend
on that; I would like to relax the restrictions above. I am reluctant
at present to make source freely available except to collaborators since
(through experience) I fear the distribution of mutants and the misappropriation
of authorship. Constructive suggestions would be welcomed here. If a stable
core can be communally developed (like tcl), I would feel more relaxed.
So, if you are seriously interested in helping, send me a mail with details.
No guarantee is made of JUMBO's fitness for any purpose and the author
is not responsible for any damage caused by whatever means.
Copyright Peter Murray-Rust, 1996, 1997, 1998