Getting the root element
Every XML document must have a root element. That element is the starting point for accessing all the information within the document. For example, the following snippet of a document has <root-demo> as the root:

    <root-demo id="demo">
      <description>Gotta fit servlets in somewhere!</description>
      <distributable/>
    </root-demo>

The root Element instance is available directly on a Document:

    Element webapp = doc.getRootElement();
Getting the children

You can obtain an Element's children with various methods. getChild() returns null if no child by that name exists.

    List getChildren();             // return all children
    List getChildren(String name);  // return all children by name
    Element getChild(String name);  // return first child by name

To demonstrate:

    // Get a List of all direct children as Element objects
    List allChildren = element.getChildren();
    out.println("First kid: " + ((Element) allChildren.get(0)).getName());

    // Get a List of all direct children with a given name
    List namedChildren = element.getChildren("name");

    // Get the first kid with a given name
    Element kid = element.getChild("name");
Using getChild() makes it easy to quickly access nested elements when the structure of the XML document is known in advance. Given this XML:

    <?xml version="1.0"?>
    <linux:config>
      <gui>
        <window-manager>
          <name>Enlightenment</name>
          <version>0.16.2</version>
        </window-manager>
        <!-- etc -->
      </gui>
    </linux:config>

This code directly retrieves the current window manager name:

    String windowManager = rootElement.getChild("gui")
                                      .getChild("window-manager")
                                      .getChild("name")
                                      .getText();
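Because getChild() returns null for a missing child, the chained lookup above throws a NullPointerException as soon as one step is absent. One way to guard against that is to walk the path one name at a time and stop at the first gap. The following is a minimal sketch of that idea in plain Java. StubElement, SafeLookup, textAt(), and sampleConfig() are hypothetical names invented for this illustration and are not part of JDOM; StubElement only mimics the getChild()-returns-null behaviour described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an element; this is NOT the JDOM Element class.
final class StubElement {
    private final String name;
    private final String text;
    private final Map<String, StubElement> children = new LinkedHashMap<>();

    StubElement(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Adds a child and returns this element so calls can be chained.
    StubElement add(StubElement child) {
        children.put(child.name, child);
        return this;
    }

    // Mirrors the behaviour described above: null when no such child exists.
    StubElement getChild(String childName) {
        return children.get(childName);
    }

    String getText() {
        return text;
    }
}

final class SafeLookup {
    // Walks the path one name at a time and gives up at the first missing step.
    static String textAt(StubElement root, String... path) {
        StubElement current = root;
        for (String step : path) {
            if (current == null) {
                return null;
            }
            current = current.getChild(step);
        }
        return current == null ? null : current.getText();
    }

    // Builds a small tree shaped like the configuration document above.
    static StubElement sampleConfig() {
        StubElement name = new StubElement("name", "Enlightenment");
        StubElement wm = new StubElement("window-manager", "").add(name);
        StubElement gui = new StubElement("gui", "").add(wm);
        return new StubElement("config", "").add(gui);
    }
}
```

With the sample tree, SafeLookup.textAt(root, "gui", "window-manager", "name") yields the window manager name, while a path that names a missing element yields null instead of throwing.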
Just be careful about NullPointerExceptions if the document has not been validated: getChild() returns null for a missing element, so the next call in the chain fails. For simpler document navigation, future JDOM versions are likely to support XPath references. Children can get their parent using getParent().

Getting the element attributes

Elements can also carry attributes, as in this example:

    <table width="100%" border="0"> </table>
Those attributes are directly available on an Element.

    String width = table.getAttributeValue("width");

You can also retrieve the attribute as an Attribute instance. That ability helps JDOM support advanced concepts such as Attributes residing in a namespace. (See the section Namespaces later in the article for more information.)

    Attribute widthAttrib = table.getAttribute("width");
    String width = widthAttrib.getValue();
For convenience you can retrieve attributes as various primitive types.

    int border = table.getAttribute("border").getIntValue();

You can retrieve the value as any Java primitive type. If the attribute cannot be converted to the primitive type, a DataConversionException is thrown. If the attribute does not exist, then the getAttribute() call returns null.

Extracting element content
We touched on getting element content earlier, and showed how easy it is to extract an element's text content using element.getText(). That is the standard case, useful for elements that look like this:

    <name>Enlightenment</name>

But sometimes an element can contain comments, text content, and child elements. It may even contain, in advanced documents, a processing instruction:

    <table>
      <!-- Some comment -->
      Some text
      <tr>Some child</tr>
      <?pi Some processing instruction?>
    </table>

This isn't a big deal. You can retrieve text and children as always:

    String text = table.getText();      // "Some text"
    Element tr = table.getChild("tr");  // <tr> child
That keeps the standard uses simple. Sometimes, as when writing output, it's important to get all the content of an Element in the right order. For that you can use a special method on Element called getMixedContent(). It returns a List of content that may contain instances of Comment, String, Element, and ProcessingInstruction. Java programmers can use instanceof to determine what's what and act accordingly. This code prints out a summary of an element's content:

    List mixedContent = table.getMixedContent();
    Iterator i = mixedContent.iterator();
    while (i.hasNext()) {
      Object o = i.next();
      if (o instanceof Comment) {
        // Comment has a toString()
        out.println("Comment: " + o);
      }
      else if (o instanceof String) {
        out.println("String: " + o);
      }
      else if (o instanceof ProcessingInstruction) {
        out.println("PI: " + ((ProcessingInstruction) o).getTarget());
      }
      else if (o instanceof Element) {
        out.println("Element: " + ((Element) o).getName());
      }
    }
Dealing with processing instructions

Processing instructions (often called PIs for short) are something that certain XML documents have in order to control the tool that's processing them. For example, with the Cocoon Web content creation library, the XML files may have cocoon processing instructions that look like this:

    <?cocoon-process type="xslt"?>

Each ProcessingInstruction instance has a target and data. The target is the first word, the data is everything afterward, and they're retrieved by using getTarget() and getData().

    String target = pi.getTarget();  // cocoon-process
    String data = pi.getData();      // type="xslt"
Since the data often appears like a list of attributes, the ProcessingInstruction class internally parses the data and supports getting data attribute values directly with getValue(String name):

    String type = pi.getValue("type");  // xslt
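The article does not show how ProcessingInstruction parses its data, so the following is only a sketch of how attribute-style data could be split into name/value pairs. PiDataParser and parse() are hypothetical names made up for this example, and the regular expression is an assumption about what counts as a pair (a name, an equals sign, and a single- or double-quoted value); it is not JDOM's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDOM: splits PI data into name/value pairs.
final class PiDataParser {
    // Matches name="value" or name='value'. Group 1 is the name,
    // group 3 a double-quoted value, group 4 a single-quoted value.
    private static final Pattern PAIR =
        Pattern.compile("([\\w:.-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns the pairs in document order; text that is not a pair is ignored.
    static Map<String, String> parse(String data) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(data);
        while (m.find()) {
            String value = m.group(3) != null ? m.group(3) : m.group(4);
            values.put(m.group(1), value);
        }
        return values;
    }
}
```

For the Cocoon instruction above, PiDataParser.parse("type=\"xslt\"").get("type") gives the same kind of result the article describes for getValue("type"). Data that is not written as pairs simply produces an empty map.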
You can find PIs anywhere in the document, just like Comment objects, and can retrieve them the same way as Comments, using getMixedContent():

    List mixed = element.getMixedContent();  // List may contain PIs
PIs may reside outside the root Element, in which case they're available using the getMixedContent() method on Document:

    List mixed = doc.getMixedContent();
It's actually very common for PIs to be placed outside the root element, so for convenience, the Document class has several methods that help retrieve all the Document-level PIs, either by name or as one large bunch:

    List allOfThem = doc.getProcessingInstructions();
    List someOfThem = doc.getProcessingInstructions("cocoon-process");
    ProcessingInstruction oneOfThem = doc.getProcessingInstruction("cocoon-process");
That allows the Cocoon parser to read the first cocoon-process type with code like this:

    String type = doc.getProcessingInstruction("cocoon-process").getValue("type");

As you probably expect, getProcessingInstruction(String) will return null if no such PI exists.

Namespaces
Namespaces are an advanced XML concept that has been gaining in importance. Namespaces allow elements with the same local name to be treated differently because they're in different namespaces. It works similarly to Java packages and helps avoid name collisions.

Namespaces are supported in JDOM using the helper class org.jdom.Namespace. You retrieve namespaces using the Namespace.getNamespace(String prefix, String uri) method.

In XML, the following code declares the xhtml prefix to correspond to the URI "http://www.w3.org/1999/xhtml". Then <xhtml:title> is treated as a title in the "http://www.w3.org/1999/xhtml" namespace.

    <html xmlns:xhtml="http://www.w3.org/1999/xhtml">
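To make the "same local name, different namespace" idea concrete, here is a small sketch that tells two title elements apart by namespace. JDOM is not bundled with the JDK, so this example deliberately swaps in the JDK's built-in DOM parser (javax.xml.parsers and org.w3c.dom) rather than JDOM. NamespaceDemo, titleIn(), the sample document, and the http://example.com/books namespace are all invented for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Uses the JDK's DOM API (not JDOM) to show two <title> elements
// that differ only by namespace.
final class NamespaceDemo {
    static final String XHTML = "http://www.w3.org/1999/xhtml";
    // Hypothetical second namespace, invented for this example.
    static final String BOOKS = "http://example.com/books";

    static final String XML =
        "<doc xmlns:xhtml='" + XHTML + "' xmlns:bk='" + BOOKS + "'>"
        + "<xhtml:title>Page title</xhtml:title>"
        + "<bk:title>Book title</bk:title>"
        + "</doc>";

    // Returns the text of the first title element in the given namespace,
    // or null if the document has no title in that namespace.
    static String titleIn(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in the JDK parser
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            NodeList titles = doc.getElementsByTagNameNS(namespaceUri, "title");
            return titles.getLength() == 0 ? null : titles.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }
}
```

Both elements have the local name title, yet each lookup returns only the one whose namespace URI matches. The prefix itself (xhtml or bk) plays no part in the match.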
When a child is in a namespace, you can retrieve it using overloaded versions of getChild() and getChildren() that take a second Namespace argument.

    Namespace ns = Namespace.getNamespace("xhtml", "http://www.w3.org/1999/xhtml");
    List kids = element.getChildren("p", ns);
    Element kid = element.getChild("title", ns);
If a Namespace is not given, the element is assumed to be in the default namespace, which lets Java programmers ignore namespaces if they so desire.

Making a list, checking it twice
JDOM has been designed using the List and Map interfaces from the Java 2 Collections API. The Collections API provides JDOM with great power and flexibility through standard APIs. It does mean that to use JDOM, you either have to use Java 2 (JDK 1.2) or use JDK 1.1 with the Collections library installed.

All the List and Map objects are mutable, meaning their contents can be changed, reordered, added to, or deleted, and the change will affect the Document itself, unless you explicitly copy the List or Map first. We'll get deeper into that in Part 2 of the article.

Exceptions
As you probably noticed, several exception classes in the JDOM library can be thrown to indicate various error situations. As a convenience, all of those exceptions extend the same base class, JDOMException. That allows you the flexibility to catch specific exceptions or all JDOM exceptions with a single try/catch block. JDOMException itself is usually thrown to indicate the occurrence of an underlying exception such as a parse error; in that case, you can retrieve the root cause exception using the getRootCause() method. That is similar to how RemoteException behaves in RMI code and how ServletException behaves in servlet code. However, the underlying exception isn't often needed because the JDOMException message contains information such as the parse problem and line number.
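The wrap-and-unwrap pattern described above can be sketched in plain Java. WrappedException below is a hypothetical stand-in written for this example, not the real JDOMException; it only borrows the getRootCause() idea. RootCauseDemo, parseWidth(), and describe() are likewise invented names, and the number-parsing failure stands in for a parse error.

```java
// Hypothetical stand-in for a library exception that wraps a lower-level one.
// It imitates the getRootCause() idea described above; it is NOT JDOMException.
final class WrappedException extends Exception {
    private final Throwable rootCause;

    WrappedException(String message, Throwable rootCause) {
        super(message);
        this.rootCause = rootCause;
    }

    Throwable getRootCause() {
        return rootCause;
    }
}

final class RootCauseDemo {
    // Fails on bad input and wraps the low-level error in the library exception.
    static int parseWidth(String raw) throws WrappedException {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new WrappedException("Bad width '" + raw + "'", e);
        }
    }

    // One catch block handles the wrapper; the root cause is there if needed.
    static String describe(String raw) {
        try {
            return "width=" + parseWidth(raw);
        } catch (WrappedException e) {
            return e.getMessage() + " caused by "
                + e.getRootCause().getClass().getSimpleName();
        }
    }
}
```

The caller usually needs only the wrapper's message, which matches the article's point that the underlying exception is seldom required. The root cause stays available for the cases where it is.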