2008-09-03

Namespaces in XSLT

XSLT tutorials and references usually pretend that namespaces don't exist. But when I tried to process an XML document (an XHTML document in this case) that declared a default namespace, none of my element patterns matched!

XSLT (rightly) distinguishes between unprefixed element names in a pattern, which match only elements in no namespace, and prefixed names, which match elements in the namespace bound to that prefix. The default namespace declared in the source document does not carry over to the stylesheet, so a pattern like match="p" never matches an XHTML p element.

The solution is to

  • declare a prefix for the source document's namespace(s) on the stylesheet's root element, e.g.

    <xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">

  • use that prefix in your match patterns and select expressions, e.g.

    <xsl:template match="xhtml:p">
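
To see the difference in practice, here is a minimal sketch using the XSLT processor bundled with the JDK (an assumption on my part; Xalan-J should behave the same way for this case). The class name, the tiny sample document and the bracket markers are made up for illustration. It runs the same template twice, once with an unprefixed pattern and once with the xhtml: prefix.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class NamespaceMatchDemo {
    // Hypothetical sample input: every element is in the XHTML namespace
    // because of the default namespace declaration on <html>.
    static final String SOURCE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>hello</p></body></html>";

    // Builds a stylesheet whose single template uses the given match pattern
    // and wraps the matched element's text in square brackets.
    static String stylesheet(String pattern) {
        return "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
            + " xmlns:xhtml=\"http://www.w3.org/1999/xhtml\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"" + pattern + "\">"
            + "[<xsl:value-of select=\".\"/>]"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
    }

    // Applies the stylesheet to SOURCE and returns the text output.
    static String transform(String pattern) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(pattern))));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(SOURCE)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed pattern: the template never fires, only the built-in
        // rules run, so the text comes through without brackets.
        System.out.println(transform("p"));
        // Prefixed pattern: the template matches the XHTML p element.
        System.out.println(transform("xhtml:p"));
    }
}
```

With the unprefixed pattern the output should be plain "hello" (the built-in rules just copy the text through); with xhtml:p it should be "[hello]". The prefix itself is arbitrary; only the namespace URI it is bound to has to match the one in the source document.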



My thanks to Jeni Tennison for explaining it very clearly in this post.

2008-09-02

Parsing XML with an Internet connection

I'm trying to use XSLT (Xalan-J in my case) to extract and re-format some information from some XML files. These files refer to external entities such as the XHTML DTD at http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd. The SAX parser, even with validation turned off, insists on going out on the Web to fetch that DTD and its brethren.

I'm at my workplace, and access to the Web is through a proxy. There are Java system properties I could set to give my application Web access, but I'd have to include my proxy user ID and password, and the external access might still slow my application down.

I tore my hair out for a while, and this time even Googling took a long time to lead me to a solution.

The solution, in a nutshell, works like this:

  • Manually download all required files to a local directory that will later be included with the app;

  • Provide a catalog redirecting references from their customary URIs to local ones;

  • Include the Apache XML Commons Resolver library;

  • Provide a CatalogManager.properties file on the classpath to tell the Resolver where to find the catalog;

  • Attach a new CatalogResolver() to the XML reader as its entity resolver.
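
The steps above can be sketched roughly as follows. Note the swap: to keep the example free of extra jars, this sketch uses the catalog resolver built into the JDK (javax.xml.catalog, Java 9 and later) rather than the Apache library, and it passes the catalog location in code instead of through a properties file. The class name, the temp directory, the marker entity and the one-line stand-in DTD are all made up for illustration; a real setup would point the catalog at the downloaded W3C files.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class CatalogDemo {
    static final String DTD_URL = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

    // Stand-in for the downloaded DTD (NOT the real XHTML DTD). The entity
    // lets us prove that the local copy was the one that got read.
    static final String LOCAL_DTD =
        "<!ELEMENT html (#PCDATA)>\n<!ENTITY marker \"resolved locally\">\n";

    // OASIS XML catalog: redirect the customary URI to the local file.
    // The relative uri is resolved against the catalog file's location.
    static final String CATALOG =
        "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
        + "  <system systemId=\"" + DTD_URL + "\" uri=\"xhtml1-strict.dtd\"/>\n"
        + "</catalog>\n";

    // Hypothetical input document that references the remote DTD.
    static final String DOCUMENT =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"" + DTD_URL + "\">"
        + "<html>&marker;</html>";

    // Parses DOCUMENT with the catalog resolver attached; returns its text.
    static String parseWithCatalog() {
        try {
            // Steps 1 and 2: local copy of the DTD plus a catalog next to it.
            Path dir = Files.createTempDirectory("catalog-demo");
            Files.writeString(dir.resolve("xhtml1-strict.dtd"), LOCAL_DTD);
            Path catalog = dir.resolve("catalog.xml");
            Files.writeString(catalog, CATALOG);

            // Steps 3 and 4: build a resolver that knows where the catalog is.
            CatalogResolver resolver = CatalogManager.catalogResolver(
                CatalogFeatures.defaults(), catalog.toUri());

            // Step 5: attach the resolver to the XML reader.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(resolver);

            StringBuilder text = new StringBuilder();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            reader.parse(new InputSource(new StringReader(DOCUMENT)));
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWithCatalog());
    }
}
```

If the redirect works, the parser never touches the network and the entity from the local file shows up in the parsed text. With the Apache library, the attachment step should look much the same: reader.setEntityResolver(new CatalogResolver()), using the CatalogResolver class from org.apache.xml.resolver.tools.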


This is helpfully pointed out and very well explained here.