tomp.xml.include
Class XIncludeFilter

java.lang.Object
  extended by tomp.xtcl.filter.XTFilterImpl
      extended by tomp.xml.include.XIncludeFilter
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler, org.xml.sax.ext.LexicalHandler, Parametrized, org.xml.sax.XMLFilter, org.xml.sax.XMLReader, XTFilter

public class XIncludeFilter
extends XTFilterImpl

This is a SAX filter which resolves all XInclude include elements before passing them on to the client application. Currently this class has the following known deviation from the XInclude specification:

  1. XPointer is not supported.

Extensions made by JP and TP:
The URL of the included text document (i.e. if parse='text') can take one of the following forms (lines are numbered starting at 1):

TO-DO: check whether the xml:base-s really work for file: URLs

Extensions made by TP, Nov 2003:
Instead of specifying the start/end line and/or the line count numerically, a Java-style regexp pattern enclosed in slashes can be given in place of the first or last line number.
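
The regexp-based selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of this class: LineRangeSketch and select are hypothetical names, the patterns are passed in directly rather than parsed out of a URL, and it is assumed that a pattern only needs to match somewhere in a line (find() rather than matches()) and that both boundary lines are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineRangeSketch {

    /**
     * Hypothetical helper (not part of XIncludeFilter): returns the lines from
     * the first line matching firstRegexp up to and including the next line
     * matching lastRegexp. If no line matches lastRegexp, everything from the
     * first match to the end of the input is returned.
     */
    public static List<String> select(List<String> lines, String firstRegexp, String lastRegexp) {
        Pattern first = Pattern.compile(firstRegexp);
        Pattern last = Pattern.compile(lastRegexp);
        List<String> out = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            // Assumption: a partial match anywhere in the line is enough.
            if (!inside && first.matcher(line).find()) {
                inside = true;
            }
            if (inside) {
                out.add(line);
                // The last-line pattern is also tested on the line that opened the range.
                if (last.matcher(line).find()) {
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "// header", "int a = 1;", "// BEGIN", "int b = 2;", "// END", "int c = 3;");
        System.out.println(select(lines, "BEGIN", "END"));
    }
}
```

Whether the real implementation anchors the patterns, or includes the boundary lines, is not stated here and should be checked against the source.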

Furthermore, I would definitely use a new instance of this class for each document you want to process; I doubt a single instance can be used successfully on multiple documents. I can also virtually guarantee that this class is not thread safe. You have been warned.

Since this class is not designed to be subclassed, and since I have not yet considered how that might affect the methods herein or what other protected methods might be needed to support subclasses, I have declared this class final. I may remove this restriction later, though the use-case for subclassing is weak. This class is designed to have its functionality extended via a horizontal chain of filters, not a vertical hierarchy of sub- and superclasses.
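
To illustrate what a horizontal chain of filters means in SAX terms, here is a minimal sketch that uses only the standard org.xml.sax.helpers.XMLFilterImpl. The class FilterChainSketch, the UpperCaseText filter and the run method are hypothetical names invented for this example; the plain XMLFilterImpl merely stands in at the position where an XIncludeFilter would presumably sit.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class FilterChainSketch {

    /** Hypothetical downstream filter: upper-cases all character data. */
    static class UpperCaseText extends XMLFilterImpl {
        UpperCaseText(XMLReader parent) {
            super(parent);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    /** Parses xml through a two-filter chain and returns the collected text. */
    public static String run(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        // Pass-through filter; a stand-in for where the include filter would go.
        XMLFilterImpl first = new XMLFilterImpl(parser);
        // Extra behaviour is added by chaining another filter, not by subclassing the first.
        UpperCaseText second = new UpperCaseText(first);

        final StringBuilder text = new StringBuilder();
        second.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        second.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<a>hello <b>world</b></a>"));
    }
}
```

Each filter takes the previous reader as its parent, so new behaviour is added by appending a link to the chain rather than by extending an existing filter class.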

To use this class:

  1. Construct an XIncludeFilter object with a known base URL
  2. Pass the XMLReader object from which the raw document will be read to the setParent() method of this object.
  3. Pass your own ContentHandler object to the setContentHandler() method of this object. This is the object which will receive events from the parsed and included document.
  4. Optional: if you wish to receive comments, set your own LexicalHandler object as the value of this object's http://xml.org/sax/properties/lexical-handler property. Also make sure your LexicalHandler asks this object for the status of each comment using insideIncludeElement before doing anything with the comment.
  5. Pass the URL of the document to read to this object's parse() method

e.g.

  XIncludeFilter includer = new XIncludeFilter(base);
  includer.setParent(parser);
  includer.setContentHandler(new SAXXIncluder(System.out));
  includer.parse(args[i]);
  


Field Summary
protected static java.lang.String REGEXP_DELIMITERS
          the delimiters for regular expressions used to specify the first or last line
protected static java.lang.String term1
          the number of the first line to be included is specified in the URL after the term1 String
protected static java.lang.String term2
          the number of lines to be included is specified in the URL after the term2 String
protected static java.lang.String term3
          the number of the last line to be included is specified in the URL after the term3 String
static java.lang.String XINCLUDE_NAMESPACE
           
 
Fields inherited from class tomp.xtcl.filter.XTFilterImpl
contentHandler, dtdHandler, entityResolver, errorHandler, lexicalHandler, locator, parent
 
Constructor Summary
XIncludeFilter()
           
 
Method Summary
 void endDocument()
          Filter an end document event.
 void endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName)
          Filter an end element event.
 void endPrefixMapping(java.lang.String prefix)
          Filter an end Namespace prefix mapping event.
protected static java.lang.String getFirstLineRegexp(java.lang.String url)
           This method reads URL and returns the regular expression to match on the first line.
protected static java.lang.String getLastLineRegexp(java.lang.String url)
           This method reads URL and returns the regular expression to match on the last line.
protected  int getLineBegin(java.lang.String url)
           This method reads the URL and returns the number of the first line to read.
protected  int getLineCount(java.lang.String url)
           This method reads the URL and returns the number of lines to read.
protected  org.xml.sax.XMLReader getXMLReader(java.lang.String variant)
           
 void characters(char[] ch, int start, int length)
          Filter a character data event.
 void ignorableWhitespace(char[] ch, int start, int length)
          Filter an ignorable whitespace event.
protected  void includeXMLDocument(java.lang.String url, java.lang.String variant)
           This utility method reads a document at a specified URL and fires off calls to various ContentHandler methods.
 boolean insideIncludeElement()
           This utility method returns true if and only if this reader is currently inside a non-empty include element.
protected  boolean isLastLineNo(java.lang.String url)
           
 void processingInstruction(java.lang.String target, java.lang.String data)
          Filter a processing instruction event.
 void setDocumentLocator(org.xml.sax.Locator locator)
          Filter a new document locator event.
 void skippedEntity(java.lang.String name)
          Filter a skipped entity event.
 void startDocument()
          Filter a start document event.
 void startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts)
          Filter a start element event.
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
          Filter a start Namespace prefix mapping event.
 
Methods inherited from class tomp.xtcl.filter.XTFilterImpl
comment, endCDATA, endDTD, endEntity, error, fatalError, getContentHandler, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getParent, getProperty, notationDecl, parse, parse, resolveEntity, setContentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setParameter, setParent, setProperty, setupParse, startCDATA, startDTD, startEntity, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REGEXP_DELIMITERS

protected static final java.lang.String REGEXP_DELIMITERS
the delimiters for regular expressions used to specify the first or last line

See Also:
Constant Field Values

term1

protected static final java.lang.String term1
the number of the first line to be included is specified in the URL after the term1 String

See Also:
Constant Field Values

term2

protected static final java.lang.String term2
the number of lines to be included is specified in the URL after the term2 String

See Also:
Constant Field Values

term3

protected static final java.lang.String term3
the number of the last line to be included is specified in the URL after the term3 String

See Also:
Constant Field Values

XINCLUDE_NAMESPACE

public static final java.lang.String XINCLUDE_NAMESPACE
See Also:
Constant Field Values
Constructor Detail

XIncludeFilter

public XIncludeFilter()
Method Detail

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
Description copied from class: XTFilterImpl
Filter a new document locator event.

Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler
Overrides:
setDocumentLocator in class XTFilterImpl
Parameters:
locator - The document locator.

insideIncludeElement

public boolean insideIncludeElement()

This utility method returns true if and only if this reader is currently inside a non-empty include element. (This is not the same as being inside the node set which replaces the include element.) This is primarily needed for comments inside include elements. It must be checked by the actual LexicalHandler to see whether a comment is passed or not.
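
A LexicalHandler that honours this check might look roughly like the sketch below. CommentGateSketch is a hypothetical class, and a BooleanSupplier stands in for the filter so that the example compiles without this library; with the real class one would presumably pass a reference to insideIncludeElement of the filter instance and register the handler through the http://xml.org/sax/properties/lexical-handler property.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import org.xml.sax.ext.LexicalHandler;

public class CommentGateSketch implements LexicalHandler {

    // Stand-in for the filter's insideIncludeElement() method (assumption).
    private final BooleanSupplier insideInclude;
    private final List<String> comments = new ArrayList<>();

    public CommentGateSketch(BooleanSupplier insideInclude) {
        this.insideInclude = insideInclude;
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        // Comments that sit inside an include element are dropped.
        if (insideInclude.getAsBoolean()) {
            return;
        }
        comments.add(new String(ch, start, length));
    }

    /** Returns the comments that were let through, in document order. */
    public List<String> comments() {
        return comments;
    }

    // The remaining LexicalHandler events are not needed for this sketch.
    @Override public void startDTD(String name, String publicId, String systemId) { }
    @Override public void endDTD() { }
    @Override public void startEntity(String name) { }
    @Override public void endEntity(String name) { }
    @Override public void startCDATA() { }
    @Override public void endCDATA() { }

    public static void main(String[] args) {
        boolean[] inside = { true };
        CommentGateSketch handler = new CommentGateSketch(() -> inside[0]);
        char[] dropped = "inside include".toCharArray();
        handler.comment(dropped, 0, dropped.length);
        inside[0] = false;
        char[] kept = "ordinary comment".toCharArray();
        handler.comment(kept, 0, kept.length);
        System.out.println(handler.comments());
    }
}
```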

Returns:
boolean

startElement

public void startElement(java.lang.String uri,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes atts)
                  throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a start element event.

Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class XTFilterImpl
Parameters:
uri - The element's Namespace URI, or the empty string.
localName - The element's local name, or the empty string.
qName - The element's qualified (prefixed) name, or the empty string.
atts - The element's attributes.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

endElement

public void endElement(java.lang.String uri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter an end element event.

Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class XTFilterImpl
Parameters:
uri - The element's Namespace URI, or the empty string.
localName - The element's local name, or the empty string.
qName - The element's qualified (prefixed) name, or the empty string.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a start document event.

Specified by:
startDocument in interface org.xml.sax.ContentHandler
Overrides:
startDocument in class XTFilterImpl
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter an end document event.

Specified by:
endDocument in interface org.xml.sax.ContentHandler
Overrides:
endDocument in class XTFilterImpl
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a start Namespace prefix mapping event.

Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
startPrefixMapping in class XTFilterImpl
Parameters:
prefix - The Namespace prefix.
uri - The Namespace URI.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
                      throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter an end Namespace prefix mapping event.

Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
endPrefixMapping in class XTFilterImpl
Parameters:
prefix - The Namespace prefix.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a character data event.

Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class XTFilterImpl
Parameters:
ch - An array of characters.
start - The starting position in the array.
length - The number of characters to use from the array.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter an ignorable whitespace event.

Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class XTFilterImpl
Parameters:
ch - An array of characters.
start - The starting position in the array.
length - The number of characters to use from the array.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a processing instruction event.

Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Overrides:
processingInstruction in class XTFilterImpl
Parameters:
target - The processing instruction target.
data - The text following the target.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

skippedEntity

public void skippedEntity(java.lang.String name)
                   throws org.xml.sax.SAXException
Description copied from class: XTFilterImpl
Filter a skipped entity event.

Specified by:
skippedEntity in interface org.xml.sax.ContentHandler
Overrides:
skippedEntity in class XTFilterImpl
Parameters:
name - The name of the skipped entity.
Throws:
org.xml.sax.SAXException - The client may throw an exception during processing.

getFirstLineRegexp

protected static java.lang.String getFirstLineRegexp(java.lang.String url)

This method reads URL and returns the regular expression to match on the first line.

Parameters:
url - URL of the document that will be read
Returns:
the regular expression to match on the first line.

getLastLineRegexp

protected static java.lang.String getLastLineRegexp(java.lang.String url)

This method reads URL and returns the regular expression to match on the last line.

Parameters:
url - URL of the document that will be read
Returns:
the regular expression to match on the last line.

getLineBegin

protected int getLineBegin(java.lang.String url)

This method reads the URL and returns the number of the first line to read.

Parameters:
url - URL of the document that will be read
Returns:
int

getLineCount

protected int getLineCount(java.lang.String url)

This method reads the URL and returns the number of lines to read.

Parameters:
url - URL of the document that will be read
Returns:
int

isLastLineNo

protected boolean isLastLineNo(java.lang.String url)

includeXMLDocument

protected void includeXMLDocument(java.lang.String url,
                                  java.lang.String variant)
                           throws org.xml.sax.SAXException

This utility method reads a document at a specified URL and fires off calls to various ContentHandler methods. It is used to include files with parse="xml".

Parameters:
url - URL of the document that will be read
Returns:
void
Throws:
org.xml.sax.SAXException - if the requested document cannot be downloaded from the specified URL.

getXMLReader

protected org.xml.sax.XMLReader getXMLReader(java.lang.String variant)
                                      throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException