net.sf.tomp.xml.include
Class XIncludeFilter

java.lang.Object
  extended by net.sf.tomp.xtcl.filter.XTFilterImpl
      extended by net.sf.tomp.xml.include.XIncludeFilter
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler, org.xml.sax.ext.LexicalHandler, net.sf.tomp.general.Parametrized, org.xml.sax.XMLFilter, org.xml.sax.XMLReader, net.sf.tomp.xtcl.filter.XTFilter

public class XIncludeFilter
extends net.sf.tomp.xtcl.filter.XTFilterImpl

Copyright 2003 Jan Pavlovic and Tomas Pitner, Masaryk University in Brno, Czech Republic. All rights reserved. This code is based on work copyright 2001-2003 Elliotte Rusty Harold. All rights reserved.

This is a SAX filter which resolves all XInclude include elements before passing them on to the client application. Currently this class has the following known deviation from the XInclude specification:

  1. XPointer is not supported.

Extensions made by JP and TP:
The URL of an included text document (i.e. one with parse='text') can take one of the following forms (lines are numbered starting at 1):

TO-DO: check whether the xml:base-s really work for file: URLs

Extensions made by TP, Nov 2003:
Instead of specifying the start/end line and/or the line count numerically, a Java-style regexp pattern enclosed in slashes can be used in place of the first or last line number.
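The idea of selecting a line range by patterns rather than numbers can be sketched as follows. This is only an illustration, not the filter's actual code: the class RegexLineRangeSketch and its select method are hypothetical, the URL syntax is not modelled at all, and it is an assumption that a pattern only needs to match somewhere in a line and that both boundary lines are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of XIncludeFilter. */
final class RegexLineRangeSketch {

    /**
     * Returns the lines from the first one in which firstRegexp is found
     * up to and including the next one in which lastRegexp is found.
     * If lastRegexp is never found, the rest of the input is returned.
     */
    static List<String> select(List<String> lines, String firstRegexp, String lastRegexp) {
        Pattern first = Pattern.compile(firstRegexp);
        Pattern last = Pattern.compile(lastRegexp);
        List<String> selected = new ArrayList<>();
        boolean inside = false;
        for (String line : lines) {
            // Assumption: a partial match (find) is enough to mark a boundary line.
            if (!inside && first.matcher(line).find()) {
                inside = true;
            }
            if (inside) {
                selected.add(line);
                if (last.matcher(line).find()) {
                    break; // the last line is included, then we stop
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "package demo;",
                "// BEGIN snippet",
                "int x = 1;",
                "// END snippet",
                "int y = 2;");
        select(source, "BEGIN", "END").forEach(System.out::println);
    }
}
```

Passing the patterns as plain arguments keeps the sketch independent of how the real filter extracts them from the URL (see getFirstLineRegexp and getLastLineRegexp).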

Use a new instance of this class for each document you want to process; I doubt a single instance can be reused successfully across multiple documents. I can also virtually guarantee that this class is not thread safe. You have been warned.

Since this class is not designed to be subclassed, and since I have not yet considered how that might affect the methods herein or what other protected methods might be needed to support subclasses, I have declared this class final. I may remove this restriction later, though the use-case for subclassing is weak. This class is designed to have its functionality extended via a horizontal chain of filters, not a vertical hierarchy of sub- and superclasses.

To use this class:

  1. Construct an XIncludeFilter object with a known base URL
  2. Pass the XMLReader object from which the raw document will be read to the setParent() method of this object.
  3. Pass your own ContentHandler object to the setContentHandler() method of this object. This is the object which will receive events from the parsed and included document.
  4. Optional: if you wish to receive comments, set your own LexicalHandler object as the value of this object's http://xml.org/sax/properties/lexical-handler property. Also make sure your LexicalHandler asks this object for the status of each comment using insideIncludeElement before doing anything with the comment.
  5. Pass the URL of the document to read to this object's parse() method

e.g.


 XIncludeFilter includer = new XIncludeFilter(base);
 includer.setParent(parser);
 includer.setContentHandler(new SAXXIncluder(System.out));
 includer.parse(args[i]);
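
The wiring in steps 2 to 5, including the optional lexical-handler property, can be tried with JDK classes alone. In the sketch below org.xml.sax.helpers.XMLFilterImpl is only a stand-in for XIncludeFilter, so no inclusion takes place; the class FilterWiringSketch and its Recorder handler are hypothetical names introduced for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical wiring demo; XMLFilterImpl stands in for XIncludeFilter. */
final class FilterWiringSketch {

    /** Records element starts and comments in document order. */
    static final class Recorder extends DefaultHandler2 {
        final StringBuilder log = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            log.append('<').append(qName).append('>');
        }

        @Override
        public void comment(char[] ch, int start, int length) {
            log.append("<!--").append(ch, start, length).append("-->");
        }
    }

    static String run(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        XMLFilterImpl filter = new XMLFilterImpl(); // stand-in for the include filter
        filter.setParent(parser);                   // step 2: the raw document source
        Recorder recorder = new Recorder();
        filter.setContentHandler(recorder);         // step 3: receives the content events
        // Step 4 (optional): receive comments. The stand-in simply forwards this
        // property to its parent parser.
        filter.setProperty("http://xml.org/sax/properties/lexical-handler", recorder);
        filter.parse(new InputSource(new StringReader(xml))); // step 5
        return recorder.log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc><!--note--><item/></doc>"));
    }
}
```

With the real class, the filter object in this sketch would be replaced by an XIncludeFilter instance and the rest of the wiring would presumably stay the same.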
 
 

Version:
1.0, April 7, 2004
Author:
Elliotte Rusty Harold, Tomas Pitner, Jan Pavlovic

Field Summary
protected static java.lang.String REGEXP_DELIMITERS
          the delimiters for regular expressions used to specify the first or last line
protected static java.lang.String term1
          the number of the first line to be included is specified in the URL after the term1 String
protected static java.lang.String term2
          the number of lines to be included is specified in the URL after the term2 String
protected static java.lang.String term3
          the number of the last line to be included is specified in the URL after the term3 String
 
Fields inherited from class net.sf.tomp.xtcl.filter.XTFilterImpl
contentHandler, dtdHandler, entityResolver, errorHandler, lexicalHandler, locator, parent
 
Constructor Summary
XIncludeFilter()
           
 
Method Summary
 void endDocument()
           
 void endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName)
           
 void endPrefixMapping(java.lang.String prefix)
           
protected static java.lang.String getFirstLineRegexp(java.lang.String url)
           Reads the URL and returns the regular expression that identifies the first line.
protected static java.lang.String getLastLineRegexp(java.lang.String url)
           Reads the URL and returns the regular expression that identifies the last line.
protected  int getLineBegin(java.lang.String url)
           Reads the URL and returns the number of the first line to read.
protected  int getLineCount(java.lang.String url)
           Reads the URL and returns the number of lines to read.
 java.lang.String getXIncludeNamespace()
           
 org.xml.sax.XMLReader getXMLReader(java.lang.String variant)
           
 void characters(char[] ch, int start, int length)
           
 void ignorableWhitespace(char[] ch, int start, int length)
           
protected  void includeXMLDocument(java.lang.String url, java.lang.String variant)
           This utility method reads a document at a specified URL and fires off calls to various ContentHandler methods.
 boolean insideIncludeElement()
           This utility method returns true if and only if this reader is currently inside a non-empty include element.
protected  boolean isLastLineNo(java.lang.String url)
           
 void processingInstruction(java.lang.String target, java.lang.String data)
           
 void setDocumentLocator(org.xml.sax.Locator locator)
           
 void skippedEntity(java.lang.String name)
           
 void startDocument()
           
 void startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts)
           
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
           
 
Methods inherited from class net.sf.tomp.xtcl.filter.XTFilterImpl
comment, endCDATA, endDTD, endEntity, error, fatalError, getContentHandler, getDocumentLocator, getDTDHandler, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getParent, getProperty, notationDecl, parse, parse, resolveEntity, setContentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setParameter, setParent, setProperty, setupParse, startCDATA, startDTD, startEntity, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REGEXP_DELIMITERS

protected static final java.lang.String REGEXP_DELIMITERS
the delimiters for regular expressions used to specify the first or last line

See Also:
Constant Field Values

term1

protected static final java.lang.String term1
the number of the first line to be included is specified in the URL after the term1 String

See Also:
Constant Field Values

term2

protected static final java.lang.String term2
the number of lines to be included is specified in the URL after the term2 String

See Also:
Constant Field Values

term3

protected static final java.lang.String term3
the number of the last line to be included is specified in the URL after the term3 String

See Also:
Constant Field Values
Constructor Detail

XIncludeFilter

public XIncludeFilter()
Method Detail

getXIncludeNamespace

public java.lang.String getXIncludeNamespace()

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)

insideIncludeElement

public boolean insideIncludeElement()

This utility method returns true if and only if this reader is currently inside a non-empty include element. (This is not the same as being inside the node set which replaces the include element.) This is primarily needed for comments inside include elements. It must be checked by the actual LexicalHandler to see whether a comment is passed or not.
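
A LexicalHandler that performs this check might look like the following sketch. The class CommentGateSketch is hypothetical, and the filter is represented only by a BooleanSupplier so that the example compiles without this library; with the real class one would presumably pass includer::insideIncludeElement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical LexicalHandler that drops comments found inside include elements. */
final class CommentGateSketch extends DefaultHandler2 {

    // Stand-in for the filter's insideIncludeElement() method.
    private final BooleanSupplier insideInclude;

    /** Comments that were accepted, in document order. */
    final List<String> kept = new ArrayList<>();

    CommentGateSketch(BooleanSupplier insideInclude) {
        this.insideInclude = insideInclude;
    }

    @Override
    public void comment(char[] ch, int start, int length) {
        if (insideInclude.getAsBoolean()) {
            return; // the comment belongs to an include element: ignore it
        }
        kept.add(new String(ch, start, length));
    }

    public static void main(String[] args) {
        boolean[] inside = { true };
        CommentGateSketch gate = new CommentGateSketch(() -> inside[0]);
        char[] hidden = "inside include".toCharArray();
        gate.comment(hidden, 0, hidden.length);
        inside[0] = false;
        char[] shown = "ordinary comment".toCharArray();
        gate.comment(shown, 0, shown.length);
        System.out.println(gate.kept);
    }
}
```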

Returns:
true if this reader is currently inside a non-empty include element, false otherwise

startElement

public void startElement(java.lang.String uri,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes atts)
                  throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

endElement

public void endElement(java.lang.String uri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
                      throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

skippedEntity

public void skippedEntity(java.lang.String name)
                   throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

getFirstLineRegexp

protected static java.lang.String getFirstLineRegexp(java.lang.String url)

Reads the URL and returns the regular expression that identifies the first line.

Parameters:
url - URL of the document that will be read
Returns:
the regular expression to match on the first line.

getLastLineRegexp

protected static java.lang.String getLastLineRegexp(java.lang.String url)

Reads the URL and returns the regular expression that identifies the last line.

Parameters:
url - URL of the document that will be read
Returns:
the regular expression to match on the last line.

getLineBegin

protected int getLineBegin(java.lang.String url)

Reads the URL and returns the number of the first line to read.

Parameters:
url - URL of the document that will be read
Returns:
the number of the first line to read

getLineCount

protected int getLineCount(java.lang.String url)

Reads the URL and returns the number of lines to read.

Parameters:
url - URL of the document that will be read
Returns:
the number of lines to read

isLastLineNo

protected boolean isLastLineNo(java.lang.String url)

includeXMLDocument

protected void includeXMLDocument(java.lang.String url,
                                  java.lang.String variant)
                           throws org.xml.sax.SAXException

This utility method reads a document at a specified URL and fires off calls to various ContentHandler methods. It is used to include files with parse="xml".

Parameters:
url - URL of the document that will be read
variant - DOCUMENT ME!
Throws:
org.xml.sax.SAXException - if the requested document cannot be downloaded from the specified URL.

getXMLReader

public org.xml.sax.XMLReader getXMLReader(java.lang.String variant)
                                   throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException


Copyright © 2004 Masaryk University in Brno, Faculty of Informatics, Czech Republic. All Rights Reserved.