3D PLM Enterprise Architecture |
Middleware Abstraction |
Fetching an External Entity with SAXUsing a SAX entity resolver to fetch an external entity |
Use Case |
AbstractThis article shows how to specify an entity resolver and how to use it with a SAX parser. Entity resolvers are invoked by the XML parser to obtain the source of external entities they cannot find by themselves. |
This use case shows how to specify an entity resolver and how to use it with a SAX parser. You will learn to create two different kinds of event handlers for SAX: error handlers are invoked to process errors raised by the parser if the XML input is not well-formed or is invalid; entity resolvers are invoked when the parser encounters an external entity reference it cannot find by itself. You will learn how to register your handlers with the parser. Finally, the use case will show you how to run the SAX parser configurated with your handlers to parse a file with validation.
[Top]
The CAAXMLSAXResolver Use Case is a use case of the CAAXMLParser.edu framework that illustrates XMLParser framework capabilities.
[Top]
This use case reads an existing XML file using SAX and validates it with a DTD stored in a database. The SAX parser has to use an entity resolver to fetch the DTD from the database. For the sake of simplicity, the database is just a directory somewhere on the disk. The system ID of the DTDs must begin with the string: "sql://". If the document is not well formed or invalid, the program displays an error message in the console.
[Top]
To launch CAAXMLSAXResolver, you will need to set up the build time environment, then compile CAAXMLSAXResolver along with its prerequisites, set up the run time environment, and then execute the use case [1].
The use case should be launched as follows from the command line:
CAAXMLSAXResolver <databasedir> <filepath>
where <databasedir>
is the path of the directory,
which contains the DTDs and <filepath>
is the path of
the XML file, which will be parsed.
A sample XML file and a sample DTD are provided with the use case. To use them, launch the following command from the command line:
Windows | CAAXMLSAXResolver
InstallRoot\OS\resources\xml\CAAXMLSAXResolver\database
InstallRoot\OS\resources\xml\CAAXMLSAXResolver\car.xml |
Unix | CAAXMLSAXResolver
InstallRoot/OS/resources/xml/CAAXMLSAXResolver/database
InstallRoot/OS/resources/xml/CAAXMLSAXResolver/car.xml |
where:
InstallRoot
is the directory in which you have
installed the run time part or the product lineOS
is the directory containing the installed code
aix_a
for 32-bit AIXhpux_b
for HP-UXsolaris_a
for Solarisintel_a
for 32-bit Windowswin_b64
for 64-bit Windows[Top]
The CAAXMLSAXResolver use case is made of several classes located in the CAAXMLSAXResolver.m module of the CAAXMLParser.edu framework:
Windows | InstallRootDirectory\CAAXMLParser.edu\CAAXMLSAXResolver.m\ |
Unix | InstallRootDirectory/CAAXMLParser.edu/CAAXMLSAXResolver.m/ |
where InstallRootDirectory
is the directory where the
CAA CD-ROM is installed.
[Top]
To create a SAX parser, create and register event handlers with this parser, and parse a file, there are six main steps:
# |
Step |
---|---|
1 | Implement a V5 Entity Resolver and Error Handler component |
2 | Create a V5 SAX Component |
3 | Create and Configure a V5 SAX Parser |
4 | Create the Entity Resolver and Error Handler Components and Register Them With the Parser |
5 | Parse the XML File |
6 | Manage Errors |
[Top]
The SAX API uses an event-oriented API to process XML documents. The
XML SAX parser reads XML documents sequentially and invokes callback
functions to indicate the XML construct it comes across. Each invocation
is called a SAX event. The SAX API defines V5 interfaces, which specify
the signature of the SAX callback functions and group them per theme.
The CATISAXEntityResolver interface defines the ResolveEntity
function. This function is invoked by the parser when it encounters an
external entity reference it cannot find by itself. The function must
fetch the external entity and return it to the parser in the form of a
SAX input source. Other SAX interfaces (CATISAXDTDHandler, CATISAXErrorHandler,
CATISAXDocumentHandler) define additional events. To make the
work easier for the developer, the SAX API provides a CATSAXHandlerBase
component, which already provides an empty implementation of all the SAX
interfaces. .
Therefore, to write a SAX document handler, all you need to do is to create a new V5 component which inherits from CATSAXHandlerBase and override the methods to answer to the events, which are relevant to your application. The following code declares and defines a CAAXMLSAXResolverHandlers V5 component which inherits from CATSAXHandlerBase and re-implements CATISAXEntityResolver and CATISAXErrorHandler.
// CAAXMLSAXResolverHandlers.h #include "CATSAXHandlerBase.h" class CAAXMLSAXResolverHandlers : public CATSAXHandlerBase { CATDeclareClass; public: ... // Override the default implementation of the // CATISAXEntityResolver interface. virtual HRESULT ResolveEntity( const CATUnicodeString & iPublicId, const CATUnicodeString & iSystemId, CATISAXInputSource_var& oInputSource); // Override part of the default implementation of the // CATISAXErrorHandler interface. virtual HRESULT Error( CATSAXParseException* iException); virtual HRESULT FatalError( CATSAXParseException* iException); ... }; |
// CAAXMLSAXResolverHandlers.cpp #include "CAAXMLSAXResolverHandlers.h" // Declare the class as a V5 component derived from CATSAXHandlerBase CATImplementClass( CAAXMLSAXResolverHandlers, Implementation, CATSAXHandlerBase, CATnull ); // Implement the CATISAXEntityResolver interface #include "TIE_CATISAXEntityResolver.h" TIE_CATISAXEntityResolver(CAAXMLSAXResolverHandlers); // Implement the CATISAXErrorHandler interface #include "TIE_CATISAXErrorHandler.h" TIE_CATISAXErrorHandler(CAAXMLSAXResolverHandlers); |
The next step is to provide an implementation for each of the SAX
events you want to catch. The following code shows how the ResolveEntity
event callback function is implemented. This method receives the name of
the external entity to resolve as a CATUnicodeString. In the
use case, the method will be invoked when the parser reads the document
type declaration an cannot find the entity called "sql://automotive.dtd"
.
<!DOCTYPE car SYSTEM "sql://automotive.dtd"> |
The method must return the source of the corresponding entity in the form of a CATISAXInputSource object. In this case, the entity consist of a DTD file stored in the database directory of the disk.
// CAAXMLSAXResolverHandlers.cpp HRESULT CAAXMLSAXResolverHandlers::ResolveEntity( const CATUnicodeString & iPublicId, const CATUnicodeString & iSystemId, CATISAXInputSource_var& oInputSource) { ... // All system IDs must begin with this prefix CATUnicodeString prefix = "sql://"; CATIXMLSAXFactory_var factory; HRESULT hr2 = ::CreateCATIXMLSAXFactory(factory); if (SUCCEEDED(hr2) && (factory != NULL_var)) { // Compute the file path where the DTD is located. CATUnicodeString filePath = _databaseDir; #ifdef _WINDOWS_SOURCE filePath.Append("\\"); #else // _WINDOWS_SOURCE filePath.Append("/"); #endif // _WINDOWS_SOURCE filePath.Append(iSystemId.SubString(prefix.GetLengthInChar(), iSystemId.GetLengthInChar() - prefix.GetLengthInChar())); // Create a SAX input source hr = factory->CreateInputSourceFromFile(filePath, "", oInputSource); } ... } |
To create an input source for this file, invoke the CreateInputSourceFromFile
of the CATIXMLSAXFactory object. To get a CATIXMLSAXFactory
object, use the global function CreateCATIXMLSAXFactory (more on
this in the next section).
[Top]
... // CAAXMLSAXResolverMain.cpp CATIXMLSAXFactory_var factory; HRESULT hr = ::CreateCATIXMLSAXFactory(factory); ... |
To work with SAX, you need to instantiate the V5 SAX component. The
V5 SAX component can be created by calling the CreateCATIXMLSAXFactory
global function. This function returns a V5 handler on the CATIXMLSAXFactory
interface, which is the main interface for the V5 SAX component. Using
this interface you will be able to create SAX1 and SAX2 parsers and to
create input source to feed XML to the parser. Note that the code above
does not specify the CLSID of the component to use, so the default SAX
component (XML4C3) will be used. See [3] and [4] if you want to use another V5 SAX component.
[Top]
CATISAXParser_var parser; hr = factory->CreateParser(parser); ... |
To create a SAX1 parser, one simply invokes the CreateParser
on the CATIXMLSAXFactory object. There are two kinds of SAX1
parsers, non-validating SAX1 parsers and validating SAX1 parsers. If you
do specify optional arguments to your CreateParser
call, a
validating parser will be created. See [3] and [4] if you want to use another V5 SAX component.
[Top]
The SAX1 parser created in the previous section is not yet usable as
it does not yet know any other objects to which it can send the events
it generates. The SAX1 parser can accept up to four event handlers (one
for each event interface), as shown in the diagram below.
... CAAXMLSAXResolverHandlers *handlersImpl = new CAAXMLSAXResolverHandlers(databaseDir); CATISAXEntityResolver_var entityHandler = handlersImpl; CATISAXErrorHandler_var errHandler = handlersImpl; handlersImpl->Release(); handlersImpl = NULL; |
To instantiate the entity resolver and the error handler you have
defined in the previous section, simply do a new
of the main
implementation class, then get interface handles of the right type on
the component.
... hr = parser->SetEntityResolver(entityHandler); ... hr = parser->SetErrorHandler(errHandler); ... |
To register your entityResolver handler, call the SetEntityResolver
method of the CATISAXParser interface. To register your error
handler, call the SetErrorHandler
method of the CATISAXParser
interface. Passing NULL_var
to these methods unregisters the
previously registered handlers.
[Top]
hr = parser->Parse(filePath); |
To parse the XML file, call the Parse
method of the CATISAXParser
interface. Pass the path of the file to read as a parameter. The method
will read the file from top to bottom and generate the corresponding
events, calling your event handlers for all the events you want to
manage.
[Top]
The XMLParser framework uses the HRESULT / CATError mechanism to
manage errors. Make sure to use the CATError::CATGetLastError
to obtain all the available error diagnostics when using XMLParser. More
information about V5 error management is available here [2]
and [4].
[Top]
This use case shows you how to parse XML documents using the SAX API.
[Top]
[1] | Building and Launching a CAA V5 Use Case |
[2] | Managing Errors Using HRESULT |
[3] | Using XML in V5 |
[4] | XML Tips and Tricks |
[Top] |
Version: 1 [May 2005] | Document created |
[Top] |
Copyright © 2005, Dassault Systèmes. All rights reserved.