Oracle-Developer.com
| Navigation:
Home | Discussion Forums (Get expert advice) |
Scripts |
About Us | Links | Job Openings
|
Oracle XML DB for Wireless
By V.J.
Jain, January 2007 (varun.jain@varunjaininc.com)
Varun Jain, Inc. (an Oracle Partner)
|
Oracle 11g was officially released this month and Oracle users have many reasons to be excited. One of these reasons is the recent enhancements to XML DB. For users who are tasked with managing very complex XML files, XML DB in 11g offers a full solution that is might be perfect for your organization. Recently, the trend in wireless is for all vendors to begin using standardized XML formats for data transmission. The XML schemas that define these files are generally very large, complex, and frequently changing while the scale of the data is tremendous. As organizations adapt to these new XML standards, they must choose a technology to manage this content. Oracle XML DB in 11g provides a solution that successfully meets the challenges of managing complex, dynamic, high scale XML content.
Oracle XML DB is an extremely useful technology that allows management of XML content in the database. XML DB offers different storage options to meet your specific situation and requirement. In previous releases, these options were either unstructured (CLOB) or structured storage (schema-based). With the introduction of 11g, binary XML storage has been added as an additional storage option. Binary XML is a format designed specifically for XML data and is compact and schema aware. Due to the volume of wireless data, the storage format should be structured XML storage.
Performance data is usually characterized by small but many files while configuration data is characterized as few but very large files (50-750MB). I have found that structured storage is almost always the best solution since wireless data is usually very content rich and comes in very high volumes.
Why is XML DB better for wireless data than a custom parser or a commercial off the shelf (COTS) ETL product?
This is literally the million dollar question. What advantage does Oracle XML DB have over a custom coded XML parser? Why should you invest in Oracle technologies for your content management instead of a company whose primary focus is ETL (such as Informatica)? To understand the answer, you need to understand wireless data.
Wireless data is content rich, highly dynamic, recursive, very high volume data that is a force to be reckoned with for any developer. Generally, the XML schemas for wireless data are very large and define many more elements than are actually being used in any file. Within the files, it is common to have many recursive references and self-referencing elements. As for scale, either the files are very large or there are very many of them. New versions of the XML schema are released often and it is not surprising to find hundreds of changes to each XML schema with every release. To make matters even worse, most vendors are divided by region and it is typical to have different regions on different versions at any given time. All of these characteristics play a part in adding to the complexity of any feasible solution.
In my experience, I have encountered UMTS specifications from certain vendors which use 3GPP standards. In addition to the 36 standard complex types and 108 standard elements, their vendor specified elements included over 600 complex types with over 1,800 elements. Each new release would add or remove hundreds of elements. Intensifying the problem, the different regions would have different versions of the XML schema at any given time.
With XML DB with Oracle 11g each of these concerns is addressed surprisingly well. Before discussing how XML DB meets these complex requirements, let’s consider the alternatives. A custom coded parser would be able to extract data for a given path. The problem with using custom code is that you would need to define the X-Path that you want to extract. The Wireless XML Schema definitions allow for such a high degree of flexibility and recursion that it is impossible to account for all of the possible X-Paths. Even if you manage to clearly define the paths that will be used, if new metrics are turned on, resulting in a new set of X-Paths within the same schema definition, you would need to update the custom code and then reparse all of the parsed data. This solution is not easy to manage since the XML schemas are designed to allow the optional use of almost all the elements within the definition. Since most of the elements are also recursive, the chance of collecting all possible paths within the same definition is very unlikely.
What about tools such as COTS tools? These tools are very useful when the standards for the file formats can be guaranteed. However, in the world of Wireless this is not usually the case. The COTS tools will have the same limitations as custom code since you will be required to manually change the extraction rules for any updates. In addition to maintenance, your solution using a COTS tool will be bounded by the performance limitations of that tool. Since wireless data is a unique form of non-transactional metrics, the commercial tools are likely to perform poorly. Since I have not evaluated each available tool, I cannot say with certainty whether the tool you are considering will work acceptably or not. However, my recommendation is that you try to use Oracle’s XML DB first since this technology has been included in the database and does not require an additional license.
So how do you use XML DB to meet your wireless content management requirements? One of the main problems with the other technologies is that the schema definition is very flexible and new X-paths can be used at any time within the same XML schema. With XML DB, this is not an issue since the XML files are loaded directly into the database. Once the file is in the object-relational structure, the any X-path can be queried without requiring any re-parsing. This allows the support for dynamic metrics and measurements within the schema definition. This also offers great handling for content rich files since the contents of the entire XML file is available by SQL queries.
Consider an XML Schema such as - 3GPP TS 32.615 v4.2 http://www.arib.or.jp/IMT-2000/V310Sep02/S3g/Rel4/32/32615-420.pdf
configData.xsd <?xml version="1.0" encoding="UTF-8"?> <schema xmlns:bc="configData.xsd" xmlns:xn="genericNrm.xsd" xmlns:es="VendorSpecificAttributes.1.0.xsd" xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="configData.xsd" elementFormDefault="qualified" attributeFormDefault="unqualified" xmlns:xdb="http://xmlns.oracle.com/xdb" xdb:storeVarrayAsTable="true"> <!-- Configuration data file root XML element --> <import namespace="genericNrm.xsd" schemaLocation="genericNrm.xsd"/> <import namespace=" VendorSpecificAttributes.1.0.xsd" schemaLocation=" VendorSpecificAttributes.1.0.xsd"/> <element name="bulkCmConfigDataFile" xdb:defaultTable="XML_DEFAULT"> <complexType> <sequence> <element name="fileHeader"> <complexType> <attribute name="fileFormatVersion" type="string" use="required"/> <attribute name="senderName" type="string" use="optional"/> <attribute name="vendorName" type="string" use="optional"/> </complexType> </element> <element name="configData" maxOccurs="unbounded"> <complexType> <choice> <element ref="xn:SubNetwork"/> <element ref="xn:MeContext"/> <!-- <element ref="xn:ManagedElement"/> Not Possible at this level in Ericsson Model --> </choice> <attribute name="dnPrefix" type="string" use="optional"/> </complexType> </element> <element name="fileFooter"> <complexType> <attribute name="dateTime" type="dateTime" use="required"/> </complexType> </element> </sequence> </complexType> </element> </schema>
genericNrm.xsd <schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:un="utranNrm.xsd" xmlns:gn="geranNrm.xsd" xmlns:xn="genericNrm.xsd" targetNamespace="genericNrm.xsd" elementFormDefault="qualified" attributeFormDefault="unqualified"> <import namespace="geranNrm.xsd" schemaLocation="geranNrm.xsd"/> <import namespace="utranNrm.xsd" schemaLocation="utranNrm.xsd"/> <!-- Abstract base type for all NRM class associated XML elements --> <complexType name="NrmClassXmlType" abstract="true"> <attribute name="id" type="string" use="required"/> <attribute name="modifier" use="optional"> <simpleType> <restriction base="string"> <enumeration value="create"/> <enumeration value="delete"/> <enumeration value="update"/> </restriction> </simpleType> </attribute> </complexType> <!-- Generic Network Resources IRP NRM class associated XML elements --> <element name="SubNetwork"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="userDefinedNetworkType" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:SubNetwork"/> <element ref="xn:ManagedElement"/> <element ref="xn:MeContext"/> <element ref="xn:ManagementNode"/> <element ref="xn:IRPAgent"/> <element ref="un:ExternalUtranCell"/> <element ref="gn:ExternalGsmCell"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="ManagedElement"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="managedElementType" minOccurs="0"/> <element name="userLabel" minOccurs="0"/> <element name="vendorName" minOccurs="0"/> <element name="userDefinedState" minOccurs="0"/> <element name="locationName" minOccurs="0"/> <element name="swVersion" minOccurs="0"/> <element name="managedBy" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="un:RncFunction"/> <element ref="un:NodeBFunction"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="MeContext"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType/> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:ManagedElement"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="ManagementNode"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="vendorName" minOccurs="0"/> <element name="userDefinedState" minOccurs="0"/> <element name="locationName" minOccurs="0"/> <element name="manages" minOccurs="0"/> <element name="swVersion" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:IRPAgent"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="IRPAgent"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="systemDN" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:NotificationIRP"/> <element ref="xn:AlarmIRP"/> <element ref="xn:BulkCmIRP"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="NotificationIRP"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="irpVersion" minOccurs="0"/> </all> </complexType> </element> </sequence> </extension> </complexContent> </complexType> </element> <element name="AlarmIRP"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="irpVersion" minOccurs="0"/> </all> </complexType> </element> </sequence> </extension> </complexContent> </complexType> </element> <element name="BulkCmIRP"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="irpVersion" minOccurs="0"/> </all> </complexType> </element> </sequence> </extension> </complexContent> </complexType> </element> <element name="VsDataContainer"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="vsDataType" minOccurs="0"/> <element name="vsDataFormatVersion" minOccurs="0"/> <element ref="xn:vsData" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <complexType name="vsData" abstract="true"/> <!-- VsDataContainer NRM class vsData attribute associated empty XML element --> <element name="vsData" type="xn:vsData" abstract="true"/> </schema>
geranNrm.xsd <schema xmlns:gn="geranNrm.xsd" xmlns:un="utranNrm.xsd" xmlns:xn="genericNrm.xsd" xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="geranNrm.xsd" elementFormDefault="qualified" attributeFormDefault="unqualified"> <!-- GERAN Network Resources IRP NRM class associated XML elements --> <import namespace="genericNrm.xsd" schemaLocation="genericNrm.xsd"/> <import namespace="utranNrm.xsd" schemaLocation="utranNrm.xsd"/> <element name="GsmRelation"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="adjacentCell" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="ExternalGsmCell"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="cellIdentity" minOccurs="0"/> <element name="bcchFrequency" minOccurs="0"/> <element name="ncc" minOccurs="0"/> <element name="bcc" minOccurs="0"/> <element name="lac" minOccurs="0"/> <element name="mcc" minOccurs="0"/> <element name="mnc" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> </schema>
utranNrm.xsd <schema xmlns:gn="geranNrm.xsd" xmlns:un="utranNrm.xsd" xmlns:xn="genericNrm.xsd" xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="utranNrm.xsd" elementFormDefault="qualified" attributeFormDefault="unqualified"> <import namespace="genericNrm.xsd" schemaLocation="genericNrm.xsd"/> <import namespace="geranNrm.xsd" schemaLocation="geranNrm.xsd"/> <element name="RncFunction"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="mcc" minOccurs="0"/> <element name="mnc" minOccurs="0"/> <element name="rncId" minOccurs="0"/> <!-- <element name="mncLength" minOccurs="0"/> --> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="un:UtranCell"/> <element ref="un:IubLink"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="NodeBFunction"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="nodeBFunctionIubLink" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="UtranCell"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="cId" minOccurs="0"/> <element name="localCellId" minOccurs="0"/> <element name="uarfcnUl" minOccurs="0"/> <element name="uarfcnDl" minOccurs="0"/> <element name="primaryScramblingCode" minOccurs="0"/> <element name="primaryCpichPower" minOccurs="0"/> <element name="maximumTransmissionPower" minOccurs="0"/> <element name="primarySchPower" minOccurs="0"/> <element name="secondarySchPower" minOccurs="0"/> <element name="bchPower" minOccurs="0"/> <element name="lac" minOccurs="0"/> <element name="rac" minOccurs="0"/> <element name="sac" minOccurs="0"/> <element name="utranCellIubLink" minOccurs="0"/>
</all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="un:UtranRelation"/> <element ref="gn:GsmRelation"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="IubLink"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="iubLinkUtranCell" minOccurs="0"/> <element name="iubLinkNodeBFunction" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="UtranRelation"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="adjacentCell" minOccurs="0"/> <!-- <element name="relationType" minOccurs="0"/>
--> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> <element name="ExternalUtranCell"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="cId" minOccurs="0"/> <element name="mcc" minOccurs="0"/> <element name="mnc" minOccurs="0"/> <element name="rncId" minOccurs="0"/> <element name="uarfcnUl" minOccurs="0"/> <element name="uarfcnDl" minOccurs="0"/> <element name="primaryScramblingCode" minOccurs="0"/> <element name="primaryCpichPower" minOccurs="0"/> <element name="lac" minOccurs="0"/> <element name="rac" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element> </schema>
VendorSpecificAttributes.1.0.xsd Not shown.
After adding annotations, register the schemas (here’s the method using the definition files in the repository with the trace on)
Once the XML schemas have been registered, you will be able to insert the XML file using one of the insertion methods (FTP, BFILE insert, and repository). I prefer using the repository.
After the file has been successfully inserted, data can be queried by using XML/SQL or X-Query on the default table, XML_DEFAULT. For example, to extract the information stored in the file header and footer,
As demonstrated, once the file is loaded into the database, we are able to query any subset of the available data. This is very useful when the XML schema is content rich and can contain many elements that we are not interested in. It is also very useful since different business users might be interested in different sets of values within the same XML file. Once the file is loaded, the users are able to query any subset of data that is available.
One of the limitations of XML DB in 10g was the schemaEvolve() procedure. This 10g procedure would allow changes to the XML schema definition at a significant cost. The process would lock the resources, create temporary tables, move all of the data from the XML table into the temporary tables, apply the schema changes, and then move the data back to the XML table. Given the volume of Wireless metrics, these XML tables would be very large tables. Depending on the size of the table, it might take hours to complete an evolution to add a single element. Newly introduced in 11g, inPlaceEvolve allows this same evolution to be an online operation that does not require data movement. This enhancement is crucial in addressing the frequently changing XML schema definitions in the realm of Wireless technologies.
Consider the situation when the XML table defined above has been loading files for several weeks. There might be hundreds of files consuming hundreds of gigabytes of disk space. Adding a simple element to one of the complex types in the schema definition would be a very costly task. In each version, we need to provide a list of the old XSD files and a corresponding list of new XSD files. For example, if we wanted to change the definition of UtranCell in utranNrm.xsd as such…
utranNrmRevised.xsd <element name="UtranCell"> <complexType> <complexContent> <extension base="xn:NrmClassXmlType"> <sequence> <element name="attributes" minOccurs="0"> <complexType> <all> <element name="userLabel" minOccurs="0"/> <element name="cId" minOccurs="0"/> <element name="localCellId" minOccurs="0"/> <element name="uarfcnUl" minOccurs="0"/> <element name="uarfcnDl" minOccurs="0"/> <element name="primaryScramblingCode" minOccurs="0"/> <element name="primaryCpichPower" minOccurs="0"/> <element name="maximumTransmissionPower" minOccurs="0"/> <element name="primarySchPower" minOccurs="0"/> <element name="secondarySchPower" minOccurs="0"/> <element name="bchPower" minOccurs="0"/> <element name="lac" minOccurs="0"/> <element name="rac" minOccurs="0"/> <element name="sac" minOccurs="0"/> <element name="utranCellIubLink" minOccurs="0"/> <!-- Add uraList for new version --> <element name="uraList" type="string" minOccurs="0"/> </all> </complexType> </element> <choice minOccurs="0" maxOccurs="unbounded"> <element ref="un:UtranRelation"/> <element ref="gn:GsmRelation"/> <element ref="xn:VsDataContainer"/> </choice> </sequence> </extension> </complexContent> </complexType> </element>
If the new files begin to include this previously undefined field, then the files will not be validated against the XML schema and they will not be inserted into our schema based default table. In 10g, the changes would be implemented by placing the new XSD into the repository and issuing the following
The command will not finish until all the files from the XML table are copied into temporary objects, the changes are applied to the XML schema, and then all the data is copied back into the XML table. This process can take hours just to add one simple element.
Starting in Oracle 11g, this same modification can be made without requiring any data movement.
Procedure inPlaceEvolve has these three parameters. (schemaURL IN VARCHAR2, diffXML IN XMLType, flags IN NUMBER := EVOLVE_PRESERVE_VALIDITY | EVOLVE_TRACE_ONLY);
SchemaURL is the XML schema you want to modify. diffXML is a specially formatted document that reflects the changes in the XML schema. The last is a flag controls the behavior of the procedure with respect to verifying the validity of documents. Here’s a sample of the script that dynamically generates the diffXML document and performs the inPlaceEvolve.
(Another great example is available at http://forums.oracle.com/forums/thread.jspa?threadID=318465)
The introduction of this new feature makes it feasible to manage large volume, frequently changing data without necessitating a large window for modifying the definition.
In conclusion, Oracle XML DB offers XML content management that is far more useful and practical than some of its alternatives. The approach of using object-relational storage greatly benefits XML schemas where the definition allows great flexibility in the documents. The new ability to modify the schema definition efficiently makes XML DB an ideal solution for highly dynamic content. These features meet the unique challenges of managing Wireless transmission metrics and provide a solution that is definitely worth evaluating.
About the authorV.J. Jain is an Oracle Database and Applications Consultant and owner of Varun Jain, Inc. - Oracle Consulting. With over 12 years of experience with database systems, he specializes in database performance, development, interfaces, and high performance solutions. Based in Orange County, California, he actively explores Oracle's newest technologies and is a member of the Oracle Partner Network and Beta program.
|
Owned and Operated by Varun Jain, Inc, an Oracle Consulting Firm
Copyright ©2007 Oracle-Developer.com