SQL Foundations

Lesson 11

Querying XML Database

In the coming modules, you will learn the specifics of inserting, querying, and working with the information in your databases and tables. Right now, take a deep breath and realize that this is just the beginning. Think about how SQL can be used to manipulate data stored in relational databases.

XML Database

An XML (eXtensible Markup Language) database is a data persistence software system that allows data to be specified, and sometimes stored, in XML format, and to be imported, queried, and exported as XML. When XML documents themselves are the fundamental unit of storage, the system is called a native XML database. Here are the key characteristics of an XML database:
  1. Data Storage: Unlike relational databases, XML databases often store data as XML documents. This is particularly useful for applications that work extensively with XML since data doesn't need to be converted into another format, such as rows and columns, before being stored.
  2. Document-oriented: XML databases are typically document-oriented. This means they are designed to store, retrieve, and manage document-oriented information, also known as semi-structured data.
  3. Schema Evolution: XML databases offer excellent support for schema evolution. As business requirements change, the structure of the data can also evolve without requiring all existing data to be modified.
  4. Hierarchical Data Structure: XML databases naturally support hierarchical or nested data structures, which can be more intuitive and flexible than the flat table structure used in relational databases.
  5. XPath/XQuery Support: XML databases generally support XPath for locating and processing items in an XML document, and XQuery, an XML query language, for extracting and manipulating data.
  6. Integration and Interoperability: XML is a widely accepted standard for data exchange. XML databases capitalize on this by providing excellent support for data integration tasks and interoperability with other systems.
  7. ACID Compliance: Similar to relational databases, many XML databases support ACID (Atomicity, Consistency, Isolation, Durability) properties for reliable processing of data.
  8. Support for Full Text Search: XML databases typically offer strong support for advanced text search capabilities over the XML data.
  9. Scalability: XML databases are designed to handle large amounts of data and to scale well with data size.

Keep in mind that not all XML databases will necessarily have all these features, as different XML databases may focus on different subsets of these characteristics based on their specific use cases. As with any technology, when choosing a database, it's important to consider the needs of the specific application and the strengths and weaknesses of the available tools.
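
To make the hierarchical structure and path-based querying described above concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath and is not an XML database. The sample catalog and its element names are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
CATALOG_XML = """
<catalog>
  <book id="b1">
    <title>Learning XML</title>
    <author>A. Writer</author>
    <price>29.95</price>
  </book>
  <book id="b2">
    <title>XQuery Basics</title>
    <author>B. Author</author>
    <author>C. Coauthor</author>
    <price>45.00</price>
  </book>
</catalog>
"""

root = ET.fromstring(CATALOG_XML)

# Path expression: the <title> child of every <book> under the root.
titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: the single book whose id attribute equals "b2".
book_b2 = root.find("./book[@id='b2']")

# Nested, repeating children need no extra join table: a book simply
# contains as many <author> elements as it has authors.
authors_b2 = [a.text for a in book_b2.findall("author")]

print(titles)
print(authors_b2)
```

The second book carries two `author` elements, which is the kind of nested, repeating data that would usually require a separate table in a flat relational design.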

Two major classes of XML database exist:

  1. XML-enabled: These map all XML to a traditional database (such as a relational database), accepting XML as input and rendering XML as output.
  2. Native XML (NXD): The internal model of such databases depends on XML and uses XML documents as the fundamental unit of storage.
Note: "XML-enabled" implies that the database does the conversion itself (as opposed to relying on middleware).
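
The XML-enabled approach can be sketched roughly as follows, assuming an in-memory SQLite database as the relational store. The `person` table, the `shred` and `render` helper names, and the sample XML are hypothetical; a real XML-enabled product performs this mapping internally.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document, used only for illustration.
INPUT_XML = (
    "<people>"
    "<person id='1'><name>Ada</name></person>"
    "<person id='2'><name>Grace</name></person>"
    "</people>"
)

def shred(conn, xml_text):
    """Accept XML as input and map each <person> element to one table row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person (id INTEGER PRIMARY KEY, name TEXT)"
    )
    for person in ET.fromstring(xml_text).findall("person"):
        conn.execute(
            "INSERT INTO person (id, name) VALUES (?, ?)",
            (int(person.get("id")), person.findtext("name")),
        )
    conn.commit()

def render(conn):
    """Read the rows back with SQL and render them as XML output."""
    root = ET.Element("people")
    for pid, name in conn.execute("SELECT id, name FROM person ORDER BY id"):
        person = ET.SubElement(root, "person", id=str(pid))
        ET.SubElement(person, "name").text = name
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
shred(conn, INPUT_XML)
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
output_xml = render(conn)
print(output_xml)
```

Between `shred` and `render` the data exists only as rows and columns, which is what distinguishes this class from a native XML database that keeps the document itself as the unit of storage.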

XQuery, Linked Data, and the Semantic Web

If you are working with RDF, whether as RSS feeds or as Semantic Web Linked Data, XQuery has something to offer you.
You can fairly easily generate RSS and RDF/XML with XQuery, of course. The query language for RDF is called SPARQL, and there is even a SPARQL query processor written in XQuery that is reported to be faster than many native SPARQL engines. SPARQL engines can produce results in XML, and those results can in turn be processed with XQuery. XML and RDF each have their own uses, as do SPARQL and XQuery, and you can use them together when it makes sense.
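
As a small illustration of SPARQL results being ordinary XML, the sketch below reads a result set in the W3C SPARQL Query Results XML Format. It uses Python's standard library in place of an XQuery processor, and the result set and the variable name `name` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical result set, as a SPARQL engine might return for a query
# that selects a single variable ?name.
SPARQL_RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="name"/>
  </head>
  <results>
    <result>
      <binding name="name"><literal>Alice</literal></binding>
    </result>
    <result>
      <binding name="name"><literal>Bob</literal></binding>
    </result>
  </results>
</sparql>"""

root = ET.fromstring(SPARQL_RESULTS)

# Walk results -> result -> binding[@name='name'] -> literal.
names = [
    literal.text
    for literal in root.findall(
        "./sr:results/sr:result/sr:binding[@name='name']/sr:literal", NS
    )
]

print(names)
```

An XQuery processor would navigate the same document with a comparable path expression, since the result set is plain namespaced XML.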

XQuery

  1. XQuery is a W3C-standardized language for querying documents and data.
  2. XQuery operates on instances of the XPath and XQuery Data Model, so you can use XQuery to work with anything that can build a suitable model. This often includes relational databases and RDF triple stores.
  3. Objects in the data model and objects and values created by XQuery expressions have types as defined by W3C XML Schema.
  4. XQuery and XSLT both build on XPath; XSLT is an XML-syntax language which includes XPath expressions inside attributes and XQuery uses XPath syntax extended with more keywords.
  5. There are XQuery processors (sometimes called XQuery engines) that work inside relational databases, accessing the underlying store directly rather than going through SQL.
  6. There are also XML-native databases, and some XQuery engines just read files from the hard drive, from memory, or over the network.
  7. XQuery Update is a separate specification for making changes to data model instances.
  8. XPath and XQuery Full Text is a separate specification for full-text searching of XML documents or other data model instances.
  9. XQuery Scripting is a separate specification that adds procedural programming to XQuery, but it is currently not a final specification.
  10. The two most important building-blocks of XQuery are the FLWOR expression and functions.
  11. XQuery FLWOR stands for for-let-where-order by-return.
  12. User-defined functions can be recursive, and can be collected together along with user-defined read-only variables into separate library files called modules.
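
The FLWOR clauses and recursive functions listed above can be mimicked in Python to show how the pieces fit together. This is only a rough analogue, not XQuery itself: the sample data, the `flwor_titles` and `depth` names, and the XQuery text in the comment are illustrative assumptions, and the XQuery is shown for comparison without being executed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for the example.
BOOKS_XML = """<books>
  <book><title>XML Primer</title><price>15</price></book>
  <book><title>XQuery in Depth</title><price>45</price></book>
  <book><title>Data Modeling</title><price>30</price></book>
</books>"""

# The XQuery FLWOR expression being imitated (not run here):
#   for $b in /books/book
#   let $p := xs:decimal($b/price)
#   where $p > 20
#   order by $p
#   return $b/title/text()

def flwor_titles(root, min_price):
    """Python analogue of the FLWOR expression in the comment above."""
    selected = []
    for b in root.findall("book"):               # for: iterate over items
        p = float(b.findtext("price"))           # let: bind a variable
        if p > min_price:                        # where: filter
            selected.append((p, b.findtext("title")))
    selected.sort(key=lambda pair: pair[0])      # order by: sort on price
    return [title for _, title in selected]      # return: build the result

def depth(elem):
    """Recursive user-defined function: nesting depth of an element tree."""
    return 1 + max((depth(child) for child in elem), default=0)

root = ET.fromstring(BOOKS_XML)
print(flwor_titles(root, 20))
print(depth(root))
```

In XQuery the whole FLWOR is a single expression that yields a sequence, whereas this Python version spells each clause out as a separate statement.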