Building the Semantic Web on XML


by Peter F. Patel-Schneider and Jerome Simeon

Abstract

The semantic discontinuity between World Wide Web languages (e.g., XML, XML Schema, and XPath) and Semantic Web languages (e.g., RDF, RDFS, and DAML+OIL) forms a serious barrier to the stated goals of the Semantic Web. This discontinuity results from a difference in modeling foundations between XML and logics. We propose to eliminate it by creating a common semantic foundation for both the World Wide Web and the Semantic Web, taking ideas from both. The common foundation results in essentially no change to XML and only minor changes to RDF, yet it allows the Semantic Web to get closer to its goal of describing the semantics of the World Wide Web. Other Semantic Web languages (including RDFS and DAML+OIL) are considerably changed because of this common foundation.

Semantic Web Vision

  1. Bring structure to web pages
  2. Permit software agents to carry out sophisticated tasks for users
  3. Extend the current web

(Tim Berners-Lee, James Hendler, and Ora Lassila. "The Semantic Web". Scientific American, May 2001.)

Requirements for Semantic Web Languages

Form: The languages used in the Semantic Web need a well-defined syntax.

Meaning: The languages used in the Semantic Web need a well-defined semantics.

Semantic Web Tower

Semantic Web Tower (from Tim Berners-Lee)

Elements of the Semantic Web Tower

The Current Vision of the Semantic Web Tower

Rationale for the Current Vision

Problems with the Semantic Web Vision

  1. Disconnects at the Foundation
    • The XML meaning is not used, so data written in XML cannot be used in the Semantic Web.
    • XML Schema is not used in the Semantic Web languages.
  2. An Inadequate Basis
    • RDF is inadequate for providing either syntax or semantics for the entire Semantic Web.

A Disconnect at the Foundation

A Disconnect at the Foundation (po.xml extracts)

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
    </shipTo>
    <items>
        <item partNum="872-AA">
            <productName>Lawnmower</productName>
            <quantity>1</quantity>
            <USPrice>148.95</USPrice>
        </item>
        ...
    </items>
</purchaseOrder>

A Disconnect at the Foundation (po.xsd extracts)

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

 <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

 <xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
   <xsd:element name="shipTo" type="USAddress"/>
   <xsd:element name="billTo" type="USAddress"/>
   <xsd:element ref="comment" minOccurs="0"/>
   <xsd:element name="items"  type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
 </xsd:complexType>

 ...
</xsd:schema>  

RDF is not built on XML

Providing Meaning for Semantic Web Languages

Model-theoretic semantics is an excellent way of providing meaning.

RDF Model-Theoretic Semantics (heavily abstracted)

An RDF interpretation is a node- and edge-labelled graph

  1. labels are identifiers (not types)
  2. node labels are either URIs or strings
    • nodes with strings as labels have no outgoing edges
  3. edge labels are URIs
  4. there is no order in the graph
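
The four points above can be sketched as a small data structure. This is only an illustration of the abstraction on this slide, not part of the RDF specification: the names `Node` and `RDFGraph`, the `is_string` flag, and the `http://example.org/po#...` URIs are all assumptions made for the example, and the URI check is deliberately superficial.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    # Point 1: the label identifies the node, it is not a type.
    # Two Node objects with the same label compare equal.
    label: str
    # Point 2: a label is either a URI or a string (hypothetical flag).
    is_string: bool = False


@dataclass
class RDFGraph:
    # Point 4: edges live in a set, so the graph carries no order.
    edges: set = field(default_factory=set)

    def add_edge(self, source: Node, edge_label: str, target: Node) -> None:
        # Sub-bullet of point 2: string-labelled nodes have no outgoing edges.
        if source.is_string:
            raise ValueError("a string-labelled node cannot have outgoing edges")
        # Point 3: edge labels are URIs (only a rough check, by assumption).
        if ":" not in edge_label:
            raise ValueError("edge label should be a URI")
        self.edges.add((source, edge_label, target))


g = RDFGraph()
order = Node("http://example.org/po#order1")  # hypothetical URI
name = Node("Alice Smith", is_string=True)
g.add_edge(order, "http://example.org/po#shipToName", name)
# Adding the same labelled edge again changes nothing, because
# labels are identifiers and the edge set is unordered.
g.add_edge(order, "http://example.org/po#shipToName", Node("Alice Smith", is_string=True))
```

Because labels act as identifiers, stating the same edge twice leaves the graph with a single edge.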

RDF Model-Theoretic Semantics (heavily abstracted)

An RDF interpretation is a model of an RDF document if there is

XML Model-Theoretic Semantics (abstracted)

An XML interpretation is a node-labelled tree

  1. node labels indicate typing information (not identification)
  2. node labels are either QNames or strings
    • nodes with QName labels are either element nodes or attribute nodes
    • nodes with strings as labels have no outgoing edges
  3. there is a total order on the outgoing edges of each node
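
As a rough illustration of the tree described above, the sketch below turns an abridged version of the po.xml extract into a node-labelled, ordered tree. The function name `to_tree`, the `(kind, label, children)` tuple shape, and the choice to place attribute nodes before other children are assumptions made for this example; mixed content and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Abridged from the po.xml extract shown earlier.
PO_XML = """<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <city>Mill Valley</city>
    </shipTo>
</purchaseOrder>"""


def to_tree(elem):
    """Return (kind, label, children); children is an ordered list."""
    children = []
    # Attribute nodes carry a name label and a single string leaf.
    # Placing them first is an assumption of this sketch.
    for attr_name, value in elem.attrib.items():
        children.append(("attribute", attr_name, [("string", value, [])]))
    # Non-whitespace text directly inside the element becomes a string leaf,
    # which has no outgoing edges.
    if elem.text and elem.text.strip():
        children.append(("string", elem.text.strip(), []))
    # Child elements keep their document order (point 3).
    for child in elem:
        children.append(to_tree(child))
    return ("element", elem.tag, children)


tree = to_tree(ET.fromstring(PO_XML))
```

Here the label `name` says what kind of node it is, not which node it is: a second `shipTo` with its own `name` child would yield a distinct node carrying the same label.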

XML Model-Theoretic Semantics (abstracted)

An XML interpretation is a model of an XML document if there is

Example RDF Interpretation

Another Example RDF Interpretation

Example XML Interpretation

A New Foundation for the Semantic Web

Integrated Model-Theoretic Semantics (abstracted)

An interpretation is a six-tuple,

  1. R, a set of resources
  2. E, a set of relationships
  3. EXT, a mapping from relationships to pairs of resources or to pairs of a resource and a string
  4. CEXT, a mapping from resources to sets of resources
  5. O, a strict partial order on relationships
  6. S, a mapping from URIs to resources
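
One way to picture the six components is as a plain record. The sketch below is an assumption-laden illustration, not the authors' formalization: the class name `Interpretation`, the use of Python sets and dicts, the reading of EXT values as two-element tuples, and the sample resources and URI are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    R: set      # resources
    E: set      # relationships
    EXT: dict   # relationship -> pair, read here as (resource, resource)
                # or (resource, string)
    CEXT: dict  # resource -> set of resources
    O: set      # pairs (e1, e2) meaning e1 is ordered before e2
    S: dict     # URI -> resource

    def is_strict_partial_order(self) -> bool:
        """Check that O is irreflexive and transitive."""
        irreflexive = all(a != b for a, b in self.O)
        transitive = all(
            (a, d) in self.O
            for a, b in self.O
            for c, d in self.O
            if b == c
        )
        return irreflexive and transitive


# Hypothetical toy interpretation.
interp = Interpretation(
    R={"order1", "addr1", "USAddress"},
    E={"e1", "e2"},
    EXT={"e1": ("order1", "addr1"), "e2": ("addr1", "Alice Smith")},
    CEXT={"USAddress": {"addr1"}},
    O={("e1", "e2")},
    S={"http://example.org/po#order1": "order1"},
)
```

The order O is what lets the record keep XML-style ordering of relationships, while S supplies RDF-style identification of resources by URI.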

XML (and RDF) documents are processed into document graphs, which are like XML document graphs with RDF identifiers added.

Integrated Model-Theoretic Semantics (abstracted)

An RDF interpretation is a model of a document graph if there is a mapping N from the nodes of the graph to resources with

Example Interpretation

A New Foundation for the Semantic Web

Sources of Information

Status

A New Semantic Web Vision