W3C

RDF Model Theory

W3C Working Draft 28 August 2001

This version:
http://www.w3.org/TR/2001/WD-rdf-modeltheory-20010828
Latest version:
http://www.w3.org/TR/rdf-modeltheory
Authors:
Pat Hayes, IHMC

Abstract

@@Pat: Please write

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.

@@Pat: please write a paragraph

This document is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current public W3C Working Drafts can be found as part of the W3C Technical Reports and Publications.

This document has been produced by the RDF Core Working Group, part of the W3C Semantic Web Activity's standards development.

Please send comments on this document to www-rdf-comments@w3.org, the public forum for discussion of the W3C's work on RDF (public archive).

Table of Contents

@@generate

0. N-Triples and RDF Graphs

In what follows, 'the M&S' refers to the document Resource Description Framework (RDF) Model and Syntax Specification", 'RDFS spec' refers to the document Resource Description Framework (RDF) Schema Specification 1.0, and 'N-Triples' refers to the N-Triples specification revision 1.8, relevant parts of which are:

ntripleDoc ::= line *
line ::= ws * ( comment | triple ) ? eoln
triple ::= subject ws + predicate ws + object ws * '.' ws *
subject ::= uriref | namedNode
predicate ::= uriref
object ::= uriref | namedNode | literal

We understand N-Triples and RDF-XML as lexical notations for describing an RDF graph as defined in the M&S. An RDF graph is a partially labeled directed graph in which nodes are either anonymous (ie unlabelled), or else labeled with URIs or qLiterals; arcs are labeled with URIs; and distinct labelled nodes have different labels. The model theory assigns interpretations directly to the graph, which is taken as being the primary RDF syntax, in the sense that two RDF documents, in whatever lexical form, are syntactically equivalent if and only if they map to the same RDF graph. We will refer to this as the 'graph syntax' to avoid ambiguity, since the bare term 'syntax' is often assumed to refer to a lexicalization.

A graph which is like an RDF graph except that it has two or more nodes with the same label, will be called an untidy graph. Untidy graphs may arise as a result of merging operations or be intermediate states in various operations on graphs, but we will assume that all RDF graphs are tidy unless otherwise noted. In graph syntax, an untidy graph is considered to be syntactically ill-formed.

An RDF graph will be said to be ground if it contains no anonymous nodes. The vocabulary of a graph is the set of URIs that it contains.

A graph can be viewed as a set of triples corresponding to its node-arc-node edges; the correspondence between an ntripleDoc and a tidy RDF graph is that the graph has one node for each uriref, anonNode or qLiteral in the document, and one edge for each Ntriple triple, directed from the node labeled with the subject to the node labeled with the object and with the arc labeled with the predicate. Notice that this requires that all occurrences of a particular uriref or anonNode label in an N-Triples document be mapped to a single node in the corresponding graph.

An <anonNode> expression in an ntripleDoc serves to identify a particular anonymous node, but it is not considered to be a label of that node. (These terms play a role in an ntripleDoc analogous to that played by bound variables in a logical expression. They are local to the document and serve only to indicate a 'connection' between other expressions.)

Should probably say something about XML-RDF at this point.

Other RDF lexicalisations may use other means of indicating the graph structure; for our purposes, the important syntactic properties of RDF graphs is that distinct nodes in an RDF graph are treated as distinct referring entities in the graph syntax.

1. Interpretations

We assume that there is no restriction on what a resource can be and that there is no restriction on the domains and ranges of properties; in particular, a property may be applied to itself. (When classes are introduced in RDFS, we will assume that they can contain themselves. [Technical note. This assumption will not violate set-theoretic well-foundedness.])

The fact that two sets are given distinct labels does not imply that they are disjoint; we will explicitly state any disjointness or containment conditions as they arise. In the same spirit, the fact that one set is stated to be a subset of another should not be interpreted as saying that these sets cannot be identical, unless this is stated explicitly.

First, we assume a global set LV of literal values and a fixed mapping XL from the set of qLiterals to LV. All interpretations will be required to conform to XL on Literal nodes.

This leaves open the question of the exact nature of literals, their language-sensitivity and so on, while acknowledging their special status as expressions with a 'fixed' interpretation.

All interpretations will be relative to a set of URIs, called the vocabulary of the interpretation, so that one has to speak, strictly, of an interpretation of an RDF 'nodespace' or of an RDF graph, rather than of RDF itself. For a lexicalized version of the language, we can think of the vocabulary of an interpretation, more traditionally, to be a subset of the URI-indicating expressions used by that lexicalization; eg for N-Triples, we can consider the vocabulary of I to be a set of <urirefs>.

We do not take any position here on the way that node labels may be composed from other expressions, eg from relative URIs or Qnames; the model theory simply assumes that such lexical issues have been resolved in some way that is globally coherent, so that a single URI can be taken to have the same meaning wherever it occurs. Similarly, the model theory given here has no special provision for tracking temporal changes; it assumes, implicitly, that URIs have the same meaning whenever they occur. (If one wishes to apply RDF in a context where the referents of URIs (or names more generally) may change with time, then the current theory could be regarded as defining a 'snapshot' of the meaning of a changing representation.)

An interpretation I (of vocab(I)) is defined by:

1. A nonempty set IR of resources, called the domain or universe of I.

2. A subset IP of IR corresponding to properties

3. A mapping IEXT from IP into the powerset of IRx(IR union LV) (ie the set of sets of pairs <x,y> with x in IR and y in IR or LV)

4. A mapping IS: vocab(I) -> IR

IEXT(x) is a set of pairs, ie a binary relational extension, called the extension of x. This trick of distinguishing a relation as an object from its relational extension allows a property to occur in its own extension, a neat trick I learned from Chris Menzel.

The denotation I(E) of a ground RDF graph E in I is given by the following rules:

>> if E is a <qLiteral> then I(E) = XL(E)

>> if E is a <uriref> then I(E) = IS(E)

>> if E is an asserted triple with the form <s p o >

     then I(E) = true iff <Is),I(o)> is in  IEXT(I(p)), otherwise I(E)= false.

>> if E is a ground RDF graph then I(E) = false just in case I(E') = false for some asserted triple E' in E, otherwise I(E) = true.

The use of the phrase "asserted triple" is a deliberate weasel-worded artifact, to allow an RDF graph/document to contain triples which are being used for some non-assertional purpose (such as expressing a query, implementing part of some expression of another language, or forming part of a pattern intended to be matched against some asserted RDF in another document, or just being breezy:-). Strict conformity to the RDF 1.0 M&S assumes that all triples in a document are asserted triples, but making the distinction allows RDF parsers and inference engines to conform to the RDF syntax and to respect the RDF model theory without necessarily being fully committed to it. This also allows the model theory to be extended in future directions.

For example, for the vocabulary {a b c} the following is a small interpretation, where we use integers to indicate the 'things' in the universe:

IR= {1, 2}; IP = {1}

IEXT: 1->{<1,2>,<2,1>}

IS: a->1,b->1,c->2

which makes these triples true:

a b c

c a a

c b a

and these triples false:

a c b

a b b

c a c

For example, I(<a b c>) = true if <I(a),I(c)> is in IEXT(I(b)), ie if <1,2> is in IEXT(1), which is {<1,2>,<2,1>} and so does contain <1,2> and so I(<a bc>) is true.

I(<a c b>) = true if <I(a),I(c)>, ie <1,2> is in IEXT(I(c)); but I(c)=2 and IEXT is not defined on 2, so the condition fails and I(<a c b>) = false.

2. Anonymous nodes as existential assertions

We could treat anonymous nodes exactly like URIs, semantically speaking, by extending the IS mapping to include them as well as URIs. That would amount to adopting the view that an anonymous node is equivalent to a node with an unknown label. However, it seems to be more in conformance with the M&S to treat anonymous nodes as existentially bound variables. This will require some definitions, as the theory so far provides no meaning for anonymous nodes.

Suppose I is an interpretation and A is a mapping from some set of anonymous nodes to the domain of I, and define I+A to be an extended interpretation which is like I except that it uses A to give the interpretation of anonymous nodes. Define anon(E) to be the set of anonymous nodes in E. Then we can extend the above rules to include the two new cases that are introduced when anonymous nodes occur in the graph:

>> If E is an anonymous node than [I+A](E) = A(E)

>> If E is an RDF graph then I(E) = true if [I+B](E) = true for some B defined on anon(E), otherwise I(E)= false.

This effectively treats all anonymous nodes as existentially quantified in the RDF graph in which they occur. Notice that since two nodes cannot have the same label, there is no need to specify the 'scope' of the quantifier within a graph. (However, it is local to the graph.) If we were to apply the semantics directly to N-Triples syntax, we would need to indicate the quantifier scope, since in this lexicalisation syntax the same anonNode identifier may occur several times. The above rule amounts to the N-triple convention that would place the quantifiers just outside, ie at the outer edge of, the <ntripleDoc>ument corresponding to the graph.

Notice that we have not changed the definition of an interpretation. The same interpretation that provides a truth-value for ground graphs also assigns truth-values to graphs with anonymous nodes, even though it provides no interpretation for the anonymous nodes themselves. Notice also that the anonymous nodes themselves are, qua nodes, perfectly well-defined entities with a robust notion of identity; they differ from other nodes only in not being assigned a direct model-theoretic interpretation, which means that they have no 'global' meaning (ie outside the graph in which they occur).

2a. Comparison with formal logic

( Readers unfamiliar with formal logic can skip this section. )

With this semantics, an RDF graph can be translated directly into a logical expression with the same meaning.

Each node labeled with a URI translates to a logical constant which is its label ; each anonymous node to a distinct variable. Each arc (triple) labeled with <s p o> maps to an atomic assertion of the form p(s, o) (or (p s o) in KIF syntax); a graph translates to the the existential closure of the conjunction of all the arcs in the graph.

For example, the graph defined by the triples

_:x a b

c b _:x

_:x a c

translates to the logical expression (written in KIF syntax)

(exists (?y)(and (a ?y b)(b c ?y)(a c ?y)))

Notice that the resulting expression may contain the same symbol in both relation and object positions (eg 'b' in this example). This is considered syntactically illegal in many versions of logic, but is acceptable in ISO-KIF (Hayes & Menzel 2001). To map to a more conventional logical syntax, translate the triple <s p o> into the form

(holds p s o)

where 'holds' is a three-place relation used to assert that a binary relation holds between two arguments. The above example would then map to the expression:

(exists (?y)(and (holds a ?y b)(holds b c ?y)(holds a c ?y)))

3. RDFS interpretations

RDF Schema (RDFS) introduces a special RDF vocabulary, with some fixed constraints on its interpretation; in particular, the notion of a 'class'. We can define all of RDFS in terms of a single RDF property rdf:type which relates entities to the classes which contain them. Following conventional usage, we understand a classification into a 'type' to mean membership in a set; the sets will be considered to be the extensions of resources called 'classes', using the same technique used earlier to relate properties to their extensions. RDFS provides names for several of the classes whose extensions we have already used in defining the model theory (IR, LV, IP) and a general naming scheme to allow user-defined classes. The one new class is the class of all classes, whose extension - the set of all classes - will be called IC. It will be convenient to define a mapping ICEXT (for the Class Extension in I) from classes to their extensions, in terms of the relational extension of rdf:type, as follows:

ICEXT(x) = {y | <y,x> is in IEXT(I(rdf:type)) }

Ie the class extension of x is the set of things that are related to x by the property rdf:type, ie which have the class x as their 'type'. This definition is re-stated below as the semantic rule for 'rdf:type'.

An rdfs interpretation is then an RDF interpretation plus one extra subset of the universe, forming a fifth condition on an interpretation:

5. A subset IC of IR, containing classes

[Technical comment. This use of a class extension mapping allows classes to contain themselves. For example, it is quite OK for a 'universal' class to contain itself as a member, a convention that is often adopted at the top of a classification hierarchy. This convention is often thought unacceptable since it seems to violate one of the axioms of standard (Zermelo-Fraenkel) set theory, the axiom of foundation, which forbids infinitely descending chains of membership. However, this impression is mistaken. The semantic model given here distinguishes between the class as an object and its class extension - the set of things that are 'in' the class - thereby allowing the extension of a class to contain the class itself without violating the axiom of foundation. (If an extension contained itself then the axiom would be violated, but that case never arises.) In particular, the class of all classes - here, I(rdfs:Class) - may be in the set of all classes - ICEXT(I(rdfs:Class)) - which would make the triple <rdfs:Class rdf:type rdfs:Class> true in the interpretation I.

Notice that the question of whether or not a class contains itself as a member is quite different from the question of whether or not it is a subclass of itself. ]

An rdfs interpretation I satisfies the following conditions, in addition to those for an rdf interpretation given earlier. (The complete model theory is summarized in a table in an appendix.)

>> ICEXT(I(rdfs:Resource)) = IR

>> ICEXT(I(rdf:Property)) = IP

>> ICEXT(I(rdfs:Class)) = IC

>> ICEXT(I(rdfs:Literal)) = LV

>> <x,y> is in IEXT(I(rdf:type)) iff x is in ICEXT(y)

>> <x,y> is in IEXT(I(rdfs:subClassOf)) iff ICEXT(x) is a proper subset of ICEXT(y)

>> <x,y> is in IEXT(rdfs:subPropertyOf)) iff IEXT(x) is a proper subset of IEXT(y)

The subsets referred to in subPropertyOf are sets of pairs.

The M&S seems to imply that subClass and subProperty extensions are proper subsets, ie the identity case is disallowed, and hence 'subclass loops' are illegal. If we decide to change this to allow 'subclass loops', as assumed by DAML+OIL, then "proper" should be deleted from these conditions.

>> I(rdfs:ConstraintResource) is in IC

>> ICEXT(I(rdfs:ConstraintProperty)) is the intersection of ICEXT(I(rdfs:ConstraintResource)) and IP

>> I(rdf:range) and I(rdf:domain) are in ICEXT(I(rdfs:ConstraintProperty))

>> If <x,y> is in IEXT(I(rdf:range)) and <u,v> is in IEXT(x) then v is in ICEXT(y)

>> If <x,y> is in IEXT(I(rdf:domain)) and <u,v> is in IEXT(x) then u is in ICEXT(y)

This reflects our current understanding of multiple domain and range restrictions rather than the wording in the M&S.

(The model theory imposes no conditions on the interpretations of rdfs:seeAlso, rdfs:isDefinedBy, rdfs:comment or rdfs:label.)

4. Reification

RDF uses rdfs to define two important extensions to the basic language: reification (which allows RDF to make assertions about RDF statements) and the introduction of classes of containers. These impose extra conditions on interpretations, requiring their domains to include particular classes of structures.

We will say that an interpretation I reifies a set of nodes V if it is capable of describing all RDF triples that can be constructed from those nodes. (The RDF M&S does not provide any vocabulary for describing graphs.) When considering lexical forms of the language, it would be natural to think of V as a set of lexical labels which identify nodes, eg a subset of( <uriref> union <qLiteral> union <anonNode>) in N-Triples. For lexicalisations such as XML-RDF which do not provide lexical identifiers corresponding to all anonymous nodes, V must be allowed to contain arbitrary numbers of 'blank' node tokens as well as URIs and literals.)

In order to be reified, the interpretation must contain in its universe entities corresponding to all the nodes (or names) in V and all the triples that can be constructed from them. The reification conditions can then be stated as follows:

>> (V union VxVxV ) is a subset of IR

>> <x,y> is in IEXT(I(rdf:subject)) iff for some a, b, and c in V, x=<a b c>and y= a

>> <x,y> is in IEXT(I(rdf:predicate)) iff for some a, b and c in V, x=<a b c>and y= b

>> <x,y> is in IEXT(I(rdf:object)) iff for some a, b and c in V, x=<a b c>and y= c

>> x is in ICEXT(I(rdf:Statement)) iff for some a, b and c in V, x=<a b c>

Reification is fairly simple, in this account. However, notice that having x (or REIF(x) ) in the domain does not in itself provide any access to the meaning of x in the interpretation; in particular, it provides no way to assert x. This is illustrated by fact that none of the above equations mention I(x) when x is in IR. The language provides no way to *use* (as opposed to mention) a reified triple. Adding such an ability (sometimes called 'downward reflection' or 'dequoting') would considerably increase the expressive power of RDF; it would in effect provide the language with a form of truth predicate. On the other hand, such devices need to be treated with considerable care in order to avoid self-referential paradoxes.

5. Containers

RDFS provides a general class of 'containers' and three particular kinds of container: bags, sequences and 'alternatives'. Alternatives require special treatment, and will be described in the next section. (We note that the 'distributive reference' mechanisms which use rdf:aboutEach, described in M&S section 3.3, are part of the XML-RDF serialization rather than the RDF graph syntax, so are not considered here. It would be relatively easy to include them, however.)

Bags and sequences are very similar except that bags are considered to be unordered. The natural way to describe this mathematically is to think of bags as sequences with an equivalence relation defined on them, and identify a bag with an equivalence class of sequences. This amounts to saying that bag identity can be checked by allowing permutations, but otherwise a bag is much like a sequence: in particular, the members of a particular bag do in fact occur in a particular order in the bag, even though that order is not considered germane to questions of bag identity. RDFS provides essentially the same machinery for accessing the elements of bags and of sequences, for example. For all these reasons, we will take an abstract notion of sequence, called here a series, to be the 'basic' container type for RDF/S. A series (over S) is defined to be a partial function from the integers to S. If s is a series then s(n) is the n-th element of s; if s(n) is undefined then we will write s(n) = ??. ( '??' can be taken to be a value, but it must be a special value which is not in the universe IR of any interpretation.)

We will assume that there is a distinguished class of things called containers with subclasses called bags and sequences (and alternatives), and that containers are series on IR. Again, this allows such apparent oddities as containers which contain themselves, or two containers each of which contains the other. The set of containers-series will be called IS.

There is a special class of container membership properties denoted by 'rdf:_1', 'rdf:_2', etc., each of which selects a member from a position in a container. In a familiar abuse of notation, we will write 'rdf:_n' to refer to these, where 'n' is a parameter ranging over the natural numbers/numerals. In the graph syntax, membership of something in a container is asserted by a triple which uses a container membership property.

>> ICEXT(I(rdfs:Container)) = IS a subset of IR

>> ICEXT(I(rdf:Bag)) = IBAG, a subset of IS

>> ICEXT(I(rdf:Seq)) = ISEQ, a subset of IS disjoint from IBAG

>> ICEXT(I(rdfs:ContainerMembershipProperty)) = CMIP, a subset of IP

>> I(rdf:_n) is in CMIP

>> <x,y> is in IEXT(I(rdf:_n)) iff x is in IS and x(n)=y

The last condition - really a countable infinity of conditions - is the heart of the 'container' notion, which can be regarded as to do with the relational extension of container membership properties in much the same way that rdfs:Class has to do with the relational extension of rdf:type.

This model theory makes no further assumptions about bags and sequences. Although the model theory does not need to assume this, it seems natural to assume that the series corresponding to RDF sequences are finite, ie that there is some integer length(s) such that for any n > length(s), s(n) = ??. What is less clear is whether they should be allowed to have 'gaps', ie whether s(n) = ?? for any n <= length(s). The M&S is not entirely clear on this question, so it is left open in the model theory. (The issue surfaces when an RDF graph gives only partial information about a container, e.g. consider whether or not the RDF graph

xxx rdf:type rdf:Bag

xxx rdf:_1 aaa

xxx rdf:_3 ccc

should entail

xxx rdf:_2 _:something

This issue arises very sharply when considering alternative containers, as we will see.)

The model theory as written also provides no formal account of what makes a bag different from a sequence, other than insisting that they are different. One way to add such an account would be to impose a restriction on the subset ICEXT(I(rdf:Bag)), so that it is not permitted to contain two series which are permutations of each other. This restriction would not alter the truth-values of any piece of RDF syntax, however, so it is omitted from the formal model theory.

The RDF M&S provides a number of examples of 'intended uses' for containers, particularly bags, which seem to involve extra assumptions about the meanings of expressions involving bags; for example, the observation that a triple whose first element is a bag could be considered to be as assertion about the bag, or about the elements of the bag, depending on the property being applied. As stated here, the model theory only supports interpretations in which containers are 'opaque' objects, so that assertions involving containers are about those containers, rather than being understood 'transparently' to be asserting anything about the members of the containers. The model theory does not prohibit transparent interpretations of bags and sequences, but such interpretations would require extra semantic conditions which are not stated here. Treating containers as 'transparent' has quite extensive consequences, requiring the notions of satisfiability and entailment to be re-considered. Both these points are illustrated in the next section.

6. Alternatives

Alternatives are transparent containers which have a special disjunctive intepretation. When a property is asserted of an alternative, it means that the property holds of one or more of the elements of the alternative. In effect, such an assertion is a disjunction of the assertions which would be got by asserting that the property holds of each of the elements seperately.

This is the only way of encoding a disjunction in RDF, and complicates the theory, because a disjunction becomes weaker as extra conditions are added to it, so the 'existential/conjunctive' flavor of the rest of RDF, where adding more triples to a graph always increases the information content of the graph, is now changed.

We will treat alternatives as a class of container, following the M&S, and will add the disjunctive interpretation as an extra semantic condition.

>> ICEXT(I(rdf:Alt)) = a subset IALT of IS disjoint from

ICEXT(I(rdf:Bag)) union ICEXT(I(rdf:Seq))

>> if x is in IALT then for any z in IP, <x y> is in IEXT(z) iff for some n, <x(n) y> is in IEXT(z)

This last condition is different in kind from the rest of the model theory, and illustrates the kind of machinery which is required to give a 'transparent' reading to a container.

6.1 Alterative problems: discussion

This semantics refers the meaning of a property ascription to an alternative container to the members of the container as prescribed by the M&S. However, it does not address the issue of how exactly the members of the container are specified.

If we assume that an RDF graph specifies membership by the presence of triples of the form <[alt], rdf:_n, [thing]>, then unless there is some way to assert that a certain set of such triples exhausts the membership of an alternative, it is impossible to really know what an RDF graph containing an alternative actually asserts, since the container may contain a member which is not mentioned in the graph, and this 'unknown' member may make any property true of the alternative even if the property is false of all the members which are mentioned in the graph.

For example, consider a graph defined by the following triples:

xxx type rdf:Alt

xxx rdf:_1 aaa

xxx rdf:_2 bbb

xxx foo baz

With the current model theory, this says that the value of foo is baz for some thing in the Alt container, which might be aaa, bbb or something else not mentioned in the graph. It does not, therefore, follow that one of <aaa foo baz> or <bbb foo baz> is true; they might both be false, and still <xxx foo baz> could be true; so asserting this property of the alternative container basically says nothing at all about the value of the property applied to the members of the container mentioned in the graph itself.

There are several possible ways around this problem, but they all require some change to RDF.

We could impose a 'closed-world' condition on membership in an alternative, but that would make the entire language non-monotonic.If we impose a closed-world assumption on alternatives, by assuming that their members are precisely the ones mentioned in the graph and no others, then this graph implicitly assumes that there are no other members, so the example given above asserts that either aaa foo baz or bbb foo baz, as one might expect it to; however, if one were to then add the triple

xxx rdf:_3 ccc

to the graph, then the result would not be entailed by the original graph, since it would then assert something weaker, viz. that either <aaa foo baz>, <bbb foo baz>, or <ccc foo baz>. Since adding a triple to a graph is thought of as preserving the meaning of the triples already present, this makes the language non-monotonic in the sense that adding information to a graph (in this case, information about the membership of a container) renders some of the existing information invalid, so that any conclusion drawn from the graph cannot safely be considered to be entailed by the new, expanded, graph.

Another possible way around the problem would be to insist that alternatives are not containers in the usual sense, but are in fact disjunctions, and think of them as syntactic extensions to the RDF language itself. On this view, the addition of the fifth triple to the above graph would be a change to the logical syntax. However, this does not seem to be in the spirit of the original RDF design, and would require the entire model theory to be re-stated.

A rather less drastic change would be to introduce some way to explicitly assert that a container has a certain number of members, so that the appropriate way to express the intended meaning of the example above would be something like this:

xxx type rdf:Alt

xxx rdf:_1 aaa

xxx rdf:_2 bbb

xxx rdf:size 2

xxx foo baz

where the 'rdf:size' property indicates that the container xxx has precisely two elements, so that any mention of a third element of xxx would produce an inconsistency.

Since none of these approaches is currently use in RDF, and since the only plausible interpretation of rdf:Alt makes any assertion about an alternative essentially vacuous, we offer this semantics for rdf:Alt only to illustrate the problem, not as a recommendation.

Appendix 1: Summary of model theory

RDF

An interpretation I (of vocab(I)) is defined by:

1. A nonempty set IR of resources, called the domain or universe of I.

2. A subset IP of IR corresponding to properties

3. A mapping IEXT from IP into the powerset of IRx(IR union LV)

4. A mapping IS: vocab(I) -> IR

Suppose A is a mapping from some set of anonNodes to the domain of I. anon(E) is the set of anonymous nodes in the graph E.

>> if E is a <qLiteral> then I(E) = XL(E)

>> if E is a <uriref> then I(E) = IS(E)

>> if E is an asserted triple <s p o >

     then I(E) = true iff <I(s),I(o)> is in IEXT(I(p)), otherwise I(E)= false.

>> if E is a ground RDF graph then I(E) = false just in case I(E') = false for some asserted triple E' in E, otherwise I(E) = true.

>> if E is an anonymous node than [I+A](E) = A(E)

>> if E is an RDF graph then I(E) = true if [I+B](E) = true for some B defined on anon(E), otherwise I(E)= false.

RDFS

5. A subset IC of IR, containing classes

6. A mapping ICEXT from IC to the powerset of (IR union LV)

>> ICEXT(I(rdfs:Resource)) = IR

>> ICEXT(I(rdf:Property)) = IP

>> ICEXT(I(rdfs:Class)) = IC

>> ICEXT(I(rdfs:Literal)) = LV

>> <x,y> is in IEXT(I(rdf:type)) iff x is in ICEXT(y)

>> <x,y> is in IEXT(I(rdfs:subClassOf)) iff ICEXT(x) is a proper subset of ICEXT(y)

>> <x,y> is in IEXT(rdfs:subPropertyOf)) iff IEXT(x) is a proper subset of IEXT(y)

>> I(rdfs:ConstraintResource) is in IC

>> ICEXT(I(rdfs:ConstraintProperty)) is the intersection of IP and ICEXT(I(rdfs:ConstraintResource))

>> I(rdf:range) and I(rdf:domain) are in ICEXT(I(rdfs:ConstraintProperty))

>> If <x,y> is in IEXT(I(rdf:range)) and <u,v> is in IEXT(x) then v is in ICEXT(y)

>> If <x,y> is in IEXT(I(rdf:domain)) and <u,v> is in IEXT(x) then u is in ICEXT(y)

REIFICATION

For some vocabulary V, (V union VxVxV ) is a subset of IR

>> <x,y> is in IEXT(I(rdf:subject)) iff for some a, b, and c in V, x=<a b c> and y= a

>> <x,y> is in IEXT(I(rdf:predicate)) iff for some a, b and c in V, x=<a b c> and y= b

>> <x,y> is in IEXT(I(rdf:object)) iff for some a, b and c in V, x=<a b c> and y= c

>> x is in ICEXT(I(rdf:Statement)) iff for some a, b and c in V, x=<a b c>

BAGS AND SEQUENCES

IS is a set of series over IR. IS is a subset of IR.

>> ICEXT(I(rdfs:Container)) = IS

>> ICEXT(I(rdf:Bag)) = a subset of IS disjoint from ICEXT(I(rdf:Seq))

>> ICEXT(I(rdf:Seq)) = a subset of IS disjoint from ICEXT(I(rdf:Bag))

>> ICEXT(I(rdfs:ContainerMembershipProperty)) = CMIP, a subset of IP

>> I(rdf:_n) is in CMIP

>> <x,y> is in IEXT(I(rdf:_n)) iff x is in IS and x(n)=y

ALTERNATIVES

>> ICEXT(I(rdf:Alt)) = a subset IALT of IS disjoint from

ICEXT(I(rdf:Bag)) union ICEXT(I(rdf:Seq))

>> if x is in IALT then for any z in IP and y in (IR union LV), <x, y> is in IEXT(z) iff for some n, <x(n), y> is in IEXT(z)

RDF/RDFS model theory summary

0. Domains and mappings of interpretation I

vocab(I) set of URIs ; LV set of literal values (global) ; IR set of resources (universe);

IP subset of IR (properties) ; IC: subset of IR (classes).

XL: literals -> LV; IS: vocab(I) -> IR; IEXT: IP -> subsets of (IR x (IR union LV));

ICEXT: IC -> subsets ofIR

1. Basic equations

E is

I(E) is

A literal (qLiteral)

XL(E)

A uri (uriref)

IS(E)

An asserted triple <s p o>

true if <I(s), I(o)> is in IEXT(I(p)), otherwise false

Any other triple

not defined

A ground RDF graph

false if I(E') =false for any asserted triple E' in E,

otherwise true

An anonymous node (anonNode)

not defined ; but [I+A](E) = A(E)

An RDF graph

true if [I+A](E) = true for some A: anon(E) -> IR,

otherwise false.

2. Class extensions

E is

I(E) is in IC; ICEXT(I(E)) is

rdfs:Resource

IR     (The universe of the interpretation)

rdf:Property

IP     (Properties; subset of IR, domain of IEXT)

rdfs:Class

IC     (Classes; subset of IR, domain of ICEXT)

rdfs:Literal

LV    (Literal values)

rdfs:ConstraintResource

CR    (subset of IR)

rdfs:ConstraintProperty

CP = IP intersect CR

3. Property extensions

E is

I(E) is in IP; <x,y> is in IEXT(I(E)) iff

rdf:type

x is in ICEXT(y)

rdfs:subClassOf

ICEXT(x) is a proper subset of ICEXT(y)

rdfs:subPropertyOf

IEXT(x) is a proper subset of IEXT(y)

E is

I(E) is in CP; <x,y> is in IEXT(I(E)) iff

rdfs:range

if <u,v> is in IEXT(x) then v is in ICEXT(y)

rdfs:domain

if <u,v> is in IEXT(x) then u is in ICEXT(y)

4. Reification

V is a set of nodes (URIs and/or anonymous nodes and/or literals)

(V union (VxVxV) ) is a subset of IR

E is

I(E) is in IC; x is in ICEXT(I(E)) iff

rdf:Statement

x= <a b c> where a, b, c are in V

E is

I(E) is in IP; <x,y> is in IEXT(I(E)) iff

rdf:subject

x= <a b c> and y=a where a, b, c are in V

rdf:predicate

x= <a b c> and y=b where a, b, c are in V

rdf:object

x= <a b c> and y=c where a, b, c are in V

5. Containers (Bags and sequences)

IS is a set of series over IR

IS is a subset of IR

E is

I(E) is in IC; ICEXT(I(E)) is

rdfs:Container

IS

rdf:Bag

a subset IBAG of IS

rdf:Seq

a subset ISEQ of IS disjoint from IBAG

rdfs:ContainerMembershipProperty

CMIP, a subset of IP

E is

I(E) is in CMIP; <x,y> is in IEXT(I(E)) iff

rdf:_n

x is in IS and x(n) =y

5a. Containers (Alternatives)

E is

I(E) is in IC; ICEXT(I(E)) is

rdf:Alt

a subset IALT of IS disjoint from ISEQ and IBAG

If x is in IALT then for any z in IP and y in (IR union LV), <x, y> is in IEXT(z) iff for some n, <x(n), y> is in IEXT(z)