Tabulator RDF Parser

aurimasv edited this page Jun 24, 2012 · 4 revisions

Sample RDF

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://example.org/packages/vocab#"> 
 
 <rdf:Description rdf:about="http://example.org/packages/X11">
    <s:DistributionSite>
       <rdf:Alt>
          <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag"/> 
          <rdf:_2 rdf:resource="ftp://ftp.example.org"/>
          <rdf:_2 rdf:resource="ftp://ftp1.example.org"/>
          <rdf:_5 rdf:resource="ftp://ftp2.example.org"/>
      </rdf:Alt>
    </s:DistributionSite>
 </rdf:Description>
</rdf:RDF>

Node Types

Each node (except for Empty) defines:

toString, toNT, and hashString methods for serialization. All three are very similar, but they do not always return the same string.

classOrder is a numeric value used for comparison. Each node has a unique value.

termType corresponds to the name of the node type.

sameTerm returns a boolean indicating whether the two terms are the same.

compareTerm compares the node to another node of the same type. Returns -1, +1, or 0, similarly to String.localeCompare.
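As a rough illustration of how classOrder and compareTerm could work together when sorting mixed node types, here is a minimal sketch. It assumes classOrder decides the order between different node types and compareTerm decides the order within one type. The helper names (`makeNode`, `compareNodes`) and the classOrder values are made up for the example and are not taken from the parser.

```javascript
// Illustrative only: the classOrder values and helper names are hypothetical.
function makeNode(termType, classOrder, key) {
  return {
    termType: termType,
    classOrder: classOrder,
    key: key,
    // Orders nodes of the same type by their key; returns -1, 0, or +1.
    compareTerm: function (other) {
      if (this.key < other.key) return -1;
      if (this.key > other.key) return 1;
      return 0;
    }
  };
}

// Assumption: different node types are ordered by classOrder first.
function compareNodes(a, b) {
  if (a.classOrder !== b.classOrder) {
    return a.classOrder < b.classOrder ? -1 : 1;
  }
  return a.compareTerm(b);
}

var nodes = [
  makeNode("symbol", 2, "b"),
  makeNode("literal", 1, "z"),
  makeNode("symbol", 2, "a")
];
nodes.sort(compareNodes);

var sortedLabels = nodes.map(function (n) {
  return n.termType + ":" + n.key;
}).join(" ");
console.log(sortedLabels); // literal:z symbol:a symbol:b
```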

Empty

termType = "empty"

Serializes to ()

Nothing else defined.

(It is unclear whether this node type is actually used.)

Symbol(uri)

Holds a URI identifying a resource, for example a predicate.

termType = "symbol"

Serializes to <uri>

sameTerm and compareTerm compare by uri
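The Symbol behaviour listed above can be sketched as follows. `SketchSymbol` is a hypothetical stand-in written for this page, not the parser's source, and it assumes all three serialization methods return the same `<uri>` string.

```javascript
// Illustrative only: a stand-in for the Symbol node described above.
function SketchSymbol(uri) {
  this.uri = uri;
  this.termType = "symbol";
}

// Assumption: all three serializations produce <uri>.
SketchSymbol.prototype.toNT = function () {
  return "<" + this.uri + ">";
};
SketchSymbol.prototype.toString = SketchSymbol.prototype.toNT;
SketchSymbol.prototype.hashString = SketchSymbol.prototype.toNT;

// Two symbols are the same term when their URIs match.
SketchSymbol.prototype.sameTerm = function (other) {
  return !!other && other.termType === this.termType && other.uri === this.uri;
};

// Returns -1, 0, or +1, ordering by URI.
SketchSymbol.prototype.compareTerm = function (other) {
  if (this.uri < other.uri) return -1;
  if (this.uri > other.uri) return 1;
  return 0;
};

var pkg = new SketchSymbol("http://example.org/packages/X11");
var prop = new SketchSymbol("http://example.org/packages/vocab#DistributionSite");
console.log(pkg.toNT());            // <http://example.org/packages/X11>
console.log(pkg.sameTerm(prop));    // false
console.log(pkg.compareTerm(prop)); // -1, because "X" sorts before "v"
```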

BlankNode(id)

termType = "bnode"

id holds a unique integer id for each blank node

value = id

Serializes to "_n" + id

sameTerm and compareTerm compare by id

copy: returns a copy of the node

Literal(value, lang, datatype)

termType = "literal"

value, lang, datatype from constructor

toString = value.toString()

toNT = "value"^^datatype@lang //after escaping backslashes, newlines, and double-quotes in value

hashString = toNT

sameTerm compares by value, lang, and datatype

compareTerm compares by value

copy: returns a copy of the node

datatype = from constructor
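A minimal sketch of the toNT layout described above. `literalToNT` is a hypothetical helper, and it assumes the datatype is passed in already serialized (for example as `<uri>`).

```javascript
// Illustrative only: serializes a literal following the toNT layout above.
function literalToNT(value, lang, datatypeNT) {
  var str = String(value)
    .replace(/\\/g, "\\\\") // escape backslashes first
    .replace(/"/g, '\\"')   // then double quotes
    .replace(/\n/g, "\\n"); // then newlines
  str = '"' + str + '"';
  // Assumption: datatypeNT is already in serialized form, e.g. <uri>.
  if (datatypeNT) str += "^^" + datatypeNT;
  if (lang) str += "@" + lang;
  return str;
}

console.log(literalToNT("chat", "fr")); // "chat"@fr
console.log(literalToNT(42, undefined, "<http://www.w3.org/2001/XMLSchema#integer>"));
console.log(literalToNT('say "hi"'));   // "say \"hi\""
```

Backslashes are escaped before quotes and newlines so that the backslashes introduced by the later replacements are not doubled.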

Collection

termType = "collection"

id = unique integer id, allocated the same way as for BlankNode. Ids are unique across both node types.

elements holds child elements

toString = '(' + elements.join(' ') + ')'

toNT = "_n" + id

hashString = toNT

append and unshift add an element to the end and to the front of the elements array, respectively

shift = elements.shift()

closed = false

close: closed = true

sameTerm and compareTerm compare by id
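The Collection members above map closely onto a JavaScript array. The sketch below is a hypothetical stand-in (`SketchCollection`, with a made-up `nextSketchId` counter), not the parser's own class.

```javascript
// Illustrative only: a list node mirroring the Collection members above.
var nextSketchId = 0; // hypothetical id counter

function SketchCollection() {
  this.id = nextSketchId++;
  this.termType = "collection";
  this.elements = [];
  this.closed = false;
}
SketchCollection.prototype.append = function (el) { this.elements.push(el); };
SketchCollection.prototype.unshift = function (el) { this.elements.unshift(el); };
SketchCollection.prototype.shift = function () { return this.elements.shift(); };
SketchCollection.prototype.close = function () { this.closed = true; };
SketchCollection.prototype.toString = function () {
  return "(" + this.elements.join(" ") + ")";
};

var list = new SketchCollection();
list.append("b");
list.append("c");
list.unshift("a");
console.log(list.toString()); // (a b c)
var first = list.shift();
console.log(first);           // a
list.close();
console.log(list.closed);     // true
```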

Statement

Represents an RDF statement

constructor(subject, predicate, object, why), where why refers to provenance or inference. subject, predicate, and object are referred to as s, p, and o respectively. s, p, and o are converted into nodes via term (see accessory functions).

toNT = s.toNT() + ' ' + p.toNT() + ' ' + o.toNT()

toString = toNT
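A short sketch of the serialization described above. `SketchStatement` and the `node` helper are hypothetical; nodes are reduced to objects exposing toNT(), and the literal object value is chosen only for illustration.

```javascript
// Illustrative only: nodes are reduced to objects exposing toNT().
function node(nt) {
  return { toNT: function () { return nt; } };
}

function SketchStatement(subject, predicate, object, why) {
  this.subject = subject;
  this.predicate = predicate;
  this.object = object;
  this.why = why; // provenance; not part of the serialization
}
SketchStatement.prototype.toNT = function () {
  return this.subject.toNT() + " " + this.predicate.toNT() + " " + this.object.toNT();
};
SketchStatement.prototype.toString = SketchStatement.prototype.toNT;

var statement = new SketchStatement(
  node("<http://example.org/packages/X11>"),
  node("<http://example.org/packages/vocab#DistributionSite>"),
  node('"ftp.example.org"')
);
console.log(String(statement));
```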

Variable(rel)

Variables are placeholders used in patterns to be matched.

termType = "variable"

base = "varid:"

uri = base + rel //unless rel is an absolute URI, then uri = rel

toNT = '?' + rel

hashString = toString = toNT

sameTerm compares by uri
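The uri rule above can be sketched like this. `SketchVariable` is hypothetical, and the test for an absolute URI (a leading scheme such as `http:`) is an assumption; the parser may decide this differently.

```javascript
// Illustrative only: a stand-in for the Variable node described above.
function SketchVariable(rel) {
  this.termType = "variable";
  this.base = "varid:";
  this.rel = rel;
  // Assumption: "absolute" means the string starts with a URI scheme.
  var isAbsolute = /^[A-Za-z][A-Za-z0-9+.-]*:/.test(rel);
  this.uri = isAbsolute ? rel : this.base + rel;
}
SketchVariable.prototype.toNT = function () { return "?" + this.rel; };
SketchVariable.prototype.toString = SketchVariable.prototype.toNT;
SketchVariable.prototype.hashString = SketchVariable.prototype.toNT;
SketchVariable.prototype.sameTerm = function (other) {
  return !!other && other.termType === this.termType && other.uri === this.uri;
};

var x = new SketchVariable("x");
var abs = new SketchVariable("http://example.org/vars/y");
console.log(x.uri);    // varid:x
console.log(abs.uri);  // http://example.org/vars/y
console.log(x.toNT()); // ?x
```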

The Data Store

All RDF statements are stored in a "data store", or just "store". It is represented in code by the IndexedFormula class.

Formula

Superclass of IndexedFormula

termType = "formula"

statements stores all the statements belonging to the formula

toNT = '{' + statements.join('\n') + '}'

hashString = toString = toNT

sameTerm compares by hashString

add pushes new statement to the statements array

fromNT converts a string in NT notation to RDF nodes

For the following functions, at least one argument should be undefined

each(s, p, o, w) Returns the values of the undefined argument from all statements that match the remaining arguments

any(s, p, o, w) Returns the value of the undefined argument from the first statement that matches the other arguments

the(s, p, o, w) Equivalent to any, except logs an error if any returns undefined

holds(s, p, o, w) Returns true if there is a statement that matches the supplied arguments

whether(s, p, o, w) Returns the number of statements matching supplied arguments
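The query functions above can be sketched over a plain array. `SketchFormula` and `pick` are hypothetical, terms are plain strings rather than node objects, and the sketch assumes the value returned is taken from the first of s, p, o that was left undefined.

```javascript
// Illustrative only: statements are plain objects and terms are plain strings.
function SketchFormula() {
  this.statements = [];
}
SketchFormula.prototype.add = function (s, p, o, w) {
  this.statements.push({ subject: s, predicate: p, object: o, why: w });
};
// Undefined arguments act as wildcards.
SketchFormula.prototype.statementsMatching = function (s, p, o, w) {
  return this.statements.filter(function (st) {
    return (s === undefined || st.subject === s) &&
           (p === undefined || st.predicate === p) &&
           (o === undefined || st.object === o) &&
           (w === undefined || st.why === w);
  });
};
// Assumption: the result comes from the first position left undefined.
function pick(st, s, p, o) {
  if (s === undefined) return st.subject;
  if (p === undefined) return st.predicate;
  if (o === undefined) return st.object;
  return st.why;
}
SketchFormula.prototype.each = function (s, p, o, w) {
  return this.statementsMatching(s, p, o, w).map(function (st) {
    return pick(st, s, p, o);
  });
};
SketchFormula.prototype.any = function (s, p, o, w) {
  var found = this.statementsMatching(s, p, o, w)[0];
  return found === undefined ? undefined : pick(found, s, p, o);
};
SketchFormula.prototype.holds = function (s, p, o, w) {
  return this.statementsMatching(s, p, o, w).length > 0;
};
SketchFormula.prototype.whether = function (s, p, o, w) {
  return this.statementsMatching(s, p, o, w).length;
};

var kb = new SketchFormula();
kb.add("X11", "site", "ftp://ftp.example.org");
kb.add("X11", "site", "ftp://ftp1.example.org");
console.log(kb.each("X11", "site").join(", "));
console.log(kb.any(undefined, "site", "ftp://ftp1.example.org")); // X11
console.log(kb.holds("X11", "site", "ftp://ftp2.example.org"));   // false
console.log(kb.whether("X11", "site"));                           // 2
```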

IndexedFormula

Extends Formula by indexing statements by subject, predicate, object, and why

subjectIndex, predicateIndex, objectIndex, and whyIndex hold indices for the specified terms

index = [subjectIndex, predicateIndex, objectIndex, whyIndex]

redirections holds redirections from lexically bigger to lexically smaller symbols. There should be only one level of redirection at all times. Used, for example, for owl:sameAs nodes

aliases holds the reverse of redirections

canon(uri) returns the smallest alias of the term represented by uri using redirections

anyStatementMatching(s,p,o,w) returns the first statement matching a given s,p,o,w set or undefined. Undefined terms are wildcards

statementsMatching(s,p,o,w,justOne) returns an array with statements matching arguments. Undefined terms are wildcards
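The indexing idea above can be sketched as follows. `SketchIndexedFormula` is hypothetical: terms are plain strings used directly as index keys, and starting from the smallest matching bucket is an assumption about how the indices might be used, not a description of the parser's code.

```javascript
// Illustrative only: terms are strings used directly as index keys.
function SketchIndexedFormula() {
  this.statements = [];
  this.subjectIndex = Object.create(null);
  this.predicateIndex = Object.create(null);
  this.objectIndex = Object.create(null);
  this.whyIndex = Object.create(null);
  this.index = [this.subjectIndex, this.predicateIndex, this.objectIndex, this.whyIndex];
}
var KEYS = ["subject", "predicate", "object", "why"];

SketchIndexedFormula.prototype.add = function (s, p, o, w) {
  var st = { subject: s, predicate: p, object: o, why: w };
  var terms = [s, p, o, w];
  for (var i = 0; i < 4; i++) {
    if (terms[i] === undefined) continue; // nothing to index for this position
    var bucket = this.index[i][terms[i]] || (this.index[i][terms[i]] = []);
    bucket.push(st);
  }
  this.statements.push(st);
  return st;
};

// Undefined terms are wildcards; justOne stops after the first match.
SketchIndexedFormula.prototype.statementsMatching = function (s, p, o, w, justOne) {
  var pattern = [s, p, o, w];
  // Assumption: scan the smallest index bucket among the supplied terms.
  var candidates = this.statements;
  for (var i = 0; i < 4; i++) {
    if (pattern[i] === undefined) continue;
    var bucket = this.index[i][pattern[i]] || [];
    if (bucket.length < candidates.length) candidates = bucket;
  }
  var results = [];
  for (var j = 0; j < candidates.length; j++) {
    var ok = true;
    for (var k = 0; k < 4; k++) {
      if (pattern[k] !== undefined && candidates[j][KEYS[k]] !== pattern[k]) {
        ok = false;
        break;
      }
    }
    if (ok) {
      results.push(candidates[j]);
      if (justOne) break;
    }
  }
  return results;
};
SketchIndexedFormula.prototype.anyStatementMatching = function (s, p, o, w) {
  return this.statementsMatching(s, p, o, w, true)[0];
};

var store = new SketchIndexedFormula();
store.add("X11", "site", "ftp://ftp.example.org", "doc1");
store.add("X11", "site", "ftp://ftp1.example.org", "doc1");
store.add("GTK", "site", "ftp://ftp.example.org", "doc2");
console.log(store.statementsMatching("X11").length); // 2
console.log(store.statementsMatching(undefined, "site", undefined, undefined, true).length); // 1
console.log(store.anyStatementMatching("Qt")); // undefined
```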

Accessory functions

term(val)

Converts val into a node object and returns that object

undefined returned as undefined

string converted to Literal

Numbers stored as Literal with either XSDfloat, XSDdecimal, or XSDinteger datatype

boolean stored as Literal with datatype = XSDboolean

Date converted to Literal with string value YYYY-MM-DDTHH:mm:ssZ

Array converted to Collection with each element of array processed recursively and added to the collection

Other objects are returned unchanged. Assumption is that they will be RDF node objects
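The conversion rules above can be sketched as a single function. `sketchTerm` is hypothetical and returns plain descriptor objects instead of real node classes. How a number is assigned to float, decimal, or integer, the lexical form of booleans, and the omission of a datatype for dates are all assumptions made for the example.

```javascript
// Illustrative only: returns plain descriptor objects, not real node classes.
var XSD = "http://www.w3.org/2001/XMLSchema#";

function pad(n) {
  return n < 10 ? "0" + n : "" + n;
}

function sketchTerm(val) {
  if (val === undefined) return undefined;
  if (typeof val === "string") return { termType: "literal", value: val };
  if (typeof val === "number") {
    // Assumption: the datatype is chosen from the number's string form.
    var text = String(val);
    var type = text.indexOf("e") >= 0 ? "float"
             : text.indexOf(".") >= 0 ? "decimal"
             : "integer";
    return { termType: "literal", value: text, datatype: XSD + type };
  }
  if (typeof val === "boolean") {
    // Assumption: the lexical form is "true" or "false".
    return { termType: "literal", value: String(val), datatype: XSD + "boolean" };
  }
  if (val instanceof Date) {
    var value = val.getUTCFullYear() + "-" + pad(val.getUTCMonth() + 1) + "-" +
      pad(val.getUTCDate()) + "T" + pad(val.getUTCHours()) + ":" +
      pad(val.getUTCMinutes()) + ":" + pad(val.getUTCSeconds()) + "Z";
    return { termType: "literal", value: value };
  }
  if (Array.isArray(val)) {
    // Each element is converted recursively.
    return { termType: "collection", elements: val.map(function (el) { return sketchTerm(el); }) };
  }
  return val; // assumed to be an RDF node object already
}

console.log(sketchTerm(3).datatype);
console.log(sketchTerm(new Date(Date.UTC(2012, 5, 24, 9, 5, 0))).value); // 2012-06-24T09:05:00Z
console.log(sketchTerm([1, "a"]).elements.length); // 2
```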

NextId(request)

Returns the next available id for a BlankNode or Collection. request is returned if it is available
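One way such an allocator could behave is sketched below. `sketchNextId`, `usedIds`, and `counter` are hypothetical names, and tracking handed-out ids in a set is an assumption rather than the parser's actual bookkeeping.

```javascript
// Illustrative only: a shared id allocator for blank nodes and collections.
var usedIds = new Set();
var counter = 0;

function sketchNextId(request) {
  // Honour the requested id when nobody holds it yet.
  if (request !== undefined && !usedIds.has(request)) {
    usedIds.add(request);
    return request;
  }
  // Otherwise skip ids that are already taken.
  while (usedIds.has(counter)) counter++;
  usedIds.add(counter);
  return counter++;
}

var firstId = sketchNextId();    // 0
var requested = sketchNextId(5); // 5, because it was free
var fallback = sketchNextId(5);  // 1, because 5 is now taken
console.log(firstId, requested, fallback);
```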

Shortcut functions

Constructor functions

Return new objects

sym -> Symbol

lit -> Literal

variable -> Variable

st -> Statement

graph -> IndexedFormula
