
Package org.apache.juneau.jena

Jena-based RDF serialization and parsing support


Table of Contents
  1. RDF support overview

    1. Example

  2. RdfSerializer class

    1. Namespaces

    2. URI properties

    3. @Bean and @BeanProperty annotations

    4. Collections

    5. Root property

    6. Typed literals

    7. Non-tree models and recursion detection

    8. Configurable properties

    9. Other notes

  3. RdfParser class

    1. Parsing into generic POJO models

    2. Configurable properties

    3. Other notes

1 - RDF support overview

Juneau supports serializing and parsing arbitrary POJOs to and from the following RDF formats:

  • RDF/XML
  • Abbreviated RDF/XML
  • N-Triple
  • Turtle
  • N3

Juneau can serialize and parse instances of any of the following POJO types:

  • Java primitive objects (e.g. String, Integer, Boolean, Float).
  • Java collections framework objects (e.g. HashSet, TreeMap) containing anything on this list.
  • Multi-dimensional arrays of any type on this list.
  • Java Beans with properties of any type on this list.
  • Classes with standard transformations to and from Strings (e.g. classes containing toString(), fromString(), valueOf(), constructor(String)).

In addition to the types shown above, Juneau includes the ability to define 'swaps' to transform non-standard object and property types to serializable forms (e.g. to transform Calendars to and from ISO8601 strings, or byte[] arrays to and from base-64 encoded strings).
These can be associated with serializers/parsers, or can be associated with classes or bean properties through type and method annotations.

Refer to POJO Categories for a complete definition of supported POJOs.

Prerequisites

Juneau uses the Jena library for these formats.
The predefined serializers and parsers convert POJOs to and from RDF models and then use Jena to convert them to and from the various RDF languages.

Jena libraries must be provided on the classpath separately if you plan on making use of the RDF support.

The minimum list of required jars is:

  • jena-core-2.7.1.jar
  • jena-iri-0.9.2.jar
  • log4j-1.2.16.jar
  • slf4j-api-1.6.4.jar
  • slf4j-log4j12-1.6.4.jar

1.1 - RDF support overview - example

The example shown here is from the Address Book resource located in the org.apache.juneau.sample.war application.

The POJO model consists of a List of Person beans, with each Person containing zero or more Address beans.

When you point a browser at /sample/addressBook, the POJO is rendered as HTML:

By appending ?Accept=mediaType&plainText=true to the URL, you can view the data in the various RDF supported formats.

RDF/XML
Abbreviated RDF/XML
N-Triple
Turtle
N3

2 - RdfSerializer class

The RdfSerializer class is the top-level class for all Jena-based serializers.
Language-specific serializers are defined as inner subclasses of the RdfSerializer class:

Static reusable instances of serializers are also provided with default settings:

Abbreviated RDF/XML is currently the most widely accepted and readable RDF syntax, so the examples shown here will use that format.

For brevity, the examples will use public fields instead of getters/setters to reduce the size of the examples.
In the real world, you'll typically want to use standard bean getters and setters.

To start off simple, we'll begin with the following simplified bean and build it up.

public class Person {

    // Bean properties
    public int id;
    public String name;

    // Bean constructor (needed by parser)
    public Person() {}

    // Normal constructor
    public Person(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

The following code shows how to convert this to abbreviated RDF/XML:

// Create a new serializer with readable output.
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .build();

// Create our bean.
Person p = new Person(1, "John Smith");

// Serialize the bean to RDF/XML.
String rdfXml = s.serialize(p);

It should be noted that serializers can also be created by cloning existing serializers:

// Create a new serializer with readable output by cloning an existing serializer.
RdfSerializer s = RdfSerializer.DEFAULT_XMLABBREV.builder()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .build();

This code produces the following output:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/">
   <rdf:Description>
      <jp:id>1</jp:id>
      <jp:name>John Smith</jp:name>
   </rdf:Description>
</rdf:RDF>

Notice that we've taken an arbitrary POJO and converted it to RDF.
The Juneau serializers and parsers are designed to work with arbitrary POJOs without requiring any annotations.
That being said, several annotations are provided to customize how POJOs are handled to produce usable RDF.

2.1 - Namespaces

You'll notice in the previous example that Juneau namespaces are used to represent bean property names.
These are used by default when namespaces are not explicitly specified.

The juneau namespace is used for generic names for objects that don't have namespaces associated with them.

The juneaubp namespace is used on bean properties that don't have namespaces associated with them.

The easiest way to specify namespaces is through annotations.
In this example, we're going to associate the prefix 'per' to our bean class and all properties of this class.
We do this by adding the following annotation to our class:

@Rdf(prefix="per")
public class Person {

In general, the best approach is to define the namespace URIs at the package level using a package-info.java class, like so:

// RDF namespaces used in this package
@RdfSchema(
    prefix="ab",
    rdfNs={
        @RdfNs(prefix="ab", namespaceURI="http://www.apache.org/addressBook/"),
        @RdfNs(prefix="per", namespaceURI="http://www.apache.org/person/"),
        @RdfNs(prefix="addr", namespaceURI="http://www.apache.org/address/"),
        @RdfNs(prefix="mail", namespaceURI="http://www.apache.org/mail/")
    }
)
package org.apache.juneau.sample.addressbook;

import org.apache.juneau.jena.annotation.*;

This assigns a default prefix of "ab" to all classes and properties within the package, and declares the other prefixes used within it.

Now when we rerun the sample code, we'll get the following:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/">
   <rdf:Description>
      <per:id>1</per:id>
      <per:name>John Smith</per:name>
   </rdf:Description>
</rdf:RDF>

Namespace auto-detection (RdfSerializerContext.RDF_autoDetectNamespaces) is enabled on serializers by default.
This causes the serializer to make a first pass over the data structure to look for namespaces.
In high-performance environments, you may want to consider disabling auto-detection and providing an explicit list of namespaces to the serializer to avoid this scanning step.

// Create a new serializer, but manually specify the namespaces.
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .autoDetectNamespaces(false)
    .namespaces("{per:'http://www.apache.org/person/'}")
    .build();

This code change will produce the same output as before, but will perform slightly better since it doesn't have to crawl the POJO tree before serializing the result.

2.2 - URI properties

Bean properties of type java.net.URI or java.net.URL have special meaning to the RDF serializer.
They are interpreted as resource identifiers.

In the following code, we're adding two new properties.
The first property is annotated with @Rdf(beanUri=true) to identify it as the resource identifier for this bean.
The second, un-annotated property is interpreted as a reference to another resource.

public class Person {

    // Bean properties
    @Rdf(beanUri=true) public URI uri;
    public URI addressBookUri;
    ...

    // Normal constructor
    public Person(int id, String name, String uri, String addressBookUri) throws URISyntaxException {
        this.id = id;
        this.name = name;
        this.uri = new URI(uri);
        this.addressBookUri = new URI(addressBookUri);
    }
}

We alter our code to pass in values for these new properties.

// Create our bean.
Person p = new Person(1, "John Smith", "http://sample/addressBook/person/1", "http://sample/addressBook");

Now when we run the sample code, we get the following:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/">
   <rdf:Description rdf:about="http://sample/addressBook/person/1">
      <per:addressBookUri rdf:resource="http://sample/addressBook"/>
      <per:id>1</per:id>
      <per:name>John Smith</per:name>
   </rdf:Description>
</rdf:RDF>

The @URI annotation can also be used on classes and properties to identify them as URLs when they're not instances of java.net.URI or java.net.URL (not needed if @Rdf(beanUri=true) is already specified).

The following properties would have produced the same output as before. Note that the @URI annotation is only needed on the second property.

public class Person {

    // Bean properties
    @Rdf(beanUri=true) public String uri;
    @URI public String addressBookUri;

Also take note of the SerializerContext.SERIALIZER_uriResolution, SerializerContext.SERIALIZER_uriRelativity, and SerializerContext.SERIALIZER_uriContext settings that can be specified on the serializer to resolve relative and context-root-relative URIs to fully-qualified URIs.

This can be useful if you want to keep the URI authority and context root information out of the bean logic layer.

The following code produces the same output as before, but the URIs on the beans are relative.

// Create a new serializer with readable output.
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .relativeUriBase("http://myhost/sample")
    .absolutePathUriBase("http://myhost")
    .build();

// Create our bean.
Person p = new Person(1, "John Smith", "person/1", "/");

// Serialize the bean to RDF/XML.
String rdfXml = s.serialize(p);
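The resolution rule implied by the two base settings can be pictured with a small plain-Java sketch. This is not Juneau's implementation: the class UriResolutionSketch and its resolve method are hypothetical names, and the rule (absolute URIs pass through, a leading slash resolves against the absolute-path base, anything else against the relative base) is an assumption drawn from the description above.

```java
import java.net.URI;

/** Hypothetical helper illustrating URI resolution; not part of Juneau. */
public class UriResolutionSketch {

    private final String relativeUriBase;      // e.g. "http://myhost/sample"
    private final String absolutePathUriBase;  // e.g. "http://myhost"

    public UriResolutionSketch(String relativeUriBase, String absolutePathUriBase) {
        this.relativeUriBase = relativeUriBase;
        this.absolutePathUriBase = absolutePathUriBase;
    }

    /** Resolves a bean URI to a fully-qualified URI string. */
    public String resolve(String uri) {
        // URIs that already carry a scheme are left untouched.
        if (URI.create(uri).isAbsolute())
            return uri;
        // Context-root-relative URIs start with a slash.
        if (uri.startsWith("/"))
            return absolutePathUriBase + uri;
        // Everything else is treated as relative to the relative base.
        return relativeUriBase + "/" + uri;
    }

    public static void main(String[] args) {
        UriResolutionSketch r = new UriResolutionSketch("http://myhost/sample", "http://myhost");
        System.out.println(r.resolve("person/1"));
        System.out.println(r.resolve("/"));
        System.out.println(r.resolve("http://sample/addressBook"));
    }
}
```

Keeping the bases on the serializer rather than in the beans means the same bean code can be deployed under a different host or context root without change.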

2.3 - @Bean and @BeanProperty annotations

The Bean and BeanProperty annotations are used to customize the behavior of beans across the entire framework.
They have various uses, including:

  • Hiding bean properties.
  • Specifying the ordering of bean properties.
  • Overriding the names of bean properties.
  • Associating transforms at both the class and property level (to convert non-serializable POJOs to serializable forms).

For example, we now add a birthDate property, and associate a swap with it to transform it to an ISO8601 date-time string in GMT time.
By default, Calendars are treated as beans by the framework, which is usually not how you want them serialized.
Using swaps, we can convert them to standardized string forms.

public class Person {

    // Bean properties
    @BeanProperty(swap=CalendarSwap.ISO8601DTZ.class) public Calendar birthDate;
    ...

    // Normal constructor
    public Person(int id, String name, String uri, String addressBookUri, String birthDate) throws Exception {
        ...
        this.birthDate = new GregorianCalendar();
        this.birthDate.setTime(
            DateFormat.getDateInstance(DateFormat.MEDIUM).parse(birthDate));
    }
}

And we alter our code to pass in the birthdate.

// Create our bean.
Person p = new Person(1, "John Smith", "http://sample/addressBook/person/1",
    "http://sample/addressBook", "Aug 12, 1946");

Now when we rerun the sample code, we'll get the following:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/">
   <rdf:Description rdf:about="http://sample/addressBook/person/1">
      <per:addressBookUri rdf:resource="http://sample/addressBook"/>
      <per:id>1</per:id>
      <per:name>John Smith</per:name>
      <per:birthDate>1946-08-12T00:00:00Z</per:birthDate>
   </rdf:Description>
</rdf:RDF>
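Conceptually, a swap is a pair of conversions between the unserializable type and a serializable stand-in. The following plain-Java sketch illustrates that idea for Calendar and ISO8601 strings in GMT. It is not the CalendarSwap source: the class CalendarSwapSketch and the method names swap, unswap, and gmtDate are assumptions chosen for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

/** Hypothetical two-way Calendar/ISO8601 conversion; not part of Juneau. */
public class CalendarSwapSketch {

    /** Serializing direction: Calendar to an ISO8601 date-time string in GMT. */
    public static String swap(Calendar c) {
        return c.toInstant().toString();
    }

    /** Parsing direction: ISO8601 date-time string back to a Calendar. */
    public static Calendar unswap(String s) {
        return GregorianCalendar.from(Instant.parse(s).atZone(ZoneOffset.UTC));
    }

    /** Builds a Calendar at midnight GMT on the given date (month is 1-based). */
    public static Calendar gmtDate(int year, int month, int day) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        c.clear();
        c.set(year, month - 1, day);
        return c;
    }

    public static void main(String[] args) {
        Calendar birthDate = gmtDate(1946, 8, 12);
        String iso = swap(birthDate);
        System.out.println(iso);
        // The round trip preserves the instant in time.
        System.out.println(unswap(iso).getTimeInMillis() == birthDate.getTimeInMillis());
    }
}
```

The sketch builds the date directly in GMT; a date parsed in the JVM's default time zone, as in the Person constructor above, would be shifted by that zone's offset when rendered in GMT.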

2.4 - Collections

Collections and arrays are converted to RDF sequences.
In our example, let's add a list-of-beans property to our sample class:

public class Person {

    // Bean properties
    public LinkedList<Address> addresses = new LinkedList<Address>();
    ...
}

The Address class has the following properties defined:

@Rdf(prefix="addr")
public class Address {

    // Bean properties
    @Rdf(beanUri=true) public URI uri;
    public URI personUri;
    public int id;

    @Rdf(prefix="mail")
    public String street, city, state;

    @Rdf(prefix="mail")
    public int zip;

    public boolean isCurrent;
}

Next, add some quick-and-dirty code to add an address to our person bean:

// Create a new serializer (revert back to namespace autodetection).
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .build();

// Create our bean.
Person p = new Person(1, "John Smith", "http://sample/addressBook/person/1",
    "http://sample/addressBook", "Aug 12, 1946");
Address a = new Address();
a.uri = new URI("http://sample/addressBook/address/1");
a.personUri = new URI("http://sample/addressBook/person/1");
a.id = 1;
a.street = "100 Main Street";
a.city = "Anywhereville";
a.state = "NY";
a.zip = 12345;
a.isCurrent = true;
p.addresses.add(a);

Now when we run the sample code, we get the following:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/"
    xmlns:mail="http://www.apache.org/mail/"
    xmlns:addr="http://www.apache.org/address/">
   <rdf:Description rdf:about="http://sample/addressBook/person/1">
      <per:addressBookUri rdf:resource="http://sample/addressBook"/>
      <per:id>1</per:id>
      <per:name>John Smith</per:name>
      <per:addresses>
         <rdf:Seq>
            <rdf:li>
               <rdf:Description rdf:about="http://sample/addressBook/address/1">
                  <addr:personUri rdf:resource="http://sample/addressBook/person/1"/>
                  <addr:id>1</addr:id>
                  <mail:street>100 Main Street</mail:street>
                  <mail:city>Anywhereville</mail:city>
                  <mail:state>NY</mail:state>
                  <mail:zip>12345</mail:zip>
                  <addr:isCurrent>true</addr:isCurrent>
               </rdf:Description>
            </rdf:li>
         </rdf:Seq>
      </per:addresses>
   </rdf:Description>
</rdf:RDF>
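In RDF/XML, each rdf:li element is shorthand for a numbered membership property (rdf:_1, rdf:_2, and so on) on the sequence node. The sketch below shows the triples such a sequence expands to. It uses plain string arrays rather than Jena; the class SeqSketch, the method toSeqTriples, and the blank-node label "_:seq1" are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of how a list maps onto rdf:Seq triples. */
public class SeqSketch {

    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns {subject, predicate, object} triples for an ordered list of members. */
    public static List<String[]> toSeqTriples(String seqNode, List<String> members) {
        List<String[]> triples = new ArrayList<>();
        // The container itself is typed as rdf:Seq.
        triples.add(new String[] {seqNode, RDF + "type", RDF + "Seq"});
        // Each member is attached with a 1-based membership property, preserving order.
        for (int i = 0; i < members.size(); i++)
            triples.add(new String[] {seqNode, RDF + "_" + (i + 1), members.get(i)});
        return triples;
    }

    public static void main(String[] args) {
        List<String> addresses = Arrays.asList(
            "http://sample/addressBook/address/1",
            "http://sample/addressBook/address/2");
        for (String[] t : toSeqTriples("_:seq1", addresses))
            System.out.println(String.join(" ", t));
    }
}
```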

2.5 - Root property

For all RDF languages, the POJOs get broken down into simple triples.
Unfortunately, for tree-structured data like the POJOs shown above, this causes the root node of the tree to become lost.
There is no easy way to identify that person/1 is the root node in our tree once in triple form, and in some cases it's impossible.

By default, the RdfParser class handles this by scanning all the nodes and identifying the nodes without incoming references.
However, this is inefficient, especially for large models.
And in cases where the root node is referenced by another node in the model by URL, it's not possible to locate the root at all.
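The default scan can be pictured with the following plain-Java sketch, which treats any subject that never appears as an object as a root candidate. It is not the RdfParser implementation: the class RootScanSketch, the methods findRoots and sample, and the blank-node label "_:seq" are assumptions, and triples are simplified to string arrays.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical root-finding scan over {subject, predicate, object} triples. */
public class RootScanSketch {

    /** Returns all subjects that have no incoming references. */
    public static Set<String> findRoots(List<String[]> triples) {
        Set<String> subjects = new LinkedHashSet<>();
        Set<String> referenced = new HashSet<>();
        // Every triple must be visited once, which is what makes large models costly.
        for (String[] t : triples) {
            subjects.add(t[0]);
            referenced.add(t[2]);
        }
        subjects.removeAll(referenced);
        return subjects;
    }

    /** Builds a tiny person/address model, optionally with the address pointing back at the person. */
    public static List<String[]> sample(boolean withBackReference) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"person/1", "addresses", "_:seq"});
        triples.add(new String[] {"_:seq", "rdf:_1", "address/1"});
        triples.add(new String[] {"address/1", "city", "Anywhereville"});
        if (withBackReference)
            triples.add(new String[] {"address/1", "personUri", "person/1"});
        return triples;
    }

    public static void main(String[] args) {
        System.out.println(findRoots(sample(false)));
        System.out.println(findRoots(sample(true)));
    }
}
```

In the second model every node has an incoming reference, so the scan finds no root at all. This is the situation in the address book example, where each Address carries a personUri pointing back at its Person.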

To resolve this issue, the property RdfSerializerContext.RDF_addRootProperty was introduced.
When enabled, this adds a special root attribute to the root node to make it easy to locate by the parser.

To enable, set the RDF_addRootProperty property to true on the serializer:

// Create a new serializer.
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .addRootProperty(true)
    .build();

Now when we rerun the sample code, we'll see the added root attribute on the root resource.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/"
    xmlns:mail="http://www.apache.org/mail/"
    xmlns:addr="http://www.apache.org/address/">
   <rdf:Description rdf:about="http://sample/addressBook/person/1">
      <j:root>true</j:root>
      <per:addressBookUri rdf:resource="http://sample/addressBook"/>
      <per:id>1</per:id>
      <per:name>John Smith</per:name>
      <per:addresses>
         <rdf:Seq>
            <rdf:li>
               <rdf:Description rdf:about="http://sample/addressBook/address/1">
                  <addr:personUri rdf:resource="http://sample/addressBook/person/1"/>
                  <addr:id>1</addr:id>
                  <mail:street>100 Main Street</mail:street>
                  <mail:city>Anywhereville</mail:city>
                  <mail:state>NY</mail:state>
                  <mail:zip>12345</mail:zip>
                  <addr:isCurrent>true</addr:isCurrent>
               </rdf:Description>
            </rdf:li>
         </rdf:Seq>
      </per:addresses>
   </rdf:Description>
</rdf:RDF>

2.6 - Typed literals

XML-Schema data-types can be added to non-String literals through the RdfSerializerContext.RDF_addLiteralTypes setting.

To enable, set the RDF_addLiteralTypes property to true on the serializer:

// Create a new serializer (revert back to namespace autodetection).
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .addLiteralTypes(true)
    .build();

Now when we rerun the sample code, we'll see rdf:datatype attributes added to the non-String literals.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j="http://www.apache.org/juneau/"
    xmlns:jp="http://www.apache.org/juneaubp/"
    xmlns:per="http://www.apache.org/person/"
    xmlns:mail="http://www.apache.org/mail/"
    xmlns:addr="http://www.apache.org/address/">
   <rdf:Description rdf:about="http://sample/addressBook/person/1">
      <per:addressBookUri rdf:resource="http://sample/addressBook"/>
      <per:id rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</per:id>
      <per:name>John Smith</per:name>
      <per:addresses>
         <rdf:Seq>
            <rdf:li>
               <rdf:Description rdf:about="http://sample/addressBook/address/1">
                  <addr:personUri rdf:resource="http://sample/addressBook/person/1"/>
                  <addr:id rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</addr:id>
                  <mail:street>100 Main Street</mail:street>
                  <mail:city>Anywhereville</mail:city>
                  <mail:state>NY</mail:state>
                  <mail:zip rdf:datatype="http://www.w3.org/2001/XMLSchema#int">12345</mail:zip>
                  <addr:isCurrent rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</addr:isCurrent>
               </rdf:Description>
            </rdf:li>
         </rdf:Seq>
      </per:addresses>
   </rdf:Description>
</rdf:RDF>

2.7 - Non-tree models and recursion detection

The RDF serializer is designed to be used against tree structures.
It expects that there not be loops in the POJO model (e.g. children with references to parents, etc...).
If you try to serialize models with loops, you will usually cause a StackOverflowError to be thrown (if SerializerContext.SERIALIZER_maxDepth is not reached first).

If you still want to use the RDF serializer on such models, Juneau provides the SerializerContext.SERIALIZER_detectRecursions setting.
It tells the serializer to look for instances of an object in the current branch of the tree and skip serialization when a duplicate is encountered.

Recursion detection introduces a performance penalty of around 20%.
For this reason the setting is disabled by default.
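The idea of checking the current branch for an object already being serialized can be sketched in plain Java as follows. This is an illustration of the technique only, not Juneau's code: the class RecursionSketch, its Node type, and the toy output format are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical branch-based recursion detection during a tree walk. */
public class RecursionSketch {

    /** Minimal node type with child references that may loop back to an ancestor. */
    public static class Node {
        public final String name;
        public final List<Node> children = new ArrayList<>();
        public Node(String name) { this.name = name; }
    }

    /** Serializes a node as name(children...), skipping objects already on the current branch. */
    public static String serialize(Node root) {
        StringBuilder sb = new StringBuilder();
        // Identity-based set: compares by reference, not equals().
        Set<Node> branch = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        walk(root, branch, sb);
        return sb.toString();
    }

    private static void walk(Node n, Set<Node> branch, StringBuilder sb) {
        // A failed add means this object is an ancestor of itself: skip it.
        if (!branch.add(n))
            return;
        sb.append(n.name).append('(');
        for (Node c : n.children)
            walk(c, branch, sb);
        sb.append(')');
        // Leaving the branch, so the same object may still appear in a sibling branch.
        branch.remove(n);
    }

    public static void main(String[] args) {
        Node parent = new Node("parent");
        Node child = new Node("child");
        parent.children.add(child);
        child.children.add(parent);  // loop back to the parent
        System.out.println(serialize(parent));
    }
}
```

The extra set lookup and bookkeeping on every object is the kind of overhead behind the performance penalty noted above.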

2.8 - Configurable properties

See the following classes for all configurable properties that can be used on this serializer:

2.9 - Other notes

  • Like all other Juneau serializers, the RDF serializer is thread safe and maintains an internal cache of bean classes encountered. For performance reasons, it's recommended that serializers be reused whenever possible instead of always creating new instances.

3 - RdfParser class

The RdfParser class is the top-level class for all Jena-based parsers.
Language-specific parsers are defined as inner subclasses of the RdfParser class:

The RdfParser.Xml parser handles both regular and abbreviated RDF/XML.

Static reusable instances of parsers are also provided with default settings:

For an example, we will build upon the previous example and parse the generated RDF/XML back into the original bean.

// Create a new serializer with readable output.
RdfSerializer s = new RdfSerializerBuilder()
    .xmlabbrev()
    .property(RdfProperties.RDF_rdfxml_tab, 3)
    .addRootProperty(true)
    .build();

// Create our bean.
Person p = new Person(1, "John Smith", "http://sample/addressBook/person/1",
    "http://sample/addressBook", "Aug 12, 1946");
Address a = new Address();
a.uri = new URI("http://sample/addressBook/address/1");
a.personUri = new URI("http://sample/addressBook/person/1");
a.id = 1;
a.street = "100 Main Street";
a.city = "Anywhereville";
a.state = "NY";
a.zip = 12345;
a.isCurrent = true;
p.addresses.add(a);

// Serialize the bean to RDF/XML.
String rdfXml = s.serialize(p);

// Parse it back into a bean using the reusable XML parser.
p = RdfParser.DEFAULT_XML.parse(rdfXml, Person.class);

// Render it as JSON.
String json = JsonSerializer.DEFAULT_LAX_READABLE.serialize(p);
System.err.println(json);

We print it out to JSON to show that all the data has been preserved:

{
    uri: 'http://sample/addressBook/person/1',
    addressBookUri: 'http://sample/addressBook',
    id: 1,
    name: 'John Smith',
    birthDate: '1946-08-12T00:00:00Z',
    addresses: [
        {
            uri: 'http://sample/addressBook/address/1',
            personUri: 'http://sample/addressBook/person/1',
            id: 1,
            street: '100 Main Street',
            city: 'Anywhereville',
            state: 'NY',
            zip: 12345,
            isCurrent: true
        }
    ]
}

3.1 - Parsing into generic POJO models

The RDF parser is not limited to parsing back into the original bean classes.
If the bean classes are not available on the parsing side, the parser can also be used to parse into a generic model consisting of Maps, Collections, and primitive objects.

You can parse into any Map type (e.g. HashMap, TreeMap), but using ObjectMap is recommended since it has many convenience methods for converting values to various types.
The same is true when parsing collections. You can use any Collection (e.g. HashSet, LinkedList) or array (e.g. Object[], String[], String[][]), but using ObjectList is recommended.

When the map or list type is not specified, or is the abstract Map, Collection, or List types, the parser will use ObjectMap and ObjectList by default.

In the following example, we parse into an ObjectMap and use the convenience methods for performing data conversion on values in the map.

// Parse RDF into a generic POJO model.
ObjectMap m = RdfParser.DEFAULT_XML.parse(rdfXml, ObjectMap.class);

// Get some simple values.
String name = m.getString("name");
int id = m.getInt("id");

// Get a value convertible from a String.
URI uri = m.get(URI.class, "uri");

// Get a value using a swap.
CalendarSwap swap = new CalendarSwap.ISO8601DTZ();
Calendar birthDate = m.get(swap, "birthDate");

// Get the addresses.
ObjectList addresses = m.getObjectList("addresses");

// Get the first address and convert it to a bean.
Address address = addresses.get(Address.class, 0);

However, there are caveats when parsing into generic models due to the nature of RDF.
Watch out for the following:

  • The ordering of entries is going to be inconsistent.
  • Bean URIs are always going to be denoted with the key "uri".
    Therefore, you cannot have a bean with a URI property and a separate property named "uri".
    The latter will overwrite the former.
    This isn't a problem when parsing into beans instead of generic POJO models.
  • All values are strings.
    This normally isn't a problem when using ObjectMap and ObjectList since various methods are provided for converting to the correct type anyway.
  • The results may not be what is expected if there are lots of URL reference loops in the RDF model.
    As nodes are processed from the root node down through the child nodes, the parser keeps track of previously processed parent URIs and handles them accordingly.
    If it finds that the URI has previously been processed, it handles it as a normal URI string and doesn't process further.
    However, depending on how complex the reference loops are, the parsed data may end up having the same data in it, but structured differently from the original POJO.

We can see some of these when we render the ObjectMap back to JSON.

System.err.println(JsonSerializer.DEFAULT_LAX_READABLE.serialize(m));

This is what's produced:

{
    uri: 'http://sample/addressBook/person/1',
    addresses: [
        {
            uri: 'http://sample/addressBook/address/1',
            isCurrent: 'true',
            zip: '12345',
            state: 'NY',
            city: 'Anywhereville',
            street: '100 Main Street',
            id: '1',
            personUri: 'http://sample/addressBook/person/1'
        }
    ],
    birthDate: '1946-08-12T00:00:00Z',
    addressBookUri: 'http://sample/addressBook',
    name: 'John Smith',
    id: '1',
    root: 'true'
}

As a general rule, parsing into beans is often more efficient than parsing into generic models.
And working with beans is often less error prone than working with generic models.

3.2 - Configurable properties

See the following classes for all configurable properties that can be used on this parser:

3.3 - Other notes

  • Like all other Juneau parsers, the RDF parser is thread safe and maintains an internal cache of bean classes encountered. For performance reasons, it's recommended that parsers be reused whenever possible instead of always creating new instances.

Copyright © 2017 Apache. All rights reserved.