Publishing Ontologies in Tomcat with Content Negotiation

Publishing ontologies, vocabularies and data by following Recipe 3 (Extended configuration for a 'hash namespace') and respecting the basic rules of linked data is increasingly a "must" for any (Semantic) Web developer. Setting up the rules for correct content negotiation in Apache is quite straightforward, since a couple of lines in the .htaccess file will do.

However, most Web developers produce their prototypes and experiments as web applications running in servlet containers like the very popular Tomcat, which has no equivalent of the powerful .htaccess configuration. Thus the question: how can you publish ontologies in Tomcat in a simple way?

I tried to answer this (simple) question and came up with a small but useful web application that does just that. You can try it yourself by downloading the sample ontologies.war file and putting it into your Tomcat. Below I briefly explain what I did and how you can add your own ontologies and vocabularies.

Step 1: Prepare your machine-readable ontology

First of all, of course, you have to model your ontology. You can use your favorite ontology editor or, if you are brave, you can write it by hand. In the end, you should come up with a file expressed in RDF/XML syntax, named something like "my-ontology.rdf". The machine-readable version of your ontology is ready!

If, like me, you are used to writing stuff in N3/Turtle syntax, you can simply convert it into RDF/XML by using the rdfabout validator (which is, of course, very useful for validation as well...).

A final note on your ontology namespace. The example in the war archive follows a simple approach: if your Tomcat is online at "http://www.example.com/" and your ontology is named "my-ontology", you should set your namespace to "http://www.example.com/ontologies/my-ontology#" (where "ontologies" is the name of my sample web application). Of course, different policies can be adopted; this is just a suggestion for a straightforward solution which works with my example.
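The naming convention above can be sketched as a tiny helper. This is only an illustration in Python and not part of the sample war file; the function name and its parameters are hypothetical.

```python
def ontology_namespace(base_url, webapp, ontology):
    """Build a hash namespace of the form http://host/<webapp>/<ontology>#."""
    # Strip slashes so that "http://www.example.com" and
    # "http://www.example.com/" produce the same namespace.
    return "{}/{}/{}#".format(
        base_url.rstrip("/"), webapp.strip("/"), ontology.strip("/")
    )
```

For instance, calling it with "http://www.example.com/", "ontologies" and "my-ontology" gives the namespace used in this post.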

Step 2: Prepare a human-readable version of the ontology

Next, you should create an HTML page which describes your ontology for human readers. Again, you can do it by hand, so that you can add all the explanations and comments you like, or you can generate it automatically from the RDF/XML version by means of an XSL transformation or other tools. The final result should be a file named "my-ontology.html"; bear in mind that the RDF/XML file and the HTML file should share the same name (except for the extension!). Now the human-readable version of your ontology is ready as well!
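Since the two files must share the same base name, a quick check before deploying can save some head-scratching. The following is a minimal sketch in Python, assuming you pass it the list of file names found in the root of the web application; the function name is hypothetical.

```python
from pathlib import PurePath


def unpaired_ontologies(filenames):
    """Return base names that have a .rdf file or a .html file, but not both."""
    rdf = {PurePath(name).stem for name in filenames if name.endswith(".rdf")}
    html = {PurePath(name).stem for name in filenames if name.endswith(".html")}
    # The symmetric difference keeps names present in only one of the two sets.
    return sorted(rdf ^ html)
```

Note that any stand-alone HTML page in the same folder (an index page, for example) would also be reported, so read the result as a hint rather than an error.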

I just want to add that I recently discovered the Rhizomik ReDeFer tools, which provide an RDF2HTML service producing a very nice HTML representation of an RDF/XML file, including RDFa annotations. Of course, the more labels, comments and descriptions you add to your machine-readable ontology, the nicer the HTML result. I used this tool to generate a draft HTML version of my sample ontology (see the war file), which I then edited manually to add some descriptions.

Step 3: Put everything on-line

And now we are ready to put everything online. Just take my sample war file, put it into your Tomcat webapps folder and restart Tomcat; this will create a web application named "ontologies" within your Tomcat. Add both the RDF/XML version and the HTML version of your ontology to the root of the newly created web application and restart it. Now point your browser to "http://www.example.com/ontologies/my-ontology" (of course, replace this URL with your own namespace...) and you'll see that it redirects to "http://www.example.com/ontologies/my-ontology.html". Congratulations, your ontology is published correctly!

If you use cURL (or another tool that lets you set the HTTP request headers), you can verify it as follows:

curl http://www.example.com/ontologies/my-ontology  -H "Accept: application/rdf+xml" -I -L
curl http://www.example.com/ontologies/my-ontology  -H "Accept: text/xml" -I -L

where you can test different headers (with the -H option) and display the information about the requested resource (with the -I option) following all redirections (with the -L option). You can also validate your output by using the Vapour Linked Data validator!
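If you prefer to script the check, the first hop of the redirection can also be inspected with Python's standard library. This is a rough sketch, not part of the sample war file: the function name is hypothetical, and it deliberately does not follow redirects, so that you can see the status code and the Location header returned for each Accept value.

```python
import http.client
from urllib.parse import urlsplit


def head_without_redirects(url, accept):
    """Send one HEAD request and return (status code, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        # http.client never follows redirects on its own, so the
        # response we get back is the redirect itself.
        connection.request("HEAD", parts.path or "/", headers={"Accept": accept})
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()
```

Calling it on "http://www.example.com/ontologies/my-ontology" once with "application/rdf+xml" and once with "text/html" should show the two different redirect targets.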

Configuring the Content Negotiation

OK, but how does it work? What "magic" is behind the web application in Tomcat? Nothing magic, just the "Tomcat version" of the Apache mod_rewrite, which is a modified version of the UrlRewriteFilter by Paul Tuckey. I simply added the modified urlrewritefilter-linkeddata-4.0.5-SNAPSHOT.jar to the web application, edited the web.xml file to add the filter and configured the urlrewrite.xml file to set up the URL rewriting rules. I have already done this in the files included in the sample war file, so you don't need to do it again; however, if you want to replicate this behavior or configure something else, this is where to look.

As explained on the UrlRewriteFilter website, you should modify the web.xml file of the web application by adding the filter as follows:

<filter>
	<filter-name>UrlRewriteFilter</filter-name>
	<filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
</filter>
<filter-mapping>
	<filter-name>UrlRewriteFilter</filter-name>
	<url-pattern>/*</url-pattern>
</filter-mapping>

The urlrewrite.xml file is where the content negotiation rewrite rules are defined. I inserted two rules: the first is activated when the HTTP request asks for the "application/rdf+xml" format and makes the application redirect to the .rdf file; the second covers all the other cases and makes the application redirect to the .html file. (If the first one is activated, the second is ignored, because the rules are ordered.)

So the first rule simply says: 'if the requested path doesn't contain a dot "." character and the Accept header of the HTTP request mentions "application/rdf+xml", then redirect to the same URL and add a ".rdf" at the end of the URL'. The rule looks like the following:

<rule>
	<note>if the user is a "machine", give it the RDF/XML directly</note>
	<condition type="header" name="Accept">application/rdf\+xml</condition>		
	<from>(^[^\.]+$)</from>
	<to type="seeother-redirect" last="true">%{request-url}.rdf</to>
</rule>

The second rule is very similar; since we would like the default behavior to return the HTML version, the condition is activated whatever the HTTP Accept header is (except for "application/rdf+xml", which matches the first rule). The rule looks like the following:

<rule>
	<note>if the user is a "human", give him/her the HTML page</note>
	<condition type="header" name="Accept">.*</condition>
	<from>(^[^\.]+$)</from>
	<to type="seeother-redirect">%{request-url}.html</to>
</rule>
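
To make the ordering of the two rules explicit, here is a rough Python sketch of the decision they encode. It is not how UrlRewriteFilter is implemented: the function name and its parameters are hypothetical, and I am assuming that the patterns are applied as a regular-expression search and that a missing Accept header behaves like an empty one.

```python
import re

NO_DOT = re.compile(r"^[^.]+$")                 # the <from> pattern of both rules
RDF_XML = re.compile(r"application/rdf\+xml")   # the condition of the first rule


def negotiate(request_url, path, accept=""):
    """Return (303, target URL), or None when neither rule applies."""
    if not NO_DOT.search(path):
        # Paths containing a dot (my-ontology.rdf, my-ontology.html)
        # are served as they are.
        return None
    if RDF_XML.search(accept):
        # First rule: a "machine" gets the RDF/XML file.
        return 303, request_url + ".rdf"
    # Second rule: any other Accept value gets the HTML page.
    return 303, request_url + ".html"
```

The early return for paths with a dot is what stops the redirect from looping once the client requests the .rdf or .html file itself.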

So, why did I need to modify the original UrlRewriteFilter library? Because by default UrlRewriteFilter lets the user choose between "permanent-redirect" (301) and "redirect"/"temporary-redirect" (302), while the linked data publishing best practice recommends using "303 - See other". Luckily, Doug Whitehead added an issue to UrlRewriteFilter to solve the missing 303 problem and attached the modified sources to his request. I played a bit with the code and produced the urlrewritefilter-linkeddata-4.0.5-SNAPSHOT.jar library. Thus, the urlrewrite.xml file now lets users declare the "303 see other" redirect (cf. the <to> tags in the rules above, which indicate the "seeother-redirect" type).

That's all! If you want to modify the standard behavior, you should edit the rules above in the urlrewrite.xml file. And you can add as many ontologies as you like to the web application: it will manage them all.

If you have any suggestions, improvements or corrections, feel free to contact me and I'll be happy to update my experiment.

 

June 28th, 2013