How to run a Linked Data-Fu program in Java

The Linked Data-Fu Engine is a streaming rule/query engine.

The command-line utility provides all functionality to run programs. Use the command-line utility.

If you must use Java, the following text describes how to use the Java API in four easy steps.

  1. Generate a Program object.
  2. Register a query.
  3. Create an EvaluateProgram object.
  4. Evaluate the program.

We use NxParser as the API for handling RDF in Java.

Generate a Program Object

A Program object contains:

You can manually create a Program object, or parse a Notation3 file:

InputStream pis = new FileInputStream(file);
Origin pbase = new FileOrigin(file.toURI());

Notation3Parser n3p = new Notation3Parser(pis);
ProgramConsumer pc = new ProgramConsumerImpl(pbase.getResource());

n3p.parse(pc);

pis.close();
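The explicit pis.close() above is not reached if parse() throws. The following is a minimal sketch of the same open, derive-base, close sequence using try-with-resources and only JDK classes. The names ProgramFileSketch, readWithBase and demo are made up for illustration and are not part of Linked Data-Fu; a comment marks where the Notation3Parser call from the text would go.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;

// Hypothetical helper, not part of Linked Data-Fu: shows only the stream handling.
final class ProgramFileSketch {

    // Derives the base URI as in the text (file.toURI()) and guarantees that
    // the stream is closed even if processing throws.
    static URI readWithBase(File file) throws IOException {
        URI base = file.toURI();
        try (InputStream pis = new FileInputStream(file)) {
            // In a real program, the Notation3Parser would read from 'pis' here.
            // This sketch only drains the stream as a stand-in for parsing.
            byte[] buf = new byte[4096];
            while (pis.read(buf) != -1) {
                // discard the bytes
            }
        } // the stream is closed here automatically
        return base;
    }

    // Runs the helper on an empty temporary file and returns the base URI.
    static URI demo() {
        try {
            File tmp = File.createTempFile("program", ".n3");
            tmp.deleteOnExit();
            return readWithBase(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```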

You can pass the ProgramConsumer to the Program constructor to create a Program object.

Program program = new Program(pc);

Register a Query

Linked Data-Fu operates in a streaming fashion: queries are registered before evaluation starts, and results are delivered continuously via a callback interface (e.g., see a poster).

Registering means to supply a query together with a sink that receives the results.

There are various sink implementations that write results to files in specific formats. To handle query results in Java code, use a BindingConsumerSink, which takes a BindingConsumer as its parameter. Your callback implements BindingConsumer and receives Binding objects, each of which encapsulates several Node objects and an Origin for provenance tracking.
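The register-first, push-later model can be illustrated without the library. The sketch below uses stand-in types (Engine, Triple, registerSubjectQuery), all of them hypothetical and not the Linked Data-Fu classes. It only shows that a callback registered up front receives one result per incoming triple as the data arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Stand-in types for illustration only; these are NOT the Linked Data-Fu classes.
final class StreamingCallbackSketch {

    // A triple of plain strings standing in for RDF nodes.
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }
    }

    // A tiny "engine": callbacks are registered first, data is pushed later.
    static final class Engine {
        private final List<Consumer<String>> callbacks = new ArrayList<>();

        // Registers a callback for the equivalent of SELECT ?s WHERE { ?s ?p ?o . }
        void registerSubjectQuery(Consumer<String> callback) {
            callbacks.add(callback);
        }

        // Every triple matches the all-variable pattern, so its subject is
        // handed to each registered callback as soon as the triple arrives.
        void consume(Triple t) {
            for (Consumer<String> cb : callbacks) {
                cb.accept(t.s);
            }
        }
    }

    static List<String> demo() {
        List<String> results = new ArrayList<>();
        Engine engine = new Engine();
        engine.registerSubjectQuery(results::add); // a collecting callback
        engine.consume(new Triple("http://example.org/foo", "rdf:type", "rdfs:Resource"));
        engine.consume(new Triple("http://example.org/bar", "rdf:type", "rdfs:Resource"));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```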

There are several classes that implement BindingConsumer:

For example, assume you want to register the following SPARQL query to the program.

SELECT ?s WHERE { ?s ?p ?o . }

You can build the query in Java. "#q" is the internal identifier for the query (not used, really).

Nodes pattern = new Nodes(new Variable("s"), new Variable("p"), new Variable("o"));
List li = new ArrayList();
li.add(new Variable("s"));
SelectQuery sq = new SelectQuery(new Resource("#q"), li, pattern);
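To make the roles of the pattern and the variable list concrete, here is a small library-independent sketch of what such a query computes. Plain strings stand in for NxParser nodes, a leading '?' marks a variable, and the names PatternMatchSketch, match and project are illustrative assumptions. This is not how the engine is implemented, only the semantics of one triple pattern plus a projection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: strings stand in for RDF nodes; "?x" denotes a variable.
final class PatternMatchSketch {

    // Matches one triple against a pattern.
    // Returns the variable bindings, or null if the triple does not match.
    static Map<String, String> match(String[] pattern, String[] triple) {
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < pattern.length; i++) {
            String term = pattern[i];
            if (term.startsWith("?")) {
                String bound = bindings.get(term);
                if (bound == null) {
                    bindings.put(term, triple[i]);
                } else if (!bound.equals(triple[i])) {
                    return null; // same variable bound to two different values
                }
            } else if (!term.equals(triple[i])) {
                return null; // constant in the pattern does not match
            }
        }
        return bindings;
    }

    // Keeps only the selected variables, in order (the SELECT clause).
    static List<String> project(Map<String, String> bindings, List<String> selected) {
        List<String> row = new ArrayList<>();
        for (String var : selected) {
            row.add(bindings.get(var));
        }
        return row;
    }

    public static void main(String[] args) {
        String[] pattern = {"?s", "?p", "?o"};
        String[] triple = {"http://example.org/foo", "rdf:type", "rdfs:Resource"};
        System.out.println(project(match(pattern, triple), Arrays.asList("?s")));
    }
}
```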

You can also use the SPARQL parser to obtain a query object:

QueryConsumerImpl qc = new QueryConsumerImpl();
String s = "SELECT ?s WHERE { ?s ?p ?o . }";
SparqlParser sp = new SparqlParser(new StringReader(s));

sp.parse(qc, new InternalOrigin("SparqlSelectTest"));
SelectQuery sq = qc.getSelectQueries().iterator().next();

Now register the query together with a sink that collects the results:

BindingConsumerCollection bc = new BindingConsumerCollection();
BindingConsumerSink sink = new BindingConsumerSink(bc);

program.registerSelectQuery(sq, sink);

Now you have prepared everything you need to evaluate the program.

Create an EvaluateProgram Object

EvaluateProgram represents one program run, or, in other words, the executable plan. You can pass the Program object to the constructor of EvaluateProgramGenerator, together with the appropriate configuration in an EvaluateProgramConfig object.

In the following we assume the default configuration.

EvaluateProgramConfig config = new EvaluateProgramConfig();
EvaluateProgram ep = new EvaluateProgramGenerator(program, config);

Evaluate the Program

Once you call start() on the EvaluateProgram object, the program will be evaluated and results for the registered queries will be streamed to the registered callbacks.

ep.start();

After calling start(), you may add triples (in the form of Binding objects) to the running program via consume() on the BindingConsumer object you get via getBaseConsumer().

With the following Java code, you add the triple:

<http://example.org/foo> rdf:type rdfs:Resource .

ep.getBaseConsumer().consume(new Binding(new Nodes(new Resource("http://example.org/foo"), RDF.TYPE, RDFS.RESOURCE)));

You may also add safe (GET) requests to the OriginConsumer you get via getInputOriginConsumer(), and unsafe (PUT, POST, DELETE) requests to the OriginConsumer you get via getOutputOriginConsumer().

To parse triples from a file, use an atomic request so that the parser can take care of parsing different RDF syntaxes.

ep.getInputOriginConsumer().consume(new RequestOrigin(new URI("http://harth.org/andreas/foaf.rdf"), Method.GET));
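Note that the java.net.URI constructor used in the request above declares the checked URISyntaxException, so the enclosing method has to catch or declare it. For a fixed, known-good URI, URI.create is a common alternative that throws the unchecked IllegalArgumentException instead. The sketch below shows both with JDK classes only; the class and method names are illustrative.

```java
import java.net.URI;
import java.net.URISyntaxException;

// JDK-only illustration of the two ways to build the request URI.
final class RequestUriSketch {

    // new URI(String) forces callers to handle URISyntaxException.
    static URI parseChecked(String s) throws URISyntaxException {
        return new URI(s);
    }

    // URI.create(String) wraps the same parsing and throws
    // IllegalArgumentException on malformed input.
    static URI parseUnchecked(String s) {
        return URI.create(s);
    }

    public static void main(String[] args) {
        URI uri = parseUnchecked("http://harth.org/andreas/foaf.rdf");
        System.out.println(uri.getHost() + " " + uri.getPath());
    }
}
```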

Once you have supplied all triples and requests, call awaitIdleAndFinish(). The call blocks until the program evaluation has finished.

ep.awaitIdleAndFinish();
ep.shutdown();

Invoking the shutdown() method closes streams and network connections.
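The call order (start, feed input, block until done, release resources) can be mimicked with a plain executor. The sketch below is an analogy under stated assumptions, not the engine's implementation: LifecycleSketch and its methods are hypothetical names that merely mirror the method names in the text, and the real idle detection is likely more involved, since processing input can trigger further work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in that mirrors the call order described in the text.
final class LifecycleSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> results = Collections.synchronizedList(new ArrayList<>());
    private volatile boolean started;

    void start() {
        started = true;
    }

    // Input is accepted only after start() and is processed asynchronously.
    void consume(String item) {
        if (!started) {
            throw new IllegalStateException("call start() first");
        }
        worker.execute(() -> { results.add("seen:" + item); });
    }

    // Stops accepting input and blocks until all queued work has been processed.
    void awaitIdleAndFinish() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("timed out waiting for worker");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Releases remaining resources; a no-op if the worker has already terminated.
    void shutdown() {
        worker.shutdownNow();
    }

    static List<String> demo() {
        LifecycleSketch ep = new LifecycleSketch();
        ep.start();
        ep.consume("a");
        ep.consume("b");
        ep.awaitIdleAndFinish();
        ep.shutdown();
        return new ArrayList<>(ep.results);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because awaitIdleAndFinish() returns only after the queued work is done, reading the results afterwards is safe without further synchronization in this sketch.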

That's all.

The Linked Data-Fu command-line utility provides access to all functionality you might need to run programs.

In the majority of cases, you do not need to use the Java API, and you should not use the Java API.

Use the command-line utility.


Andreas Harth, May 2015, December 2015.