Going virtual

Development for Pentaho Data Integration has been making good headway lately.  One of the cool new things we recently implemented is the ability to reference source files, transformations and jobs from any location you like.

The underlying library we use to do that is the Apache Commons Virtual File System (VFS).

Here is a simple example that you can try with the latest dev version:

sh kitchen.sh -file:http://www.kettle.be/GenerateRows.kjb

Let’s have a look at this job in Spoon.  To open it directly from the URL above follow this procedure:

Open file from URL

Type in the URL:

Selecting OK will load the job in Spoon:

The transformation we are about to launch is also located on the web server.  The internal variable that holds the directory of the job file is:

Internal.Job.Filename.Directory    http://www.kettle.be/

This allows us to reference the transformation as follows:
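
To make the idea concrete, here is a minimal Java sketch of how such a variable reference expands into a full URL. It is not Kettle's actual implementation, just plain string substitution. The class name `VariableSketch`, the `substitute` helper and the transformation file name `GenerateRows.ktr` are assumptions made up for illustration; only the variable name and its value come from the example above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: expands ${...} references the way a job entry
// could resolve a transformation path relative to the job's location.
public class VariableSketch {
    // Matches ${Some.Variable.Name}
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    static String substitute(String text, Map<String, String> variables) {
        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = variables.get(m.group(1));
            // Unknown variables are left untouched in the output.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> variables = new HashMap<>();
        // Value taken from the example in this post.
        variables.put("Internal.Job.Filename.Directory", "http://www.kettle.be/");

        // "GenerateRows.ktr" is a hypothetical transformation file name.
        String reference = "${Internal.Job.Filename.Directory}GenerateRows.ktr";
        System.out.println(substitute(reference, variables));
        // prints: http://www.kettle.be/GenerateRows.ktr
    }
}
```

Because the reference is relative to wherever the job was loaded from, the same job keeps working whether it sits on a local disk or on a web server.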

Please note that if you try this yourself, you’ll find that you can’t save the job back to the web server.  That is not because we don’t support that, but because you don’t have permission to do so.

Please have a quick look at the almost endless list of possibilities over here. These include direct loading from zip files, gz files, jar files, RAM drives, SMB, (s)ftp, http(s), etc.

We will extend this list even further in the near future with our own drivers for the Pentaho solutions repository and, later on, for the Kettle repository (something like psr:// and pdi:// URIs).

As cool examples go, here is one to end with:

Until next time,

Matt
