Catching up with Kettle REMIX

Dear Kettle and Neo4j friends,

Since I joined the Neo4j team in April I haven’t given you any updates despite the fact that a lot of activity has been taking place in both the Neo4j and Kettle realms.

First and foremost, you can grab the cool Neo4j plugins from neo4j.kettle.be (the plugin in the marketplace is always out of date since it takes weeks to update the metadata).

Then based on valuable feedback from community members we’ve updated the DataSet plugin (including unit testing) to include relative paths for filenames (for easier git support), to avoid modifying transformation metadata and to set custom variables or parameters.

Kettle unit testing ready for prime time

I’ve also created a plugin to debug transformations and jobs a bit easier.  You can do things like set specific logging levels on steps (or only for a few rows) and work with zoom levels.

Clicking right on a step you can choose “Logging…” and set logging specifics.

Then, back on the subject of Neo4j, I’ve created a plugin to log the execution results of transformations and jobs (and a bit of their metadata) to Neo4j.

Graph of a transformation executing a bunch of steps. Metadata on the left, logging nodes on the right.

Those working with Azure might enjoy the Event Hubs plugins for a bit of data streaming action in Kettle.

The Kettle Needful Things plugin aims to fix bugs and solve silly problems in Kettle.  For now it sets the correct local metastore on Carte servers AND… features a new launcher script called MaitreMaitre supports transformations and jobs, local, remote and clustered execution.

The Kettle Environment plugin aims to take a stab at life-cycle management by allowing you to define a list of Environments:

The Environments dialog shown at the start of Spoon

In each Environment you can set all sorts of metadata but also the location of the Kettle and MetaStore home folders.

Finally, because downloading, patching, installing and configuring all this is a lot of work, I’ve created an automated process which does this for you on a daily bases (for testing) and so you can download Kettle Community Edition version 8.1.0.0 patched to 8.1.0.4 with all the extra plugins above in its 1GB glory at : remix.kettle.be

To get it on your machine simply run:

wget remix.kettle.be -O remix.zip

You can also give these plugins (Except for Needful-things and Environment) a try live on my sandbox WebSpoon server.  You can easily run your own WebSpoon from the also daily updated docker container.

If you have suggestions, bugs, rants, please feel free to leave them here or in the respective github projects.  Any feedback is as always more than welcome.  In fact, thanks you all for the feedback given so far.  It’s making all the difference.  If you feel the need to contribute more opinions on the subjects of Kettle feel free to send me a mail (mattcasters at gmail dot com) to join our kettle-community Slack channel.

Enjoy!

Matt

One comment

  • Great to see you blogging again. I notice you make reference to the 1GB Kettle download. Kettle was my go-to tool for ad hoc data imports and migrations. I unfortunately never had the opportunity to deploy it as ETL tool in production. Imagine my surprise when I went to get the latest CE edition because I needed to import some files into a database and I was confronted with the 1GB download.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.