A new debugger for Kettle
We have been preparing Pentaho Data Integration for feature freeze in a few weeks, when we’ll release 3.0.0-RC1. However, before we do so, I wanted to write a (simple) debugger. It’s important to get at least the API in there so that we can continue to build on top of it in the 3.x update releases.
How does it work? Well, suppose you have a simple transformation like this one:
We just generate empty rows and add an id from 1 to 1000. Now we want to pause the transformation and see the content of the row where
Well, that is what we made possible. Simply click on the debug icon in the toolbar:
That will open up the debug dialog:
As you can see, we can specify a condition on which the transformation is paused. We can also ask it to keep the last N rows that passed before the condition was met in memory. Pressing OK and launching the transformation in the execution dialog will then show the requested rows:
As you can see, for your convenience, the order of the rows is reversed (most recent first). If you try this yourself, you will notice in the transformation log tab that the transformation you are debugging is paused. That means that you can now hit the resume button and the transformation will simply continue to run. If the condition is met again, the transformation will be paused again and another preview dialog is presented.
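To make the idea concrete, here is a minimal Java sketch of the mechanism described above: a bounded buffer that remembers the last N rows, a break condition that is tested on every row, and a newest-first snapshot each time the condition matches. This is not the actual Kettle API. The names RowBreakPointSketch, RowBuffer, offer, snapshot and run are hypothetical, and I am assuming that the row that triggers the condition counts as one of the N rows kept.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names come from Kettle.
final class RowBreakPointSketch {

    /** Remembers the most recent rows and tests each one against a break condition. */
    static final class RowBuffer<R> {
        private final int capacity;
        private final Predicate<R> condition;
        private final Deque<R> recent = new ArrayDeque<>();

        RowBuffer(int capacity, Predicate<R> condition) {
            if (capacity < 1) {
                throw new IllegalArgumentException("capacity must be at least 1");
            }
            this.capacity = capacity;
            this.condition = condition;
        }

        /** Stores the row and returns true when the break condition matches it. */
        boolean offer(R row) {
            if (recent.size() == capacity) {
                recent.removeLast();      // drop the oldest row
            }
            recent.addFirst(row);         // newest row sits at the head
            return condition.test(row);
        }

        /** Copy of the buffered rows, most recent first. */
        List<R> snapshot() {
            return new ArrayList<>(recent);
        }
    }

    /**
     * Feeds ids 1..rowCount through the buffer, like the example transformation.
     * Each match stands in for one pause; the loop carrying on stands in for resume.
     */
    static List<List<Long>> run(long rowCount, int keep, Predicate<Long> condition) {
        RowBuffer<Long> buffer = new RowBuffer<>(keep, condition);
        List<List<Long>> previews = new ArrayList<>();
        for (long id = 1; id <= rowCount; id++) {
            if (buffer.offer(id)) {
                previews.add(buffer.snapshot());   // what a preview dialog would show
            }
        }
        return previews;
    }

    public static void main(String[] args) {
        // Pause on every 400th id and keep the last 3 rows.
        for (List<Long> preview : run(1000, 3, id -> id % 400 == 0)) {
            System.out.println(preview);
        }
    }
}
```

Running main prints [400, 399, 398] and then [800, 799, 798]: two pauses, each showing the last three rows with the most recent one first. The per-row cost in this sketch is one predicate test and one deque update, which gives a feel for why a small buffer and a single condition add little overhead.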
The old-style preview has also been converted to the new pause/resume capabilities.
One interesting observation is that the performance hit while running in debugging or preview mode has been kept very low. The slowdown obviously depends on the number of conditions and the buffer sizes, but typically I think you will not experience any performance drop at all.
The Pentaho Data Integration development team and I really hope that these new capabilities will shorten the time you spend hunting down problems in complex transformations.
Until next time,