In the next milestone build of Pentaho Data Integration (2.4.1-M1) we will be introducing advanced error handling features. (2.4.1-M1 is expected around February 19th)
We looked hard to find the easiest and most flexible way to implement this, and I think we have found a good solution.
Here is an example:
The transformation above works as follows: it generates a sequence of values between -1000 and 1000. The target table is a MySQL table with a single “id” column defined as TINYINT. As you may know, that data type only accepts values between -128 and 127.
So what this transformation does is insert the 256 rows that fit (-128 through 127) into the table and divert all the others to a text file, our “error bucket”.
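To make the row counts concrete, here is a minimal Python sketch of the divert logic described above. This is an illustration, not Kettle code: the `split_rows` helper is hypothetical, but the TINYINT bounds and the -1000..1000 sequence come straight from the example.

```python
# Hypothetical sketch (not Kettle code) of the divert logic: rows whose "id"
# fits MySQL's TINYINT range go to the table, the rest go to the error bucket.

TINYINT_MIN, TINYINT_MAX = -128, 127  # MySQL TINYINT range


def split_rows(ids):
    """Split a sequence of ids into (insertable, diverted) lists."""
    inserted, diverted = [], []
    for value in ids:
        (inserted if TINYINT_MIN <= value <= TINYINT_MAX else diverted).append(value)
    return inserted, diverted


inserted, diverted = split_rows(range(-1000, 1001))
print(len(inserted))  # 256 rows fit in TINYINT
print(len(diverted))  # 1745 rows are diverted to the error bucket
```

Of the 2001 generated values, exactly 256 survive the TINYINT check, which matches the insert count mentioned above.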
How can we configure this new feature? Simple: right-click on the step where you want the error handling to take place, in this case the “Table output” step. If the step supports error handling, you will see this popup menu appear:
Selecting the highlighted option will present you with a dialog that allows you to configure the error handling:
As you can see, you can not only specify the target step to which you want to direct the rows that caused an error, you can also include extra information in the error rows so that you know exactly what went wrong. In this particular case, these are the extra fields that will appear in the error rows:
- nrErrors: 1
- errorDescription: be.ibridge.kettle.core.exception.KettleDatabaseException: Error inserting row: Data truncation: Out of range value adjusted for column ‘id’ at row 1
- errorField: empty, because we can’t retrieve that information from the JDBC driver yet.
- errorCode: TOP001 (placeholder, final value TBD)
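To show how these extra fields travel with a failed row, here is a small hypothetical Python sketch. The field names are taken from the list above; the `make_error_row` helper and the dictionary structure are illustrative assumptions, not Kettle’s internal representation.

```python
# Hypothetical sketch of enriching a failed row with the extra error fields
# listed above (field names from the post; structure is an illustration only).

def make_error_row(row, exception_text, error_field="", error_code="TOP001"):
    """Return the original row extended with the error-handling fields."""
    enriched = dict(row)
    enriched.update({
        "nrErrors": 1,
        "errorDescription": exception_text,
        "errorField": error_field,  # empty: not available from the JDBC driver yet
        "errorCode": error_code,    # placeholder code, final value TBD
    })
    return enriched


row = make_error_row(
    {"id": 500},
    "KettleDatabaseException: Error inserting row: "
    "Data truncation: Out of range value adjusted for column 'id' at row 1",
)
print(row["nrErrors"], row["errorCode"])  # 1 TOP001
```

The original row data is preserved, so the “error bucket” file still contains everything you need to reprocess the row later.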
At the moment we have only equipped the “Script Values” (easy to cause errors with) and “Table Output” steps with these new capabilities. However, in the coming weeks, more steps will follow suit.
Until next time,