Kettle / PDI

Over the last couple of years, Pentaho Data Integration (PDI), a.k.a. Kettle, has become one of the leading ETL tools. Here are a few useful or memorable links to things I wrote on my blog about Kettle…

Here are a few additional interesting links:

Here are a number of things I found interesting lately:

If you have other interesting Kettle/PDI related links, feel free to leave a comment.


  • yxskkk

    Hi Matt, first, thanks for replying to my thread on
    I want to talk to you. Can I?
    I'm Chinese and want to learn Kettle from you ^-^

  • Pingback: Open Source Metrics and Benchmarks « Gobán Saor

  • Virendra Rathore


    I've been using Kettle 2.2 until now, along with plugins developed by others.
    I've downloaded PDI 3.0.4, but most of the existing transforms don't work in PDI 3.
    Is there a way to use/access those plugins in PDI 3.0.4?
    How do I migrate those plugins to PDI 3?

    Please help, thanks.


  • Hi Matt,

    I have also set up a Kettle wiki for the Pentaho community in Indonesia. You can check it at



  • bambam

    hi matt,

    I've been trying to access our proprietary DB with Kettle, but we can't find it in the list of supported databases. Our vendor told us they are using a PICK database, but we don't know anything about it. Are you familiar with the DB used by TigerLogic Corporation? Their website is (

    thanks for having this useful website.

  • Hi Bambam,

    Unfortunately I've never heard of that database. That doesn't mean much, though: new databases pop up all the time, usually PostgreSQL or MySQL clones, but others as well.
    Ask your vendor for a JDBC or ODBC driver and we can work together to create the driver in Kettle. File a feature request at


  • YeXiangJie

    hi matt,
    I have translated parts of Kettle into Chinese, but I don't know how to submit this work to your organization's projects. Can you tell me how to do it?

  • Hi YeXiangJie,

    Create a JIRA case describing the improvements and contributions you did and we’ll make sure it finds the right place.

    Thanks in advance,


  • Joe-1

    Hello Guys,
    Can anyone tell me where I can find the release notes for PDI 4.2 stable? Thanks in advance.

  • YeXiangJie

    hi matt,
    Our project needs Kettle, and I have integrated it into our project. I use the code below to run a job, but I don't know how to stop it. Can you tell me how to stop this job?
    This is the code I use to run the job:
    LogWriter log = LogWriter.getInstance("KettleTest.log", true, LogWriter.LOG_LEVEL_DETAILED);
    UserInfo userInfo = new UserInfo();
    DatabaseMeta connection = new DatabaseMeta("", "Oracle", "Native", "", "orcl", "1521", "sspa", "sspa");
    RepositoryMeta repinfo = new RepositoryMeta();
    Repository rep = new Repository(log, repinfo, userInfo);
    RepositoryDirectory dir = new RepositoryDirectory(rep);
    StepLoader steploader = StepLoader.getInstance();
    JobMeta jobMeta = new JobMeta(log, rep, "ceshiJob", dir);
    Job job = new Job(log, steploader, rep, jobMeta);
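    For what it's worth, in the old Kettle API used above (where Job extends Thread), stopping a running job usually comes down to calling stopAll() on the Job instance from another thread. A minimal sketch, continuing from the snippet above and assuming the Kettle jars are on the classpath (not tested against this specific setup):

    ```java
    // Continuing from the snippet above (old Kettle 2.x/3.x API).
    job.start();                // run the job asynchronously (Job extends Thread here)

    // ... later, e.g. from a UI handler or a timeout watchdog thread:
    job.stopAll();              // signal all running job entries/transformations to stop
    job.waitUntilFinished();    // block until the job has actually wound down
    ```

    Note that stopAll() only requests a stop; long-running steps may take a moment to notice the flag, which is why the waitUntilFinished() call follows it.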

  • YeXiangJie

    hi matt,
    I don't know how to stop Kettle jobs using the Kettle API. Could you tell me how to do this?

  • Hari

    Hi Matt,

    I have read the Pentaho Kettle Solutions book and it was awesome. Is there any chance a new edition will be released?



  • Sanjeev Menon


    I am looking for resources or samples to guide me in running a PDI transformation from within a C# page. I appreciate any help.


  • Kyungin Kim

    Hi, Matt

    I want to add my custom Kettle plugin to PDI CE 5,
    but I can't find a sample.

    I have built PDI 5 CE (Kettle) in Eclipse.
    I am looking for a guide to configuring a custom plugin in the Kettle project in Eclipse.

    Could you help me?

  • Koushik

    Hi Matt

    I have been looking into your responses and they were really helpful and solved a lot of my Pentaho issues. Recently, I have run into a problem using a big data plugin step, Hadoop File Output, in my ktr. The ktr runs fine when I run it from Spoon, but when I try to run it from my Java code, I get the following missing plugin error.

    2015/11/18 13:47:23 – Property_Validation – ERROR (version, build 1 from 2015-01-20_19-50-27 by buildguy) : org.pentaho.di.core.exception.KettleException:
    2015/11/18 13:47:23 – Property_Validation – Unexpected error during transformation metadata load
    2015/11/18 13:47:23 – Property_Validation –
    2015/11/18 13:47:23 – Property_Validation – Missing plugins found while loading a transformation
    2015/11/18 13:47:23 – Property_Validation –
    2015/11/18 13:47:23 – Property_Validation – Step : HadoopFileOutputPlugin
    2015/11/18 13:47:23 – Property_Validation –
    2015/11/18 13:47:23 – Property_Validation – at org.pentaho.di.job.entries.trans.JobEntryTrans.getTransMeta(
    2015/11/18 13:47:23 – Property_Validation – at org.pentaho.di.job.entries.trans.JobEntryTrans.execute(
    2015/11/18 13:47:23 – Property_Validation – at org.pentaho.di.job.Job.execute(
    2015/11/18 13:47:23 – Property_Validation – at org.pentaho.di.job.Job.access$000(
    2015/11/18 13:47:23 – Property_Validation – at org.pentaho.di.job.Job$
    2015/11/18 13:47:23 – Property_Validation – at
    2015/11/18 13:47:23 – Property_Validation – Caused by: org.pentaho.di.core.exception.KettleMissingPluginsException:
    2015/11/18 13:47:23 – Property_Validation – Missing plugins found while loading a transformation
    2015/11/18 13:47:23 – Property_Validation –

    I have already added the big data plugin dependency below to my pom.xml file, but I still get the above error. Do I need to register this plugin in the Java code? If so, how do I do it?

    Some suggestions from forums were to copy the plugins folder into my current Java working directory and set System.setProperty("KETTLE_PLUGIN_BASE_FOLDERS", "<current working directory path>"), but I still get the same error.

    I am using the following Java code to trigger the job. It works fine without the Hadoop input, but once I add the Hadoop input to my ktr, I get the above error.


    Please give us any suggestions or tips to get past this issue.
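    In case it helps anyone with the same missing-plugin error: the KETTLE_PLUGIN_BASE_FOLDERS property generally has to be set before Kettle is initialized, otherwise the plugin registry never scans that folder. A minimal sketch, assuming PDI 5.x-style initialization via KettleEnvironment (the plugin and ktr paths are placeholders, and this is untested against the environment above):

    ```java
    // Sketch only: register the big-data plugin folder *before* initializing Kettle.
    System.setProperty(Const.KETTLE_PLUGIN_BASE_FOLDERS,
            "/opt/data-integration/plugins");       // placeholder path to a PDI install
    KettleEnvironment.init();   // scans the plugin folders and registers step plugins

    TransMeta transMeta = new TransMeta("/path/to/your.ktr");   // placeholder ktr
    Trans trans = new Trans(transMeta);
    trans.execute(null);
    trans.waitUntilFinished();
    ```

    Pointing the property at the plugins folder of a full PDI installation (rather than copying folders into the working directory) is usually the more reliable of the two suggestions.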


  • ???

    Hello, I don't know how to stop Kettle jobs using the Kettle API. Could you tell me how to do this?

  • Shankey

    Can PDI / Kettle copy data from one cluster to another? If yes, can you please explain how?
