May 6th 2010 08:21 pm
Book Review : Pentaho 3.2 Data Integration
Dear Kettle fans,
A few weeks ago, when I was stuck in the US after the MySQL User Conference, a new book was published by Packt Publishing.
That by itself is not too remarkable. However, this time it’s a book about my brainchild Kettle, which makes it very special to me. The full title is Pentaho 3.2 Data Integration : Beginner’s Guide (Amazon, Packt). The title explains the purpose of the book: to give the reader a quick start with Pentaho Data Integration (Kettle).

The author María Carina Roldán (blog, twitter) is a seasoned BI consultant and a valued member of the Kettle community. Besides her frequent appearances on our forum, she is appreciated by many for the time she spent on the Kettle Tutorial.
I’m not going to go over the detailed table of contents. Since I wrote the foreword of the book, I’m sure you’ll agree I’m somewhat biased. However, in all objectivity, the book covers what it claims to cover: it does help the PDI/Kettle beginner tremendously. It covers all you need to get started and then some: the installation of PDI, the typical “Hello World” setup of PDI, reading text files, calculating, scripting, databases, repositories, etc. As the title indicates, this book covers the current 3.2 stable release of Kettle, not the upcoming 4.0 release. However, as far as 99% of the topics covered are concerned, that shouldn’t make too much of a difference.
So obviously I can recommend this book very much. It’s a time-saver for those who are starting with PDI. For those who have dabbled with Kettle before, I must say that María packed the book with nice tips and tricks, so I’m sure you’ll be able to learn a thing or two.
Until next time,
Matt
6 Comments »



TomS on 12 May 2010 at 16:47 #
Hi Matt,
I have just finished reading this book and would like to give you a short, personal impression:
I already had gathered some experience using PDI (doing some rather small database synchronizations
along with pulling data from LDAP servers).
The first chapters didn’t contain much new information for me; nevertheless, they provided some more
details which I didn’t come across during “learning by using PDI”.
I especially like the chapter about the “Modified JavaScript” Step, which is one of the most powerful steps IMHO.
The last few chapters about the data warehouse stuff and the combination of the jobs/transformations
(especially the sub-transformation stuff) were really interesting!
All in all: Great work, I highly recommend this book to anyone who works with PDI (be it a new user or
someone who already has gathered some experience).
Cheers,
Tom
winson on 24 Jul 2010 at 14:25 #
Matt, very good, looks very nice
robin chatterjee on 09 Aug 2010 at 8:50 #
Hi Matt,
We are currently working on a project for a client using Kettle 3.2 and now 4.0. I got the book for the new developers we will be onboarding. It does serve as a decent introduction to Kettle, but we are now hankering for some more advanced books, especially to do with plugin development and User Defined Java Classes. Is there any news of that kind of book coming out any time soon (are you writing one, for example)?
We would appreciate any pointers to books on Kettle, as I have only found mention of 3 and one is not yet out (hint hint!!)
Thanks
Matt Casters on 09 Aug 2010 at 10:42 #
Hi Robin,
Around the end of September Wiley will be releasing another book written by yours truly, Roland Bouman and Jos Van Dongen:
http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0470635177.html
http://www.amazon.com/Pentaho-Kettle-Solutions-Building-Integration/dp/0470635177
It will cover the advanced topics you are missing right now like extending Kettle (writing plugins), integration of Kettle, User Defined Java Class, performance tuning, and so on.
All the best,
Matt