<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Pentaho Kettle Solutions Overview</title>
	<atom:link href="http://www.ibridge.be/?feed=rss2&#038;p=189" rel="self" type="application/rss+xml" />
	<link>http://www.ibridge.be/?p=189</link>
	<description>Venting steam after a long day of writing code...</description>
	<pubDate>Sat, 25 May 2013 21:25:23 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: Matt Casters</title>
		<link>http://www.ibridge.be/?p=189#comment-41654</link>
		<dc:creator>Matt Casters</dc:creator>
		<pubDate>Mon, 12 Sep 2011 14:30:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-41654</guid>
		<description>NDash, try again on the forum.  Thank you for your understanding.</description>
		<content:encoded><![CDATA[<p>NDash, try again on the forum.  Thank you for your understanding.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ndash</title>
		<link>http://www.ibridge.be/?p=189#comment-41653</link>
		<dc:creator>Ndash</dc:creator>
		<pubDate>Mon, 12 Sep 2011 13:58:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-41653</guid>
		<description>Matt,

Thanks for the excellent book. Though I'm a newbie to this tool, I managed to learn a lot from your book as it was very lucidly written.

I'm currently working on CDC, which you describe in chapter 6. While I'm able to carry out everything mentioned in the book, I got stuck on the following.

1. After Merge Rows (Diff), how do we get the "Column Names" that were changed/altered, and their values?

Apologies for posting here, as I can't access the forum.

Thanks a zillion for your help.

Best Regards,
NDash</description>
		<content:encoded><![CDATA[<p>Matt,</p>
<p>Thanks for the excellent book. Though I&#8217;m a newbie to this tool, I managed to learn a lot from your book as it was very lucidly written.</p>
<p>I&#8217;m currently working on CDC, which you describe in chapter 6. While I&#8217;m able to carry out everything mentioned in the book, I got stuck on the following.</p>
<p>1. After Merge Rows (Diff), how do we get the &#8220;Column Names&#8221; that were changed/altered, and their values?</p>
<p>Apologies for posting here, as I can&#8217;t access the forum.</p>
<p>Thanks a zillion for your help.</p>
<p>Best Regards,<br />
NDash</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sunil George</title>
		<link>http://www.ibridge.be/?p=189#comment-40185</link>
		<dc:creator>Sunil George</dc:creator>
		<pubDate>Wed, 01 Jun 2011 14:53:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-40185</guid>
		<description>Dear Matt,

Thank you so much for the guidance. I have posted my queries in the forum as you suggested. We do have an Enterprise Edition, so we will post the same to the Pentaho support team as well.

I would appreciate your attention to these queries as well, if possible. Even a quick glance would be helpful for us. I am sharing the forum post links here:
http://forums.pentaho.com/showthread.php?82354-PDI-Clustering
http://forums.pentaho.com/showthread.php?82355-Generic-Transformation-for-loading-all-one-to-one-table-loading
http://forums.pentaho.com/showthread.php?82356-Parallel-execution-of-looped-Transformation-in-a-PDI-Job
http://forums.pentaho.com/showthread.php?82358-Pentaho-Enterprise-Repository

By the way, I have found printing mistakes in quite a few places in the Kettle Solutions book. If you want, I can share the page numbers with you so that they can be corrected for the next printing or edition. Please let me know.

Many thanks in advance.

Regards,
Sunil George.</description>
		<content:encoded><![CDATA[<p>Dear Matt,</p>
<p>Thank you so much for the guidance. I have posted my queries in the forum as you suggested. We do have an Enterprise Edition, so we will post the same to the Pentaho support team as well.</p>
<p>I would appreciate your attention to these queries as well, if possible. Even a quick glance would be helpful for us. I am sharing the forum post links here:<br />
<a href="http://forums.pentaho.com/showthread.php?82354-PDI-Clustering" rel="nofollow">http://forums.pentaho.com/showthread.php?82354-PDI-Clustering</a><br />
<a href="http://forums.pentaho.com/showthread.php?82355-Generic-Transformation-for-loading-all-one-to-one-table-loading" rel="nofollow">http://forums.pentaho.com/showthread.php?82355-Generic-Transformation-for-loading-all-one-to-one-table-loading</a><br />
<a href="http://forums.pentaho.com/showthread.php?82356-Parallel-execution-of-looped-Transformation-in-a-PDI-Job" rel="nofollow">http://forums.pentaho.com/showthread.php?82356-Parallel-execution-of-looped-Transformation-in-a-PDI-Job</a><br />
<a href="http://forums.pentaho.com/showthread.php?82358-Pentaho-Enterprise-Repository" rel="nofollow">http://forums.pentaho.com/showthread.php?82358-Pentaho-Enterprise-Repository</a></p>
<p>By the way, I have found printing mistakes in quite a few places in the Kettle Solutions book. If you want, I can share the page numbers with you so that they can be corrected for the next printing or edition. Please let me know.</p>
<p>Many thanks in advance.</p>
<p>Regards,<br />
Sunil George.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Casters</title>
		<link>http://www.ibridge.be/?p=189#comment-40147</link>
		<dc:creator>Matt Casters</dc:creator>
		<pubDate>Sun, 29 May 2011 17:57:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-40147</guid>
		<description>Hi Sunil,

It's great to hear that you managed to buy our book, thanks for your support.
As far as free Kettle support for your questions is concerned, it's best that you try the Kettle forums:

http://forums.pentaho.com/forumdisplay.php?135

Best of luck with your projects!

Matt</description>
		<content:encoded><![CDATA[<p>Hi Sunil,</p>
<p>It&#8217;s great to hear that you managed to buy our book, thanks for your support.<br />
As far as free Kettle support for your questions is concerned, it&#8217;s best that you try the Kettle forums:</p>
<p><a href="http://forums.pentaho.com/forumdisplay.php?135" rel="nofollow">http://forums.pentaho.com/forumdisplay.php?135</a></p>
<p>Best of luck with your projects!</p>
<p>Matt</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sunil George</title>
		<link>http://www.ibridge.be/?p=189#comment-40146</link>
		<dc:creator>Sunil George</dc:creator>
		<pubDate>Sun, 29 May 2011 15:27:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-40146</guid>
		<description>Dear Matt,

I managed to get the book shipped from the Singapore Wiley shop. The book is nicely written and it covers many of the topics that Pentaho ETL beginners need. I have a few queries for you, and I would really appreciate your comments and suggestions on them.

1. Where can we find more documentation on the Pentaho Enterprise Repository? Is it the best option for a real project where multiple people work on the same project? I heard that it also manages referential integrity. So if we change the name of a transformation that is referenced by a job, will the repository handle this referential integrity automatically? Do people have access to the repository file path, or is it managed only by the Data Integration server? For lineage analysis in a transformation, is it possible for a File input step to connect to the Enterprise Repository and read the .ktr files, as we do with a normal file system repository path? Or is exporting the transformations from the repository to XML the only option?

2. We have a requirement to create a txt file output by extracting data from a number of tables. Each transformation has just 3 steps: a Table input step, a Javascript step to remove all newline and carriage return characters present in any field, and a Text file output step. With 16 tables, we could do this as 16 transformations. The source SQL query is not just a Select * type, as it needs some Case and type cast changes due to differences in data types between source and target. What we have settled on is one generic transformation instead of 16. We created a transformation T1 which queries the source db for the supplied table name and retrieves the table's column names and data types. The Select fields are then built as a string in T1, with the required case and type casts applied, and passed to T2 as a variable. T2 gets this variable and uses it in the SQL query of the Table input step. We have a Javascript step where I have managed to do the cleaning with the help of the getInputRowmeta() method. I created a for loop that runs the cleaning for each string-type field, and finally the existing value is modified like row[i]= ....; I wanted to know whether this is a correct approach, since we are not specifying the input field names in the Javascript step and not using var and the Get output Variables options inside it, because we need to generalize it.

3. Finally, for the above project I can have a master job with a transformation T0 that reads a SOR table containing the 16 source table names. So the output of T0 is 16 rows, which we store with the Copy to Result step. Then we connect this to a sub job which has T1 and T2. The sub job will be set to run for each row output by T0. I hope this will do the job for us. My question is: does this job loop over the 16 rows sequentially, or is it possible for me to parallelise it? The sub job will receive a table name, create the SQL, and output the txt file. Since a single transformation is going to run for all 16 table extractions, how can I parallelise it?

Many thanks in advance. Looking forward to your valuable comments.

Regards,
Sunil George</description>
		<content:encoded><![CDATA[<p>Dear Matt,</p>
<p>I managed to get the book shipped from the Singapore Wiley shop. The book is nicely written and it covers many of the topics that Pentaho ETL beginners need. I have a few queries for you, and I would really appreciate your comments and suggestions on them.</p>
<p>1. Where can we find more documentation on the Pentaho Enterprise Repository? Is it the best option for a real project where multiple people work on the same project? I heard that it also manages referential integrity. So if we change the name of a transformation that is referenced by a job, will the repository handle this referential integrity automatically? Do people have access to the repository file path, or is it managed only by the Data Integration server? For lineage analysis in a transformation, is it possible for a File input step to connect to the Enterprise Repository and read the .ktr files, as we do with a normal file system repository path? Or is exporting the transformations from the repository to XML the only option?</p>
<p>2. We have a requirement to create a txt file output by extracting data from a number of tables. Each transformation has just 3 steps: a Table input step, a Javascript step to remove all newline and carriage return characters present in any field, and a Text file output step. With 16 tables, we could do this as 16 transformations. The source SQL query is not just a Select * type, as it needs some Case and type cast changes due to differences in data types between source and target. What we have settled on is one generic transformation instead of 16. We created a transformation T1 which queries the source db for the supplied table name and retrieves the table&#8217;s column names and data types. The Select fields are then built as a string in T1, with the required case and type casts applied, and passed to T2 as a variable. T2 gets this variable and uses it in the SQL query of the Table input step. We have a Javascript step where I have managed to do the cleaning with the help of the getInputRowmeta() method. I created a for loop that runs the cleaning for each string-type field, and finally the existing value is modified like row[i]= &#8230;.; I wanted to know whether this is a correct approach, since we are not specifying the input field names in the Javascript step and not using var and the Get output Variables options inside it, because we need to generalize it.</p>
<p>3. Finally, for the above project I can have a master job with a transformation T0 that reads a SOR table containing the 16 source table names. So the output of T0 is 16 rows, which we store with the Copy to Result step. Then we connect this to a sub job which has T1 and T2. The sub job will be set to run for each row output by T0. I hope this will do the job for us. My question is: does this job loop over the 16 rows sequentially, or is it possible for me to parallelise it? The sub job will receive a table name, create the SQL, and output the txt file. Since a single transformation is going to run for all 16 table extractions, how can I parallelise it?</p>
<p>Many thanks in advance. Looking forward to your valuable comments.</p>
<p>Regards,<br />
Sunil George</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lazer Epilasyon Adana</title>
		<link>http://www.ibridge.be/?p=189#comment-38373</link>
		<dc:creator>Lazer Epilasyon Adana</dc:creator>
		<pubDate>Fri, 18 Feb 2011 13:44:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-38373</guid>
		<description>I mean if we translate the mantle:org/pentaho/mantle/public/messages/messages and save it, we also want to put it directly in the corresponding directory in the server so we can use it right away (and of course share it with the community.)</description>
		<content:encoded><![CDATA[<p>I mean if we translate the mantle:org/pentaho/mantle/public/messages/messages and save it, we also want to put it directly in the corresponding directory in the server so we can use it right away (and of course share it with the community.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Casters</title>
		<link>http://www.ibridge.be/?p=189#comment-37761</link>
		<dc:creator>Matt Casters</dc:creator>
		<pubDate>Wed, 12 Jan 2011 08:20:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-37761</guid>
		<description>Hi Sunil,

Questions regarding PDI itself are best placed on the Kettle forum.  As a short reply: I'm working on providing better support for restartability in version 4.3.  The logic you mention is precisely what we had in mind, since we have log tables for job entries now.

Regards,

Matt</description>
		<content:encoded><![CDATA[<p>Hi Sunil,</p>
<p>Questions regarding PDI itself are best placed on the Kettle forum.  As a short reply: I&#8217;m working on providing better support for restartability in version 4.3.  The logic you mention is precisely what we had in mind, since we have log tables for job entries now.</p>
<p>Regards,</p>
<p>Matt</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sunil George</title>
		<link>http://www.ibridge.be/?p=189#comment-37760</link>
		<dc:creator>Sunil George</dc:creator>
		<pubDate>Wed, 12 Jan 2011 07:48:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-37760</guid>
		<description>Thank Matt for your quick reply. I have contacted Wiley Bangalore(India) sales person and he gave me a contact person near to my place. They told me that they have ordered the book from their Singapore store. Need to wait for one month for the shipment I guess. I had gone through the PDI 3.2 Beginner's guide earlier and I am very much interested in knowing and learning more about Kettle and I am sure that the new Book will have enough practical examples related to DW for us.

Does the book cover how to design a dimensional model from an ER model and how Kettle steps can be used to create a transformation that loads the fact and dimension tables?

Someone recently asked me a question about Kettle: "If we have a Job that consists of a lot of transformation calls and due to some reason one of the transformations got aborted. Now we need to re-run the job and we need to start from the aborted transformation this time, skipping the ones that were executed successfully. How will you do it in PDI?". My reply was that we would have to log the details after each successful execution of a transformation, and this log would have to be read first before deciding whether a given transformation within the job has to be executed or not. Is this correct?

What I meant is that we will create a folder that holds a log file (say, named after the transformation) after each successful execution of a transformation. This happens for all the transformations. After the last transformation executes successfully, we clear the log files in this folder. If any transformation aborts in between and we have to restart the job, then before each transformation is called we put a condition in the job to check whether a log file named after that transformation exists in the log folder. If it exists, we skip the execution of that transformation and move on to the next, and so on. Is my logic correct, or do we have any other straightforward option for this?


Regards,
Sunil George.</description>
		<content:encoded><![CDATA[<p>Thank Matt for your quick reply. I have contacted Wiley Bangalore(India) sales person and he gave me a contact person near to my place. They told me that they have ordered the book from their Singapore store. Need to wait for one month for the shipment I guess. I had gone through the PDI 3.2 Beginner&#8217;s guide earlier and I am very much interested in knowing and learning more about Kettle and I am sure that the new Book will have enough practical examples related to DW for us.</p>
<p>Does the book cover how to design a dimensional model from an ER model and how Kettle steps can be used to create a transformation that loads the fact and dimension tables?</p>
<p>Someone recently asked me a question about Kettle: &#8220;If we have a Job that consists of a lot of transformation calls and due to some reason one of the transformations got aborted. Now we need to re-run the job and we need to start from the aborted transformation this time, skipping the ones that were executed successfully. How will you do it in PDI?&#8221;. My reply was that we would have to log the details after each successful execution of a transformation, and this log would have to be read first before deciding whether a given transformation within the job has to be executed or not. Is this correct?</p>
<p>What I meant is that we will create a folder that holds a log file (say, named after the transformation) after each successful execution of a transformation. This happens for all the transformations. After the last transformation executes successfully, we clear the log files in this folder. If any transformation aborts in between and we have to restart the job, then before each transformation is called we put a condition in the job to check whether a log file named after that transformation exists in the log folder. If it exists, we skip the execution of that transformation and move on to the next, and so on. Is my logic correct, or do we have any other straightforward option for this?</p>
<p>Regards,<br />
Sunil George.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Casters</title>
		<link>http://www.ibridge.be/?p=189#comment-37745</link>
		<dc:creator>Matt Casters</dc:creator>
		<pubDate>Tue, 11 Jan 2011 13:43:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-37745</guid>
		<description>I'm sorry Sunil but I have no idea about shipping to India.  I would suggest to try and order the book at Amazon or any of the other online book stores.
A good technical book store (if there are any in your neighbourhood) should be able to get you Pentaho Kettle Solutions too.
Good luck,
Matt</description>
		<content:encoded><![CDATA[<p>I&#8217;m sorry, Sunil, but I have no idea about shipping to India.  I would suggest trying to order the book at Amazon or any of the other online book stores.<br />
A good technical book store (if there are any in your neighbourhood) should be able to get you Pentaho Kettle Solutions too.<br />
Good luck,<br />
Matt</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sunil George</title>
		<link>http://www.ibridge.be/?p=189#comment-37740</link>
		<dc:creator>Sunil George</dc:creator>
		<pubDate>Tue, 11 Jan 2011 09:01:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.ibridge.be/?p=189#comment-37740</guid>
		<description>Dear Matt,

I am looking forward to buying this book, but it is not available in India at the moment. I searched for it on http://wileyindia.com/ but could not find it. I can see the book on the Wiley international edition site. Is there any chance of getting this book published in India?

Regards,
Sunil George.</description>
		<content:encoded><![CDATA[<p>Dear Matt,</p>
<p>I am looking forward to buying this book, but it is not available in India at the moment. I searched for it on <a href="http://wileyindia.com/" rel="nofollow">http://wileyindia.com/</a> but could not find it. I can see the book on the Wiley international edition site. Is there any chance of getting this book published in India?</p>
<p>Regards,<br />
Sunil George.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
