Archive for September, 2007

September 27th 2007

Help OpenMRS!!!

My friend and colleague Julian Hyde of Mondrian fame just blogged about this: please help out the OpenMRS project!

The folks behind OpenMRS are helping to improve health-care systems in developing countries. More specifically, they are fighting AIDS with this software. OpenMRS has certainly proven to be up to the task at hand: it is currently tracking the medical conditions of over a million people in 12 countries.

Because of the exponential growth in users, this project is in urgent need of BI manpower. Julian and I have both agreed to help out with strategic advice for the BI part of OpenMRS.

If you want to be part of the team and you know a bit about the Pentaho Platform, Pentaho Data Integration (Kettle), Pentaho Analysis (Mondrian), MySQL and/or reporting, let us know, the sooner the better, and help us make the world a better place. Maybe you are a student or someone with limited income, and maybe you are looking for an opportunity to brush up your BI skills. In that case the OpenMRS project can also offer you some money to help you out for a few months.

Please help Julian and me spread the word about this fantastic win-win opportunity.

Come work with us to help OpenMRS create a BI solution that saves lives.

Thank you in advance,

Matt


September 26th 2007

Double-click fun

If you are running Microsoft Windows as your operating system, feel free to use the new and improved Pentaho Data Integration installer for version 2.5.1GA.

Matt


September 19th 2007

A new debugger for Kettle

We have been preparing Pentaho Data Integration to go into feature freeze in a few weeks, when we’ll release 3.0.0-RC1. However, before we do so, I wanted to write a (simple) debugger. It’s important to get at least the API in there so that we can continue to build on top of it in the 3.x update releases.

How does it work? Well, suppose you have a simple transformation like this one:

A simple transformation

We just generate empty rows and add an id from 1 to 1000. Now we want to pause the transformation and see the content of the row where

id=387

Well, that is what we made possible. Simply click on the debug icon in the toolbar:

Debug icon in the toolbar

That will open up the debug dialog:

The new debug window

As you can see, we can specify a condition on which the transformation is paused. We can also specify how many of the rows that came before the condition was met (the last N rows) should be kept in memory. Pressing OK and launching the transformation in the execution dialog will then show the requested rows:

Previewing rows

As you can see, for your convenience, the order of the rows is reversed (most recent first). If you try this yourself, you will note in the transformation log tab that the transformation you are debugging is paused. That means that you can now hit the resume button and the transformation will simply continue to run. If the condition is met again, the transformation will be paused again and another preview dialog is presented.
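To make the pause-on-condition idea concrete, here is a minimal Python sketch of the behaviour described above. It is not Kettle's API or implementation: the function name `debug_rows` and its parameters are hypothetical, rows are modelled as plain dictionaries, and the sketch assumes the preview contains the matching row plus up to N rows that preceded it. A generator stands in for pause and resume: it suspends at each match and continues when the caller asks for the next preview.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, Iterator, List

Row = Dict[str, int]


def debug_rows(
    rows: Iterable[Row],
    condition: Callable[[Row], bool],
    keep_last: int,
) -> Iterator[List[Row]]:
    """Yield a preview every time `condition` matches a row.

    Hypothetical helper, not part of Kettle. Each preview holds the
    matching row plus up to `keep_last` rows seen before it, most
    recent first. The generator suspends at `yield` ("pause") and
    continues when the caller requests the next preview ("resume").
    """
    # Bounded buffer: older rows fall off automatically.
    buffer: Deque[Row] = deque(maxlen=keep_last + 1)
    for row in rows:
        buffer.append(row)
        if condition(row):
            yield list(reversed(buffer))


# Generate rows with an id from 1 to 1000, as in the example transformation.
rows = ({"id": i} for i in range(1, 1001))
previews = debug_rows(rows, lambda r: r["id"] == 387, keep_last=3)

first = next(previews)            # runs until id=387, then "pauses"
print([r["id"] for r in first])   # [387, 386, 385, 384]
```

Because the buffer is bounded and the condition is checked once per row, the extra work per row stays small in this sketch; it grows with the number of conditions and the buffer size.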

The old-style preview has also been converted to the new pause/resume capabilities.

One interesting observation is that the performance hit while running in debugging or preview mode has been kept very low. The slowdown obviously depends on the number of conditions and the buffer sizes, but typically I think you will not experience any performance drop at all.

The Pentaho Data Integration development team and I really hope that these new capabilities will shorten the time you need to hunt down problems in complex transformations.

Until next time,

Matt


September 12th 2007

Back up a little…

Working with the Linux distribution Kubuntu full time for the last 4 months has been a pleasure compared to the years I worked with Windows (2000/XP). To pick one example of something I always struggled with on Windows: Backup & Recovery.

Because I’m doing all sorts of things on my Linux laptop, I also have a LOT of files hanging around: test files, transformations, jobs, test code, branches, images, screen-caps, screenshots, 50k+ e-mails, the list goes on.

Originally I thought about scheduling a big “cp -pR” to an external hard disk, but I was pleasantly surprised that Kubuntu ships with Keep by default.

Keep backup

Using this program I have been synchronizing my home folder (/home/matt) to an external hard disk. It works using rsync and starts automatically around 23:00 every 2 days.

However… what about the software, the configuration files, the fixes, the updates and in general all the customizations that I did to my system? What about those? Well, I came across a piece of software called Mondo Rescue.

Mondo backup

Mondo Rescue can be installed on your Ubuntu system using the Adept software installer or with the command:

sudo apt-get install mondo

What Mondo does is create a backup of your complete system. Basically it puts all the information from your hard disk onto CD/DVD and even makes the first CD or DVD bootable. That way I can re-create my laptop from bare metal in case of a HD crash, theft, upgrade problems or regular plain old stupidity.

The mondoarchive command allows you to specify where you want to back up the data to (HD, CD, DVD) and which directories you want to exclude.

Be aware that a mondoarchive backup will take many hours to run, so you typically fire it up during the night using cron. mondoarchive can be run from the command line as well. The backup can be written directly to .iso files so that you can burn those to CD/DVD the next morning for even greater peace of mind. Together with Keep (or rsync for the fans) you can make sure your system is safe from harm.

Until next time,

Matt


September 7th 2007

Back to basics

A few days ago someone made the comment that Pentaho Data Integration (Kettle) was a bit too hard to use. The person on the chat was trying to load a text file into a database table and was having a hard time doing just that.

So let’s go back to basics in this blog post and load a delimited text file into a MySQL table.

If you want to see how it’s done, click on this link to watch a real-time (unedited) Flash movie. It’s an 11MB download and is about 2-3 minutes long.

Load customers flash demo
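For readers who prefer code to a screencast, here is a rough Python sketch of what such a load amounts to: read the delimited file, map each line to columns, and insert the rows into a table. It does not reproduce the movie or Kettle's steps. Everything specific in it is an assumption made for illustration: the `customers` table and its columns, the `;` delimiter, the sample data, and the use of Python's built-in `sqlite3` module as a stand-in for MySQL so the example runs without a database server.

```python
import csv
import io
import sqlite3


def load_customers(text_file, connection, delimiter=";"):
    """Load a delimited text file with a header row into a table.

    Hypothetical helper: the `customers` table and its columns
    (id, name, city) are assumptions for this sketch.
    Returns the number of rows inserted.
    """
    reader = csv.DictReader(text_file, delimiter=delimiter)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    # Convert each parsed line into a tuple matching the table columns.
    rows = [(int(r["id"]), r["name"], r["city"]) for r in reader]
    with connection:  # commits on success, rolls back on error
        connection.executemany(
            "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", rows
        )
    return len(rows)


# Made-up sample data standing in for the text file.
sample = io.StringIO("id;name;city\n1;Alice;Ghent\n2;Bob;Orlando\n")
conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
loaded = load_customers(sample, conn)
print(loaded)  # 2
```

Against a real MySQL server the same shape applies, but the connection object and the parameter placeholder syntax depend on the driver you pick.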

Until next time!

Matt

