7 comments

  • Pingback: Twitter Trackbacks for Matt Casters on Data Integration » Parse nasty XLS with dynamic ETL [ibridge.be] on Topsy.com

  • Pingback: Write ETL that writes ETL – Creating Crosstabs with Kettle | Adventures with Open Source BI

  • Nathaniel

    This is excellent. I’m trying it on a slightly different source spreadsheet – it’s identical except that it has a 1-cell border of nothing.

    So, all the data/headings start on column 1 (0 index). No matter what I do, I can’t get this to start at Row 1 and Column 1. It always comes out blank.

    Any hints?

  • In the Sheets tab you can simply leave the sheet name blank and put in 1/1 for start column and row. That will make all sheets start at 1/1.

  • Stefan

    Excellent – was looking for this and here it’s all described in great detail! Matt, thanks a lot.

    No provisions are made for this to run as a transaction, most obviously on the file layout.txt, but ETL Metadata Injection also does not seem to be safe if multiple instances run in parallel.

    Not that I’d need that – at least not now. But it makes me wonder whether this is a general concern for ETL or whether people generally assume that a ‘once-a-day’ execution is safe enough…?

    Stefan

  • I don’t see why it wouldn’t support database transactional transformations as templates or why it wouldn’t run in parallel. If there are specific requests or bugs, feel free to file improvement ideas in JIRA (http://jira.pentaho.com).
    Thanks in advance,

    Matt

  • Bolaka

    Indeed, a very useful step. I need it for the Knowledge Flow plug-in, though. I need to model different sets of inputs in a job using the Knowledge Flow plug-in, so I need to inject the metadata into the Fields/Sampling tab dynamically.

    Any help on this would be greatly appreciated.