May 4th 2007 02:51 am
Serving SQL
How do you like your SQL served? Personally I like this type:
SELECT dc.state
, dd.year
, dd.month
, sum(fs.sales) as total_sales
FROM fact_sales fs,
dim_date dd,
dim_customer dc
WHERE fs.customer_tk = dc.customer_tk
AND fs.date_tk = dd.date_tk
AND dd.year IN ( 2006, 2007 )
GROUP BY dc.state
, dd.year
, dd.month
Obviously, it’s inspired by simple star schema querying. Anything goes for me, but I kinda like a little bit of “structure” in my SQL (no pun intended), just a little bit of overview.
What’s your favorite SQL format?
Matt
P.S. We will soon be launching PDI version 2.5.0, stay tuned!
8 Comments »

Rick on 04 May 2007 at 19:30 #
Hmm, I looked at your preferred method for a second and thought, “that’s kinda quirky… commas on the next line.” but it looks convenient for things like code merges, editing the query in place (don’t have to go through the “forgot to remove the trailing comma” scenario, etc..
The one thing I see as inconsistent in your example is your FROM clause. Why do you keep the commas on the same line there?
Matt Casters on 04 May 2007 at 21:58 #
It’s a good point. I guess I thought the FROM clause didn’t change as much as the fields in the SELECT and GROUP BY clauses.
Previously I didn’t care too much about the commas either. That is to say, until I wrote a SQL generator in Pentaho Metadata that almost by accident generated this style of SQL. I found it very easy to change the generated SQL afterwards to test and validate.
That’s how it grew on me
Samatar on 04 May 2007 at 22:16 #
Personnaly, i like the comma in the next line.
I try also (if i have to make JOIN) to separate JOIN conditions and “pure” WHERE conditions when i can.
SELECT
first_name
, middle_name
, last_name
FROM person p
INNER JOIN company c
ON p.com_code=c.com_code
WHERE middle_name IS NULL
Not sure it works on most of db
Samatar
Samatar on 04 May 2007 at 22:16 #
Nice the new color.
Chris Janton on 05 May 2007 at 22:36 #
I don’t place names on the same line with any of the SQL keywords,
so
FROM
xyz
, abc
, def
on WHERE conditions I indent AND/OR just like commas
WHERE
condition 1
AND condition 2
AND condition 3
OR whatever
I normally place the join criteria first
a.x = b.x
, b.x = c.x
but have been known to place all of the conditions for a given table “in order”, so join conditions are interspersed with selection conditions.
Rdb Notes · Matt Casters on Data Integration - Serving SQL on 05 May 2007 at 22:37 #
[…] Matt Casters on Data Integration » Serving SQL How do you like your SQL served? Personally I like this type: […]
Matt Casters on 06 May 2007 at 12:22 #
Re: Samatar: I don’t really like this “INNER JOIN … ON” format. I’ve seen it on plenty of occasions and I never quite got to like it. It seems that the very thing you are trying to accomplish (== only specify the limiting factors in the WHERE clause, not the joins) bites you in the end. When the query gets more complex and more tables are involved and the query becomes complex, you risk having a lot of these WHERE lines mixed together with the other clauses.
That being said, for queries on relational databases, I do like it for the simple reason that you never know in advance how many other tables you have to join with.
Chris: Just tried it and I think I would have to agree with you on this one. Thanks for the tip.
I’m sure all these things make it harder to write SQL, but it sure makes it a lot easier to read and maintain it afterwards
Shiva on 09 Jun 2008 at 15:01 #
Well i have gonne thru your blog really appreciate the content you have written
but you missed some points like
1. In case the transformation aborts in between how to raise an event and inform that the transsformation has stopped
2 . In case of the Lock wait timeout exceeded how to handle it