I see this pattern all the time: you need aggregated data from one table, but from multiple different ranges. A very common example of this is a table that holds some kind of date-related data, and a report that displays current month and year to date. For the purposes of this article, let's create a simple, denormalized table that holds accounting data:
create table Account ( AccountNum int ) create table AccountActivity ( AccountNum int, Date datetime, Amount money )
To generate the kind of results I described above, I often see queries that look something like this:
create procedure ActivityReport @MonthStart datetime -- Assume the day part is 1 for simplicity as select AccountNum, isnull( ca.Amount, 0 ) as AmountCurrent, isnull( ya.Amount, 0 ) as AmountYTD from Account a left join ( select AccountNum, sum( Amount ) as Amount from AccountActivity where Date >= @MonthStart and Date < dateadd( m, 1, @MonthStart ) group by AccountNum ) ca on a.AccountNum = ca.AccountNum left join ( select AccountNum, sum( Amount ) as Amount from AccountActivity where Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart ) and Date < dateadd( m, 1, @MonthStart ) group by AccountNum ) ya on a.AccountNum = ya.AccountNum
In plain English, there is a subquery for each date range. This results in a table access for each subquery, which is redundant and slow. Every query that follows this pattern can be rewritten by moving the date ranges in the WHERE clauses to CASE expressions in a single SELECT that encompasses the entire date range. In the above example, the earliest date desired is the beginning of year, and the latest date desired is the first of the month after @MonthStart, so the new replacement subquery will encompass that entire range:
select * from AccountActivity where Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart ) and Date < dateadd( m, 1, @MonthStart )
This query will return all the data necessary to compute the current month's activity as well as the year to date activity, but since it returns all the data in a single derived table, the data will have to be filtered in order to produce the desired results. This can be accomplished by using CASE expressions in the SELECT clause, like this (important parts are in italic):
create procedure ActivityReport @MonthStart datetime -- Assume the day part is 1 for simplicity as select AccountNum, sum( isnull( case when aa.Date >= @MonthStart and aa.Date < dateadd( m, 1, @MonthStart ) then aa.Amount else 0 end, 0 ) ) as AmountCurrent, sum( isnull( case when aa.Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart) and aa.Date < dateadd( m, 1, @MonthStart ) then aa.Amount else 0 end, 0 ) ) as AmountYTD from Account a left join AccountActivity aa on a.AccountNum = aa.AccountNum where Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart ) and Date < dateadd( m, 1, @MonthStart ) group by a.AccountNum
Notice how the conditions that were in the WHERE clauses above move directly into the CASE expressions. It's ugly, but it will also run much faster thanks to reduced table access. In many cases, you can reduce the redundancy and complexity of the CASE expressions by taking the WHERE conditions into consideration as well. The simplest example of this is the CASE expression for the YTDAmount; it is 100% redundant with the conditions of the WHERE clause, so the query can be simplified by removing the redundant expressions:
create procedure ActivityReport @MonthStart datetime -- Assume the day part is 1 for simplicity as select AccountNum, sum( isnull( case when aa.Date >= @MonthStart and aa.Date < dateadd( m, 1, @MonthStart ) then aa.Amount else 0 end, 0 ) ) as AmountCurrent, sum( isnull( aa.Amount, 0 ) ) as AmountYTD from Account a left join AccountActivity aa on a.AccountNum = aa.AccountNum where Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart ) and Date < dateadd( m, 1, @MonthStart ) group by a.AccountNum
The other CASE statement has one condition redundant with the WHERE clause: the upper bound
dateadd( m, 1, @MonthStart )
So it can be removed as well:
create procedure ActivityReport @MonthStart datetime -- Assume the day part is 1 for simplicity as select AccountNum, sum( isnull( case when aa.Date >= @MonthStart then aa.Amount else 0 end, 0 ) ) as AmountCurrent, sum( isnull( aa.Amount, 0 ) ) as AmountYTD from Account a left join AccountActivity aa on a.AccountNum = aa.AccountNum where Date >= dateadd( m, 1 - month( @MonthStart ), @MonthStart ) and Date < dateadd( m, 1, @MonthStart ) group by a.AccountNum
Now, if you needed to add the amount of the prior month or next month to this query, all you would need is another CASE expression in the SELECT clause instead of a whole new subquery. This approach performs better right away, and also scales much better if more date ranges are added.
Remember Me
a@href@title, i, strike, u
Copyright © 2003-2008 Falafel Software Inc.
Subscribe to Falafel Blogs
The opinions expressed herein are Falafel's employees own personal opinions and do not represent Falafel Software's view in any way in case they go bananas!