3

I have a table that holds status changes per item. The item can go in and out of a status multiple times.

ItemTable (ItemId int, Status varchar(50), DateOfChange date)

I need to pull the date of each change. However, since the statuses can be repeated for each Item, I can't just do the min(DateOfChange) and find the occurrences.

This is in an OLAP environment, and the subquery that's been used to pull this data is KILLING performance.

Is there a way to pull this stuff via partitioning/rownumber functions? Or something else that would behave better in OLAP? (This is in SQL 2008.)

Valkyrie
  • 631
  • 5
  • 14
  • To clarify, you need to pull the first date of each status for an item? – Kevin Jan 11 '14 at 14:01
  • Sorry, power knocked out the office for about 10 hours. :P No, I need each date for each status change. So if it goes into status A 5 times, I need 5 dates. Be nice if it were just the first instance of A, but no such luck. – Valkyrie Jan 12 '14 at 01:21
  • I had an answer lined up but Martin's is fleshed out a little better if that's what you were looking for. What doesn't quite make sense to me is you have a date per status per item in your table, what are you looking for? – Kevin Jan 12 '14 at 01:28
  • 'Cause it's not just those columns; it's a metric ton of data per day, but this one column is all I care about for this crappy subquery. Martin saved my bacon! – Valkyrie Jan 12 '14 at 02:28

1 Answers1

10

This type of requirement comes under the banner of "gaps and islands". A popular approach is

WITH T
     AS (SELECT *,
                DENSE_RANK() OVER (PARTITION BY ItemId ORDER BY DateOfChange) - 
                DENSE_RANK() OVER (PARTITION BY ItemId, Status ORDER BY DateOfChange) AS Grp
         FROM   ItemTable)
SELECT ItemId,
       Status,
       MIN(DateOfChange) AS Start,
       MAX(DateOfChange) AS Finish
FROM   T
GROUP  BY ItemId,
          Status,
          Grp
ORDER  BY Start 
Martin Smith
  • 84,644
  • 15
  • 245
  • 333