How to count repeated rows taking into account other rows in the middle

Question

For SQL server 2008 R2 It seems challenging to come up with a single query to the following: Example given columns a & b:

a |   b
-------------
0 |  2000
1 |  2001
1 |  2002
1 |  2003
2 |  2004
3 |  2005
1 |  2006
1 |  2007
4 |  2008
1 |  2009

Goal: Mark rows with repeated column a and give them unique number taking into account other values in between. Result should be in column c. Note the most difficult part here is to populate column c with 2 & 5 & 7.

a |  b   |  c
-------------
0 |  2000 | 1
1 |  2001 | 2
1 |  2002 | 2
1 |  2003 | 2
2 |  2004 | 3
3 |  2005 | 4
1 |  2006 | 5
1 |  2007 | 5
4 |  2008 | 6
1 |  2009 | 7

Instead of saying it as unique number for repeated value in column a, can we formulate the statement as "Sequentially if value in column 'a' changes we have to increment the value in column C"? Also are you looking it as a view or a t-sql ? — SwapnilBhate, Nov 23 '16 at 08:47
Your restatement seems accurate. A single query (a view) would be better. — Michael, Nov 23 '16 at 08:49

ypercubeᵀᴹ · Answer 1 · 2016-11-23T16:56:14.317

This is a gaps-and-islands problem. One (of the many) ways to solve it (this requires 2012+ versions):

WITH 
  t AS
    ( SELECT a, b, x = CASE WHEN a = LAG(a) OVER (ORDER BY b) 
                           THEN NULL ELSE 1 
                       END
      FROM table_name
    )
SELECT a, b, c = COUNT(x) OVER (ORDER BY b) 
FROM t 
ORDER BY b ;

This should work in 2005 and above:

WITH 
  t AS
    ( SELECT a, b, dx = ROW_NUMBER() OVER (ORDER BY b) 
                        - ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) 
      FROM table_name
    ),
  tt AS
    ( SELECT a, b, mb = MIN(b) OVER (PARTITION BY a, dx)
      FROM t 
    )
SELECT a, b, c = DENSE_RANK() OVER (ORDER BY mb)
FROM tt 
ORDER BY b ;

score 1 · Answer 2 · answered Nov 23 '16 at 13:20

Well, a not so elegant way could be (assuming your table is ordered by b and b is unique) :

with CTE
as
    (
    select t1.a, t1.b, isnull(MAX(t2.b), -1) as max_b
    from my_table t1
        left join my_table t2 on t1.b > t2.b and t1.a <> t2.a
    group by t1.a, t1.b
    ),
CTE_2
as
    (
    select max_b, ROW_NUMBER() over(order by max_b) as c
    from (
        select distinct max_b
        from CTE
        ) t
    )


update t
set c = c2.c
from my_table t 
join CTE c1 on t.b = c1.b
left join CTE_2 c2 on c1.max_b = c2.max_b

How to count repeated rows taking into account other rows in the middle

2 Answers2

Linked