So I've built a short SQL script to do this operation, based on the original sample input data:
Order_year, Contact
2015, 123
2014, 123
2013, 123
2011, 123
2010, 123
2015, 234
2014, 234
2011, 234
The script is as follows:
WITH RECURSIVE ranked_order AS(
SELECT order_year, contact, 1 AS counter FROM orders
UNION
SELECT orders.order_year, orders.contact, ranked_order.counter + 1
FROM orders JOIN ranked_order
ON orders.order_year-1 = ranked_order.order_year
WHERE orders.contact = ranked_order.contact
),
max_from_ranked_order AS(
SELECT order_year AS report_year, contact, max(counter) AS consecutive_year
FROM ranked_order
GROUP BY order_year, contact
),
start_and_consecutive_years AS(
SELECT contact, max(consecutive_year) AS consecutive_years,
report_year - consecutive_year AS start_year
FROM max_from_ranked_order
GROUP BY contact, start_year)
SELECT *, start_year + consecutive_years AS end_year
FROM start_and_consecutive_years
ORDER BY contact;
Now, for a little explanation:
Recusive Query
First, you need to find all the occasions of consecutive years, so I built the ranked_order table(maybe a bad name... maybe should have called it counted_order?) to count instances where a given contact ordered in back-to-back years.
Filter Results
So, due to the recursive query, I am left with all counted elements, rather than simply the maximum consecutive counts. So, I built max_from_ranked_order to extract the max counter value per order_year,contact grouping.
A little arithmetic...
This still doesn't quite yield what we need. Since we grouped over order_year,contact, you still end up with entries like
report_year, contact, consectuive_year
2015, 123, 3
2014, 123, 2
2013, 123, 1
2015, 234, 2
2014, 234, 1
etc...
What you'll notice is that you have unique, information which you can group on by observing the starting year of the consecutive years. So, I created start_and_consecutive_years to take the maximum number of consecutive purchasing years for any given starting year, on a per customer basis.
Finally, I get my result by adding the consecutive_years to the start_year, and presto, the final SELECT yields:
contact, consecutive_years, start_year, end_year
123, 3, 2012, 2015
123, 2, 2009, 2011
234, 2, 2013, 2015
234, 1, 2010, 2011
Note that it would be very easy to filter out for casesWHERE consecutive years >= 2 to ignore "irrelevant" results.
Now for the caveats: I wrote this series of CTEs for ease of readability and explanation. There may be a more computationally or programatically efficient way to achieve your results, but on even a moderately sized table this should get you your results pretty quick.