1

I have a table set out like so:

user_id  the_date    minutes_asleep         
----------------------------
1       2015-01-01   480
1       2015-01-02   0
1       2015-01-03   0
1       2015-01-04   0
1       2015-01-05   321
1       2015-01-06   0
1       2015-01-07   0
1       2015-01-08   0
2       2015-01-01   567
2       2015-01-02   0
2       2015-01-03   285
2       2015-01-04   0
2       2015-01-05   577
2       2015-01-06   0
2       2015-01-07   0
2       2015-01-08   0

I need to find the count of groups of 3 consecutive dates where minutes asleep is 0. So in the example above user_id 1 would have a count of 2, user_id 2 would have a count of 1. The total count would be 3. So the returning result would be:

total_count
-----------
3

In the case that there are 5 consecutive dates, the count should only be 1. If there are 6 consecutive, the count should be 2 etc.

A user will only ever have one record per date.

Test table:

CREATE TABLE fitness_data (
    id serial NOT NULL,
    the_date date,
    minutes_asleep integer,
    user_id integer
);

INSERT INTO fitness_data (the_date, minutes_asleep,user_id) VALUES
('2015-01-01',480,1),('2015-01-02',0,1),('2015-01-03',0,1),('2015-01-04',0,1),('2015-01-05',321,1),('2015-01-06',0,1),('2015-01-07',0,1),('2015-01-08',0,1),('2015-01-01',567,2),('2015-01-02',0,2),('2015-01-03',285,2),('2015-01-04',0,2),('2015-01-05',577,2),('2015-01-06',0,2),('2015-01-07',0,2),('2015-01-08',0,2)
Jay
  • 13
  • 3
  • You may want to start with this similar question. – LowlyDBA - John M Jan 30 '15 at 18:53
  • By my reckoning, the answer is only 2 - in the final 3 zeros, two of them occur on the same date, not consecutive ones (2015-01-06). Typo? Also, could you provide DDL (CREATE TABLE (user_id INT...)) and also DML (INSERT INTO TABLE Blah VALUES(....). This makes it far easier for people to help you - plus you're more likely to get good answers quickly. Check out the tour and the "help us to help you blog" (both at bottom of screen). BTW, welcome to the site. – Vérace Jan 30 '15 at 18:53
  • @Vérace Yep typo sorry. – Jay Jan 30 '15 at 18:55
  • @Vérace Create Table, Insert Into added. Thanks for pointing me to the links. – Jay Jan 30 '15 at 19:07
  • 1
    You should have used "serial" instead of "integer" - your SQL fails otherwise. Second (pet peeve coming up!), could you please not use SQL keywords as table or field names - it makes typos/errors difficult to track down, plus if you wish to migrate in the future, you'll be scr**ed. :-) – Vérace Jan 30 '15 at 19:39
  • @Vérace Serial for which column? And yeah, fair point on the field names! – Jay Jan 30 '15 at 19:42
  • id integer NOT NULL, should be id serial NOT NULL, otherwise it's not auto_incrementing in MySQL parlance (or a sequence in Oracle lingo). Try it. – Vérace Jan 30 '15 at 19:46
  • @Vérace Ah, I just output the schema from my Rails app and it gave me that create table back. However, it appears that the incrementing is handled by a created sequence. – Jay Jan 30 '15 at 19:50
  • What would you do with 5 consecutive 0? Count as 1, as 3 or something else? Also, is (user_id, date) unique? (IOW: max 1 entry per user and day?) Please clarify the question. – Erwin Brandstetter Jan 30 '15 at 19:57
  • @ErwinBrandstetter Added some more details – Jay Jan 30 '15 at 20:12

1 Answers1

2

This counts ranges of at least 3 days with minutes_asleep = 0.
A continuous range of 5 days still counts as 1.
A continuous range of 6 days counts as 2. Etc.
Ignoring all other entries where minutes_asleep is different.

SELECT sum(ct) AS total_count
FROM  (
   SELECT (count(*)/3)::int AS ct  -- integer division truncates as desired + see below
   FROM (
      SELECT user_id
           , the_date - (row_number() OVER (PARTITION BY user_id
                                            ORDER BY the_date))::int AS grp
      FROM   fitness_data
      WHERE  minutes_asleep = 0
      ) sub1
   GROUP  BY user_id, grp
   HAVING count(*) >= 3
   ) sub2
UNION ALL  -- see below
SELECT 0
LIMIT 1;

UNION ALL ... is just to return 0 instead of "no row" where no qualifying ranges are found. You could also use COALESCE in another subselect, but I find this more elegant.

Why (count(*)/3)::int? count() returns bigint. sum(bigint) returns numeric. I am pretty positive we don't need to go that far. Casting back to int, so we get a bigint result. Should be faster, too.

Related answer with more explanation:

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600