What is the correct result for this query?

Question

I came across this puzzle in the comments here

CREATE TABLE r (b INT);

SELECT 1 FROM r HAVING 1=1;

SQL Server and PostgreSQL return 1 row.

MySQL and Oracle return zero rows.

Which is correct? Or are both equally valid?

Nice puzzle. I think the correct result is 1 row. SQL Server contradicts itself, though, because SELECT COUNT(*) FROM r; returns 1 row (with 0), while SELECT COUNT(*) FROM r GROUP BY (); returns no rows. — ypercubeᵀᴹ, Jan 29 '13 at 17:35
Want more? SELECT 1 WHERE 1=0 HAVING 1=1;. SQL Server and PostgreSQL still return one row. Oracle wants FROM DUAL and returns no rows. MySQL compiles it neither with FROM DUAL nor without it. — Andriy M, Jan 29 '13 at 19:04
@ypercube: I'm still puzzled as to the reasons for this difference. But to be honest, I've already used GROUP BY () as a way of adding the grand total only when there are details. Omitting GROUP BY would yield an extra row. Or I would have to use an IF (or a WHERE EXISTS, perhaps). — Andriy M, Jan 29 '13 at 19:09
@AndriyM For some unknown reason "dual" and "HAVING" do not play well in MySQL. (Nice finding). But the equivalent works: SELECT 1 AS t FROM (SELECT 1) tmp WHERE 1=0 HAVING 1=1; 1-row-no-dual and returns 0 rows. — ypercubeᵀᴹ, Jan 29 '13 at 21:33
@SQLKiwi - What about this passage from the spec. "If TE does not immediately contain a <group by clause>, then “GROUP BY ()” is implicit.". Shouldn't both queries return the same results then? — Martin Smith, Jan 30 '13 at 11:13
@SQLKiwi - Ah OK thanks! (embarrassed to admit I googled Dr Emmett Brown expecting him to be some eminent database authority that I had never heard of but the penny has now dropped! :-) — Martin Smith, Jan 30 '13 at 11:27
Oracle and SQL-Server agree on this: SQL-fiddle: with and without GROUP BY () — ypercubeᵀᴹ, Jan 30 '13 at 12:57
But they disagree on these (Oracle executes queries with HAVING differently): SQL-fiddle 2: HAVING makes things different — ypercubeᵀᴹ, Jan 30 '13 at 13:02

Kevin Cathcart · Accepted Answer · 2013-01-31T18:24:39.650

Per the standard:

SELECT 1 FROM r HAVING 1=1

means

SELECT 1 FROM r GROUP BY () HAVING 1=1

Citation ISO/IEC 9075-2:2011 7.10 Syntax Rule 1 (Part of the definition of the HAVING clause):

Let HC be the <having clause>. Let TE be the <table expression> that immediately contains HC. If TE does not immediately contain a <group by clause>, then “GROUP BY ()” is implicit. Let T be the descriptor of the table defined by the <group by clause> GBC immediately contained in TE and let R be the result of GBC.

Ok so that much is pretty clear.

Assertion: 1=1 is a search condition that is always True. I will provide no citation for this.

Now

SELECT 1 FROM r GROUP BY () HAVING 1=1

is equivalent to

SELECT 1 FROM r GROUP BY ()

Citation ISO/IEC 9075-2:2011 7.10 General Rule 1:

The <search condition> is evaluated for each group of R. The result of the <having clause> is a grouped table of those groups of R for which the result of the <search condition> is True.

Logic: Since the search condition is always true, the result is R, which is the result of the group by expression.

The following is an excerpt from the General Rules of 7.9 (the definition of the GROUP BY clause):

1) If no <where clause> is specified, then let T be the result of the preceding <from clause>; otherwise, let T be the result of the preceding <where clause>.

2) Case:

a) If there are no grouping columns, then the result of the <group by clause> is the grouped table consisting of T as its only group.

Thus we can conclude that

FROM r GROUP BY ()

results in a grouped table consisting of one group, and that group has zero rows (since the table r is empty).

An excerpt from the General Rules of 7.12, which defines a Query Specification (a.k.a. a SELECT statement):

1) Case:

a) If T is not a grouped table, then [...]

b) If T is a grouped table, then

Case:

i) If T has 0 (zero) groups, then let TEMP be an empty table.

ii) If T has one or more groups, then each <value expression> is applied to each group of T yielding a table TEMP of M rows, where M is the number of groups in T. The i-th column of TEMP contains the values derived by the evaluation of the i-th <value expression>. [...]

2) Case:

a) If the <set quantifier> DISTINCT is not specified, then the result of the <query specification> is TEMP.

Therefore, since the grouped table has one group, the result must have one row.

Thus

SELECT 1 FROM r HAVING 1=1

should return a 1 row result set.

Q.E.D.
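The chain of rules quoted above can be sketched as a toy model. This is only an illustration, not how any engine implements it: tables are modelled as Python lists of rows, a grouped table as a list of groups, and the function names (group_by_empty, having, select, query) are hypothetical names chosen for this sketch.

```python
# Toy model of the quoted General Rules. A table is a list of rows;
# a grouped table is a list of groups, each group being a list of rows.
# All function names here are made up for illustration.

def group_by_empty(t):
    """7.9 GR 2a: with no grouping columns, T is the only group."""
    return [t]

def having(groups, condition):
    """7.10 GR 1: keep the groups for which the condition is True."""
    return [g for g in groups if condition(g)]

def select(groups, value_expressions):
    """7.12 GR 1b: one output row per group, or an empty table if there are no groups."""
    return [tuple(expr(g) for expr in value_expressions) for g in groups]

def query(table):
    # SELECT 1 FROM r HAVING 1=1, with the implicit GROUP BY () made explicit
    groups = group_by_empty(table)
    groups = having(groups, lambda g: 1 == 1)
    return select(groups, [lambda g: 1])

empty_result = query([])                 # r has no rows
populated_result = query([(1,), (2,)])   # r has two rows
print(empty_result)      # [(1,)]
print(populated_result)  # [(1,)]
```

Under this model the query yields exactly one row whether r is empty or not, because GROUP BY () always produces exactly one group.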

+1 Thanks for going to all that trouble! As @ypercube says SQL Server does seem to contradict itself here as SELECT 1 FROM r GROUP BY (); returns zero rows but the passage you quoted seems quite clear on this point. — Martin Smith, Jan 29 '13 at 23:36
May I ask where did you find the standard? If you say 'on my bookshelf' I'll be disappointed :) — András Váczi, Jan 30 '13 at 13:52
Technically I used the Final Draft International Standard, rather than the standard itself. Per ISO/IEC rules only editorial (non-technical) changes are permitted between FDIS and the final standard. The standard is split up into multiple parts. Part 1, Part 2, Part 4 ... — Kevin Cathcart, Jan 30 '13 at 17:53
Part 11, and Part 14. Parts 3, 9, 10, and 13 were not updated in 2011, and thus their previous versions apply. There is no part 12. Similarly there are no parts 5-8. See the Wikipedia page for SQL:2011 or Part 1 itself for an explanation of what each part contains. — Kevin Cathcart, Jan 30 '13 at 17:59

ypercubeᵀᴹ · Answer 2 · 2013-01-29T18:25:32.027

When there is a HAVING clause without a GROUP BY clause:

SELECT 1 FROM r HAVING 1=1;

... then GROUP BY () is implicit. So, the query should be equivalent to:

SELECT 1 FROM r GROUP BY () HAVING 1=1;

... which should group all rows of the table into one group (even if the table has no rows at all - it's still one group of 0 rows) and return 1 row. The HAVING with the True condition should have no effect at all after that.

From a different angle, how many rows should a query like this return?

SELECT COUNT(*), MAX(b) FROM r;

One, zero or "zero or one, depending on if the table is empty or not"?

I think one row, no matter how many rows r has.
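As a quick check of that intuition, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is not one of the engines compared in this thread, so it is an extra data point only; the GROUP BY b query is added here purely for contrast and is not part of the original puzzle.

```python
import sqlite3

# In-memory database with the empty table from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (b INT)")

# Aggregates without GROUP BY: the whole (empty) table is treated as one group
no_group_by = conn.execute("SELECT COUNT(*), MAX(b) FROM r").fetchall()

# Grouping by a column of an empty table: there are no groups, so no rows
grouped_by_b = conn.execute("SELECT COUNT(*), MAX(b) FROM r GROUP BY b").fetchall()

print(no_group_by)   # [(0, None)]
print(grouped_by_b)  # []
conn.close()
```

The first query returns one row even though r is empty, which matches the "one group of 0 rows" reading.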

edited Jan 29 '13 at 18:25

answered Jan 29 '13 at 17:55

ypercubeᵀᴹ

Well, the key issue is whether it's indeed true that "even if the table has no rows at all, it's still one group of 0 rows". And the standard turns out to be explicit about this: "If there are no grouping columns, then ... is the grouped table consisting of T as its only group" (and that holds even if T is empty, so there is indeed a group). Further on, the HAVING clause specifies that the condition is applied to each group (in the example, thus once). They probably defined it this way to make SUM and COUNT return one row even for an empty T. – Erwin Smout Jan 29 '13 at 20:29
+1 (earlier!) Even though your logic is the same as Kevin's I've accepted his answer because of the quotations from the spec though. Thanks! – Martin Smith Jan 29 '13 at 23:43
@MartinSmith. Thnx. That I get from being lazy :) – ypercubeᵀᴹ Jan 30 '13 at 07:54
@ypercube: +1 from me too. I decided to take the extra time to pull from the spec to prove that there were no weasel words hidden someplace that would make your answer wrong. But once I did that, I might as well post it as a full answer. So I did. – Kevin Cathcart Jan 30 '13 at 23:07
@Kevin: Well done. I hadn't spotted the "... where M is the cardinality of T ..." part yesterday. – ypercubeᵀᴹ Jan 30 '13 at 23:10
@ypercube actually the phrase that mentions cardinality is for non-grouped rows. You want 1.b.ii which says "where M is the number of groups in T." I've trimmed down 1.b.i to hide irrelevant detail. – Kevin Cathcart Jan 31 '13 at 18:26
@ErwinSmout: Of course not. However this falls within fair use under US copyright law. Relatively small portions, quoted in the context of analysis (i.e. criticism) of the work, for educational purposes, with negligible impact on the work's ability to be sold. – Kevin Cathcart Feb 01 '13 at 16:31

score 3 · Answer 3 · answered Jan 29 '13 at 17:45

From what I see, it looks like SQL Server and PostgreSQL don't bother looking into the table at all:

CREATE TABLE r (b INT);
insert into r(b) values (1);
insert into r(b) values (2);
SELECT 1 FROM r HAVING 1=1;

also returns just one row. Even though the SQL Server docs say

When GROUP BY is not used, HAVING behaves like a WHERE clause.

that is not true in this case: with WHERE 1=1 in place of HAVING 1=1, the query returns one row per table row, as expected. I'd say it's an optimizer bug (or at least a documentation bug)... The SQL Server plan shows a 'Constant Scan' for the HAVING query and a 'Table Scan' for the WHERE query...

Oracle and MySQL behaviour seems more logical and correct to me...

answered Jan 29 '13 at 17:45

a1ex07

You're right that SQL Server doesn't look at the table. The execution plan just has a constant scan and doesn't even reference the table. If it was only SQL Server I would have just put it down to a bug but as it isn't just SQL Server I'm wondering if there is some genuine ambiguity here. – Martin Smith Jan 29 '13 at 17:50
PostgreSQL shows the same results as SQL Server, and as far as I can tell from the EXPLAIN output ("Result( rows=1 )..." for HAVING and "Seq Scan " for WHERE), it also doesn't look into the table... I guess it's somehow related to the fact that FROM is not mandatory in T-SQL and PostgreSQL. I know MySQL also doesn't require it, but since they support dual, they probably parse the query a bit differently. I agree, it sounds like speculation, but I hope it makes some sense. – a1ex07 Jan 29 '13 at 18:06
