Why is count(x.) slower than count()?

Question

explain select count(x.*) from customer x;
...
->  Partial Aggregate  (cost=27005.45..27005.46 rows=1 width=8)
  ->  Parallel Seq Scan on customer x  (cost=0.00..26412.56 rows=237156 width=994)

explain select count(*) from customer x;
...
->  Partial Aggregate  (cost=27005.45..27005.46 rows=1 width=8)
  ->  Parallel Seq Scan on customer x  (cost=0.00..26412.56 rows=237156 width=0)

The COUNT(x.*) here makes the width in the explain result read unnecessary row data.

I thought they should be identical, but it seems not, why?

Because Postgres doesn't optimize the expression x.* away and checks for each and every row whether it's null or not: See e.g. here: https://blog.jooq.org/2019/09/19/whats-faster-count-or-count1/ — , Mar 18 '22 at 10:11
@a_horse_with_no_name Is it too hard to optimize such query or they just have not do it ? — jilen, Mar 18 '22 at 11:39
It seems to be a matter of "nobody cared": https://www.postgresql.org/message-id/flat/df354b76-a645-73fc-9645-4508e5291d71%40gmx.net — , Mar 18 '22 at 11:46
@a_horse_with_no_name you should post your comment as an answer, because that's what it is. — mustaccio, Mar 18 '22 at 12:44

Erwin Brandstetter · Accepted Answer · 2022-03-18T19:39:13.650

Logically, both are identical - because x.* always counts, even when all columns are NULL.
But Postgres has a separate implementation for count(*).

It does not bother with any expression at all and only considers the existence of live rows. That's slightly faster, which sums up to a relevant difference over many rows.
The performance penalty for count(x.*) grows with the number of columns / width of rows, and will be rather substantial for wide rows like yours (width=994).

It's even documented explicitly:

count ( * ) → bigint

Computes the number of input rows.

count ( "any" ) → bigint

Computes the number of input rows in which the input value is not null.

The gist of it: whenever you don't care whether an expression is NULL, use count(*) instead.

For absolute performance, is SUM faster or COUNT?

Some other RDBMS do not have the same fast path for count(*). OTOH, counting all rows in a table is comparatively slow in Postgres due to its MVCC model that forces checking row visibility. See:

Fast way to discover the row count of a table in PostgreSQL

Why is count(x.*) slower than count(*)?

1 Answers1

Why is count(x.) slower than count()?