As an aside, this is a shorthand. You never have to write it this way, and in my opinion it's always a bad idea: the subquery is correlated with the parent query (needlessly complex for the SELECT list), it bloats the SELECT clause and obscures the basic structure of the query, and it's often likely to be slower.
SELECT company_code,
       founder,
       (SELECT count(DISTINCT lead_manager_code)
        FROM Lead_Manager
        WHERE company_code = c.company_code)
FROM Company c
ORDER BY company_code;
This is better written with a LEFT OUTER JOIN:
SELECT company_code,
       founder,
       t.count
FROM Company c
LEFT OUTER JOIN (
  SELECT company_code, count(DISTINCT lead_manager_code) AS count
  FROM Lead_Manager
  GROUP BY company_code
) AS t
USING (company_code)
ORDER BY company_code;
One caveat: the two forms aren't quite identical. The correlated subquery returns 0 for a company with no rows in Lead_Manager, while the join version returns NULL, because there is nothing to join to.
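If you need 0 in that case, COALESCE the joined count. A minimal sketch against the same schema (the lead_manager_count output alias is just illustrative):
SELECT company_code,
       founder,
       COALESCE(t.count, 0) AS lead_manager_count  -- 0 instead of NULL when a company has no lead managers
FROM Company c
LEFT OUTER JOIN (
  SELECT company_code, count(DISTINCT lead_manager_code) AS count
  FROM Lead_Manager
  GROUP BY company_code
) AS t
USING (company_code)
ORDER BY company_code;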
Speed
This analysis was done with PostgreSQL, but it should be applicable to most databases.
Multi-row
The VIRTUAL COLUMN method is likely to result in extra seq scans, because the planner doesn't know to bundle the per-row subquery executions together. It's worth comparing speeds: 145 ms for the VIRTUAL COLUMN method:
EXPLAIN ANALYZE
SELECT *, (
SELECT count(distinct x)
FROM foobar AS f2
WHERE f1.x = f2.x
)
FROM foobar AS f1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Seq Scan on foobar f1 (cost=0.00..17527.50 rows=1000 width=4) (actual time=0.252..145.119 rows=1000 loops=1)
SubPlan 1
-> Aggregate (cost=17.50..17.51 rows=1 width=4) (actual time=0.144..0.144 rows=1 loops=1000)
-> Seq Scan on foobar f2 (cost=0.00..17.50 rows=1 width=4) (actual time=0.065..0.140 rows=1 loops=1000)
Filter: (f1.x = x)
Rows Removed by Filter: 999
Planning time: 0.108 ms
Execution time: 145.268 ms
Note the loops=1000 on the subplan: the inner seq scan ran once per outer row. Compare that with 2 ms for the LEFT OUTER JOIN method (count(*) is used here, which is equivalent to count(distinct x) on this test table because every x value is distinct):
EXPLAIN ANALYZE
SELECT x, f2.count
FROM foobar
LEFT OUTER JOIN (
SELECT x, count(*)
FROM foobar
GROUP BY x
) AS f2 USING (x);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Hash Right Join (cost=47.50..81.25 rows=1000 width=12) (actual time=1.246..2.154 rows=1000 loops=1)
Hash Cond: (foobar_1.x = foobar.x)
-> HashAggregate (cost=20.00..30.00 rows=1000 width=4) (actual time=0.750..1.100 rows=1000 loops=1)
Group Key: foobar_1.x
-> Seq Scan on foobar foobar_1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.009..0.155 rows=1000 loops=1)
-> Hash (cost=15.00..15.00 rows=1000 width=4) (actual time=0.484..0.484 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 44kB
-> Seq Scan on foobar (cost=0.00..15.00 rows=1000 width=4) (actual time=0.015..0.185 rows=1000 loops=1)
Planning time: 0.262 ms
Execution time: 2.294 ms
Here foobar is scanned just twice in total, regardless of row count. The foobar table was created with:
CREATE TABLE foobar AS SELECT x FROM generate_series(1,1000) AS x;
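As an aside, those 1000 inner seq scans are partly an artifact of foobar having no index; with one, the planner could use an index(-only) scan inside the subplan instead. A hypothetical sketch (the index name is illustrative, and on a table this small the planner may still prefer a seq scan):
-- give the subplan an access path other than a full table scan
CREATE INDEX foobar_x_idx ON foobar (x);
ANALYZE foobar;  -- refresh planner statistics
Even then, the join form only needs to aggregate once, so it would typically stay ahead.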
Single-row
Even in the single-row case, the LEFT OUTER JOIN is faster on PostgreSQL 9.5. I only mean to say that it doesn't have to be that way on other databases or versions.
EXPLAIN ANALYZE
SELECT *, (
SELECT count(distinct x)
FROM foobar AS f2
WHERE f1.x = f2.x
)
FROM foobar AS f1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on foobar f1 (cost=0.00..2.04 rows=1 width=4) (actual time=0.043..0.044 rows=1 loops=1)
SubPlan 1
-> Aggregate (cost=1.01..1.02 rows=1 width=4) (actual time=0.025..0.026 rows=1 loops=1)
-> Seq Scan on foobar f2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=1)
Filter: (f1.x = x)
Planning time: 0.114 ms
Execution time: 0.097 ms
(7 rows)
EXPLAIN ANALYZE
SELECT x, f2.count
FROM foobar
LEFT OUTER JOIN (
SELECT x, count(*)
FROM foobar
GROUP BY x
) AS f2 USING (x);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=1.01..2.06 rows=1 width=12) (actual time=0.026..0.027 rows=1 loops=1)
Join Filter: (foobar.x = foobar_1.x)
-> Seq Scan on foobar (cost=0.00..1.01 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
-> HashAggregate (cost=1.01..1.02 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
Group Key: foobar_1.x
-> Seq Scan on foobar foobar_1 (cost=0.00..1.01 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=1)
Planning time: 0.168 ms
Execution time: 0.088 ms
(8 rows)
This time the foobar table was created with a single row:
CREATE TABLE foobar AS SELECT x FROM generate_series(1,1) AS x;
Conclusion
It's just too easy to mess this up. In my experience with PostgreSQL the correlated-subquery form is always slower, but that doesn't have to be the case with other databases.