1

I have 2 tables:

  1. Organization (2.500.000 rows)
  2. City (35.000 rows)

'Organization' has 'city_id' set as foreign key to 'cities'.'id':

Organization:

ALTER TABLE public.organization
  ADD CONSTRAINT organization_pkey1 PRIMARY KEY(id);

ALTER TABLE public.organization
  ADD CONSTRAINT organization_city_id_381976b6f7d84e86_fk_city_id FOREIGN KEY (city_id)
      REFERENCES public.city (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED;

CREATE INDEX organization_c7141997
  ON public.organization
  USING btree
  (city_id);

City:

ALTER TABLE public.city
  ADD CONSTRAINT city_pkey1 PRIMARY KEY(id);

Query below takes 6.2 seconds to finish:

SELECT 
    org.id
FROM organization AS org
INNER JOIN city ON city.id = org.city_id 
WHERE org.id>=1 AND org.id<=3736809
LIMIT 1000 OFFSET 100000;

Explain analyze result:

Limit  (cost=22129.97..22340.23 rows=1000 width=4) (actual time=6268.343..6274.803 rows=1000 loops=1)
  ->  Hash Join  (cost=1104.52..521879.44 rows=2476878 width=4) (actual time=29.056..6259.761 rows=101000 loops=1)
        Hash Cond: (org.city_id = city.id)
        ->  Seq Scan on organization org  (cost=0.00..465045.17 rows=2476878 width=8) (actual time=0.014..6026.781 rows=101000 loops=1)
              Filter: ((id >= 1) AND (id <= 3736809))
        ->  Hash  (cost=694.23..694.23 rows=32823 width=4) (actual time=29.014..29.014 rows=32823 loops=1)
              Buckets: 4096  Batches: 1  Memory Usage: 1154kB
              ->  Seq Scan on city  (cost=0.00..694.23 rows=32823 width=4) (actual time=0.008..13.538 rows=32823 loops=1)
Total runtime: 6275.272 ms

Is this an expected query performance ?

How to speed query execution ?

p.s. - i include some of my settings below

PG 9.3 on MacOS 10.9.5 (8GB ram)

Postgresql.conf:
default_statistics_target = 50
maintenance_work_mem = 448MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 5632MB
work_mem = 44MB 
wal_buffers = 8MB 
checkpoint_segments = 16 
shared_buffers = 1792MB
max_connections = 80
Termos
  • 113
  • 2
  • what are you trying to achieve? What's the point of doing offset without ordering? – Mladen Uzelac Nov 21 '14 at 16:42
  • Mladen, i am indexing database for use with Sphinx search. – Termos Nov 21 '14 at 17:13
  • Did you try PostgreSQL full text search instead of Sphinx? I didn't ask about indexing, I asked about getting data doing limit and offset. Check my answer for more efficient data retrieval. – Mladen Uzelac Nov 21 '14 at 17:21
  • Yes i did :) Sphinx is way faster at some operations. I am still checking your answer below. – Termos Nov 21 '14 at 17:24
  • Using materialized view will definitely speed up your query. :D I hope that you are aware of costs using Materialized views. – Mladen Uzelac Nov 21 '14 at 17:30
  • See the costs of your query: http://explain.depesz.com/s/2yxw . It takes 6 sec to scan your organization table. Do you know the size of your tables? \d+ in psql console – Mladen Uzelac Nov 21 '14 at 18:16

1 Answers1

0

You have a lot of results that's why your index is not used. Make your query more restrictive.

It seems that you are using pagination: http://use-the-index-luke.com/sql/partial-results/fetch-next-page

Mladen Uzelac
  • 829
  • 5
  • 8