
Basically I want to assign the result of a query to an attribute of a custom type. However, I noticed that querying directly from the PostgreSQL console took about 0.071 ms, while inside the function it took 0.400 ms (and 0.170 ms after a few calls). explain analyze even shows the use of indexes in the first case but not in the second.

Here's what I'm doing.

CREATE OR REPLACE FUNCTION fun_isliked(
    par_client client.id%type,
    par_feed feed.id%type
)
RETURNS boolean
AS
$$
BEGIN
    RETURN (
        EXISTS(
            SELECT
                1
            FROM
                feedlike fl
            WHERE
                fl.client = par_client
                AND fl.feed = par_feed
                AND fl.state = 1
        )
    );
END;
$$ LANGUAGE plpgsql STABLE;

Here are the outputs of explain analyze of the two cases described above:

postgres=# explain analyze select exists (select 1 from feedlike where client = 13 and feed = 68 and state = 1);
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=8.30..8.31 rows=1 width=0) (actual time=0.037..0.037 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Index Scan using feedlike_client_feed_unique on feedlike  (cost=0.28..8.30 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)
           Index Cond: ((client = 13) AND (feed = 68))
           Filter: (state = 1)
 Total runtime: 0.086 ms

postgres=# explain analyze select * from fun_isliked(13, 68);
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Function Scan on fun_isliked  (cost=0.25..0.26 rows=1 width=1) (actual time=0.398..0.398 rows=1 loops=1)
 Total runtime: 0.416 ms

What is a possible workaround to get the same runtime inside the function? Is it even possible? Also, I'm running PostgreSQL 9.3.

I found a question on SO that seemed to have what I needed, but after trying everything in the accepted answer with no success in reducing the runtime, I decided to ask a new question.

oxfist
  • Show the explain analyze result(s), please. – Craig Ringer Feb 06 '14 at 12:44
  • Also, what's your operating system? Does anything (at all) else run on the same machine? Those kinds of times are not far from normal scheduler jitter and measurement error, you really need to aggregate them into larger loop runs that take multiple seconds to usefully compare them. – Craig Ringer Feb 06 '14 at 12:54
  • I'm running Mac OSX and the production server runs ArchLinux. However, I don't see why that matters, since running the same query on the same machine but in two different contexts yields two pretty different runtimes. – oxfist Feb 06 '14 at 13:12
  • 1
    Why are you using language "plpgsql" when you could be using language "sql"? This should have lower execution overhead. – Colin 't Hart Feb 06 '14 at 13:15
  • You're right, that was one of the things I tried, like was said in the SO link I gave. However this had no effect on the runtime. – oxfist Feb 06 '14 at 13:16
  • Can you do an explain analyze inside the function? I wonder if the query is being executed differently in that context. – Colin 't Hart Feb 06 '14 at 13:20
  • Surprisingly, running explain analyze on the query inside the function yields the same runtime as running it directly in the console, i.e. ~0.070 ms. – oxfist Feb 06 '14 at 13:33
  • I even modified the function so I would return the query with EXECUTE but the runtime didn't drop as I expected. – oxfist Feb 06 '14 at 13:39
  • And if you explain analyze the function containing the explain analyze, do you still get a lot of overhead? – Colin 't Hart Feb 06 '14 at 13:58

1 Answer


The elephant in the room is function overhead. When calling a function instead of running raw SQL, Postgres has to look up the function in the system catalogs (possibly picking the best match in case of function overloading) and consider the function's settings. There is also a small cost for the function call itself.

That only really matters for very simple queries like your example, which would be better implemented as a plain SQL function. You still incur some (tiny) function overhead with an SQL function, but it can be "inlined" into the calling query if it is a simple SELECT and some preconditions are met.
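
For illustration, here is a minimal sketch of the same check as a plain SQL function. The name fun_isliked_sql is only for the example; table and column names are taken from the question, and whether it actually gets inlined depends on the usual preconditions:

CREATE OR REPLACE FUNCTION fun_isliked_sql(
    par_client client.id%type,
    par_feed feed.id%type
)
RETURNS boolean
AS
$$
    -- a single SELECT with no side effects: a candidate for inlining
    SELECT EXISTS (
        SELECT 1
        FROM   feedlike fl
        WHERE  fl.client = par_client
        AND    fl.feed   = par_feed
        AND    fl.state  = 1
    );
$$ LANGUAGE sql STABLE;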

As always, repeated calls profit from a populated cache. And in the case of PL/pgSQL functions, the server may also switch to a generic plan after a few (typically 5) calls, if that seems promising.

Consider this related discussion on pgsql-general.

Query plans for code inside PL/pgSQL functions

You wonder:

explain analyze even shows the usage of indexes in the first case but not on the second [plpgsql function].

PL/pgSQL functions are black boxes to the query planner, which does not examine the contents of the function. They act as optimization barriers. Unlike SQL functions, their contents cannot be inlined into outer queries.

Accordingly, EXPLAIN ANALYZE doesn't show what's happening inside the PL/pgSQL function, so we see no index scan. The index is still used (most probably). You can use the additional module auto_explain to have a peek at query plans for SQL statements inside the function.
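
For example, a minimal sketch of an auto_explain session (the settings shown are the module's actual parameters; plans are written to the server log, not returned to the client):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log every statement
SET auto_explain.log_nested_statements = on;  -- include statements executed inside functions
SET auto_explain.log_analyze = on;            -- include actual row counts and timings

SELECT fun_isliked(13, 68);
-- the plan of the EXISTS query inside the function now appears in the server log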

Erwin Brandstetter
  • What exactly do you mean by inlined? – oxfist Feb 06 '14 at 14:44
  • 1
    300+ms for the first call and 100ms per function execution with a warm cache seems too expensive an overhead to me. I wonder if there's some room for optimisation here? – Colin 't Hart Feb 06 '14 at 14:54
  • 1
    @Oxfist: Funcion inlining means that (simple) sql queries can be taken from an sql function to replace the function call in the context of a bigger query, thereby eliminating the function overhead for repeated calls. – Erwin Brandstetter Feb 06 '14 at 14:57
  • That's why I suggested language "SQL" rather than "plpgsql" as comment on the question. – Colin 't Hart Feb 06 '14 at 14:59
  • @Colin'tHart: +1 to that. I may be mistaken but it seems to me that indexes aren't being used at all inside the function. – oxfist Feb 06 '14 at 15:00
  • @ErwinBrandstetter: Oh, I see. Did that to check if doing a SELECT ... INTO was faster, but I got roughly the same runtime. – oxfist Feb 06 '14 at 15:01
  • @Oxfist: As for why you don't see the index scan in the EXPLAIN output - I added information on how to get query plans for the code inside plpgsql functions to my answer. – Erwin Brandstetter Feb 06 '14 at 15:24
  • @Colin'tHart: I agree that the overhead of the example seems a bit much. I recreated a similar scenario and had about 20 ms overhead with warm cache. There are many factors possibly contributing to that ... – Erwin Brandstetter Feb 06 '14 at 15:31
  • I'm assuming the additional runtime is because of function overhead, 'cause even though I tried several alternative ways the runtime doesn't drop any further. So I'm marking this as the answer. – oxfist Feb 06 '14 at 16:51