1

I have the following problem with PostgreSQL 9.3.

There is a view encapsulating a non-trivial query to some resources (e.g., documents). Let's illustrate it as simple as

CREATE VIEW vw_resources AS
  SELECT * FROM documents; -- there are several joined tables in fact...

The client application uses the view usually with some WHERE conditions on several fields, and might also use paging of the results, so OFFSET and LIMIT may also be applied.

Now, on top of the actual resource list computed by vw_resources, I only want to display resources which the current user is allowed for. There is quite a complex set of rules regarding privileges (they depend on several attributes of the resources in question, explicit ACLs, implicit rules based on user roles or relations to other users...) so I wanted to encapsulate all of them in a single function. To prevent repetitive costly queries for each resource, the function takes a list of resource IDs, evaluates the privileges for all of them at once, and returns the set of the requested resource IDs together with the according privileges (read/write is distinguished). It looks roughly like this:

CREATE FUNCTION list_privileges(resource_ids BIGINT[])
  RETURNS TABLE (resource_id BIGINT, privilege TEXT)
AS $function$
BEGIN
  -- the function lists privileges for a user that would get passed in an argument - omitting that for simplicity
  RAISE NOTICE 'list_privileges called'; -- for diagnostic purposes
  -- for illustration, let's simply grant write privileges for any odd resource:
  RETURN QUERY SELECT id, (CASE WHEN id % 2 = 1 THEN 'write' ELSE 'none' END)
               FROM unnest(resource_ids) id;
END;
$function$ LANGUAGE plpgsql STABLE;

The question is how to integrate such a function in the vw_resources view for it to give only resources the user is privileged for (i.e., has 'read' or 'write' privilege).

A trivial solution would use a CTE:

CREATE VIEW vw_resources AS
  WITH base_data AS (
    SELECT * FROM documents
  )
  SELECT base_data.*, priv.privilege
  FROM base_data
       JOIN list_privileges((SELECT array_agg(resource_id) FROM base_data)) AS priv USING (resource_id)
  WHERE privilege IN ('read', 'write');

The problem is that the view itself gives too much rows - some WHERE conditions and OFFSET/LIMIT clauses are only applied to the view itself, like SELECT * FROM vw_resources WHERE id IN (1,2,3) LIMIT 10 (any complex filtering might be requested by the client application). And since PostgreSQL is unable to push the conditions down the CTE, the list_privileges(BIGINT[]) function ends up with evaluating privileges for all resources in the database, which effectively kills the performance.

So I attempted to use a window function which would collect resource IDs from the whole result set and join the list_privileges(BIGINT[]) function in an outer query, like illustrated below, but the list_privileges(BIGINT[]) function ends up being called repetitively for each row (as testified by 'list_privileges called' notices), which kinda ruins the previous effort:

CREATE VIEW vw_resources AS
  SELECT d.*, priv.privilege
  FROM (
    SELECT *, array_agg(resource_id) OVER () AS collected
    FROM documents
  ) AS d
  JOIN list_privileges(d.collected) AS priv USING (resource_id)
  WHERE privilege IN ('read', 'write');

I would resort to forcing clients to give two separate queries, the first taking the vw_resources without privileges applied, the second calling the list_privileges(BIGINT[]) function passing it the list of resource IDs fetched by the first query, and filtering the disallowed resources on the client side. It is quite clumsy for the client, though, and obtaining e.g. the first 20 allowed resources would be practically impossible as limiting the first query simply does not get it - if some resources are filtered out due to privileges then we simply don't have 20 rows in the overall result...

Any help welcome!

P.S. For the sake of completeness, I append a sample documents table:

CREATE TABLE documents (resource_id BIGINT, content TEXT);
INSERT INTO documents VALUES (1,'a'),(2,'b'),(3,'c');
2
  • Postgresql version? select version(); Commented Aug 27, 2014 at 12:10
  • @ClodoaldoNeto: PostgreSQL 9.3 Commented Aug 27, 2014 at 12:15

1 Answer 1

2

If you must use plpgsql then create the function taking no arguments

create function list_privileges()
  returns table (resource_id bigint, privilege text)
as $function$
begin
  raise notice 'list_privileges called'; -- for diagnostic purposes
  return query select 1, case when 1 % 2 = 1 then 'write' else 'none' end
  ;
end;
$function$ language plpgsql stable;

And join it to the other complex query to form the vw_resources view

create view vw_resources as
select *
from
    documents d
    inner join
    list_privileges() using(resource_id)

The filter conditions will be added at query time

select *
from vw_resources
where
    id in (1,2,3)
    and
    privilege in ('read', 'write')

Let the planner do its optimization magic and check the explain output before any "premature optimization".

This is just a conjecture: The function might make it harder or impossible for the planner to optimize.

If plpgsql is not really necessary, and that is very frequent, I would just create a view in instead of the function

create view vw_list_privileges as
select
    1 as resource_id,
    case when 1 % 2 = 1 then 'write' else 'none' end as privilege

And join it the same way to the complex query

create view vw_resources as
select *
from
    documents d
    inner join
    vw_list_privileges using(resource_id)
Sign up to request clarification or add additional context in comments.

3 Comments

Thank you for your answer. From what I read in the manual (can't find it in the moment), the whole result set is stored in memory before returning from the function, and avoiding that looked like a good idea as there are lots of rows. Will try that, though. Regarding making up a view instead of a function, in fact, I currently have a view which manages to implement all the privilege rules, but uses CTEs heavily which are optimization fences and thus ends up with computing privileges for everything. Which is I hoped to be able to optimize using a set-returning function like the illustrated one.
@OndřejBouda If you are reproducing CTE functionality in plpgsql then you will probably end up with the same fencing problem. A no fence alternative to CTE is just to build them as subqueries in the from clause or turn them into views. The CTE to view approach has the advantage to be easy to nest and reuse in the same query.
There's a great truth in your previous comment. It turns out that breaking the CTE-based view to several views helps to solve the problem. Thank you for showing the right path.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.