8

The premise

In documentation, Row Level Security seems great. Based on what I've read I can now stop creating views like this:

SELECT data.*
FROM data
JOIN user_data
ON data.id = user_data.data_id
AND user_data.role = CURRENT_ROLE

The great part is, Postgres has a great analysis for that view starting with an index scan then a hash join on the user_data table, exactly what we want to happen because it's crazy fast. Compare that with a my RLS implementation:

CREATE POLICY data_owner
ON data
FOR ALL
TO user
USING (
  (
    SELECT TRUE AS BOOL FROM (
      SELECT data_id FROM user_data WHERE user_role = CURRENT_USER
    ) AS user_data WHERE user_data.data_id = data.id
  ) = true
)
WITH CHECK (TRUE);

This bummer of a policy executes the condition for each row in the data table, instead of optimizing by scoping the query to the rows which our CURRENT_USER has access to, like our view does. To be clear, that means select * from data hits every row in the data table.

The question

How do I write a policy with an inner select which doesn't test said select on every row in the target table. Said another way: how do I get RLS to run my policy on the target table before running the actual query on the result?

p.s. I've left this question someone vague and fiddle-less, mostly because sqlfiddle hasn't hit 9.5 yet. Let me know if I need to add more color or some gists to get my question across.

1
  • 2
    I stopped using SQLFiddle because it essentially never works for me. For Postgres I use: rextester.com/l/postgresql_online_compiler Commented Mar 3, 2017 at 7:29

1 Answer 1

7

PostgreSQL may be able to generate a better plan if you phrase the policy like this:

...
USING (EXISTS
          (SELECT data_id
           FROM user_data
           WHERE user_data.data_id = data.id
             AND role = current_user
          )
      )

You should have a (PRIMARY KEY?) index ON user_data (role, data_id) to speed up nested loop joins.

But I think that it would be a better design to include the permission information in the data table itself, perhaps using the name[] type:

CREATE TABLE data(
   id integer PRIMARY KEY,
   val text,
   acl name[] NOT NULL
);

INSERT INTO data VALUES (1, 'one',   ARRAY[name 'laurenz', name 'advpg']);
INSERT INTO data VALUES (2, 'two',   ARRAY[name 'advpg']);
INSERT INTO data VALUES (3, 'three', ARRAY[name 'laurenz']);

Then you can use a policy like this:

CREATE POLICY data_owner ON data FOR ALL TO PUBLIC
   USING (acl @> ARRAY[current_user::name])
   WITH CHECK (TRUE);
ALTER TABLE data ENABLE ROW LEVEL SECURITY;
ALTER TABLE data FORCE ROW LEVEL SECURITY;

When I SELECT, I get only the rows for which I have permission:

SELECT id, val FROM data;
 id |  val
----+-------
  1 | one
  3 | three
(2 rows)

You can define a GIN index to support that condition:

CREATE INDEX ON data USING gin (acl _name_ops);
Sign up to request clarification or add additional context in comments.

2 Comments

Is it appropriate to use name[] here? The Postgres docs indicate that designers should not use name[], as it is only for system catalogues. From the docs: "There are two other fixed-length character types in PostgreSQL, shown in Table 8-5. The name type exists only for the storage of identifiers in the internal system catalogs and is not intended for use by the general user." Link here: postgresql.org/docs/9.0/static/datatype-character.html
name should be fine, as only PostgreSQL identifiers are stored. But it is just as well to use text.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.