Postgres has a great RETURNING clause for INSERT, DELETE and UPDATE...and it's made me a bit greedy. In a few cases, what I'd like to get is not only the current value, but the previous value:

    UPDATE analytic_productivity
       SET points = 1000
     WHERE points > 1000
    RETURNING id, points, OLD.points;

I don't believe there's any way to access previous values outside of the lifespan and context of a trigger. So, I'll guess what I'd like isn't possible as such. If that's right, can anyone suggest an alternative? I'm overwriting outliers with some set values, and would like to record the modified values in another table. This is why I don't know the current value in advance. This is a rare (and clearly suspect) operation, and I don't want to record the change on normal inserts and updates.

As an alternative, I'm thinking that I can select the outliers, revise them, and then write back the modifications. So, do most of the work on the client side with a couple of requests to Postgres. If so, can someone suggest the right locking level to apply between my initial SELECT and my following UPDATE? I believe that the FOR UPDATE lock is right.
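
A minimal sketch of that two-step approach, assuming the default READ COMMITTED isolation level and that both statements run inside one transaction. FOR UPDATE keeps the selected rows locked until the transaction ends, so their values cannot change between the SELECT and the UPDATE. It does not stop other sessions from committing new rows above the threshold in the meantime.

```sql
BEGIN;

-- Step 1: lock the outlier rows and read their current values.
-- The row locks are held until COMMIT or ROLLBACK.
SELECT id, points
  FROM analytic_productivity
 WHERE points > 1000
   FOR UPDATE;

-- Step 2 (client side): record the returned id/points pairs.

-- Step 3: overwrite the outliers. Under READ COMMITTED this statement takes a
-- fresh snapshot, so it may also match rows that crossed the threshold after
-- step 1; filtering on the ids returned by step 1 instead avoids that.
UPDATE analytic_productivity
   SET points = 1000
 WHERE points > 1000;

COMMIT;
```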

Any suggestions on a smart way to capture previous values, during an update, without a trigger would be great to hear about.

Follow-up

Thanks to comments here, I experimented a bit and came up with a solution that works in my case. To make my objectives clearer:

  • I've got a table named outlier_rule that defines values that are too high for a specific column.
  • The goal is to loop over the table, and apply the rules to set outliers to a fixed value.
  • Stomping on outliers like this is...questionable. There must be leaks in the app's UI that allow for unreasonable values. To help track these down, I'm recording the large values in a table named outlier_change.
  • I'd like to push this behavior into a server-side function so that any of our servers, regardless of their codebase version, can invoke the current logic.
  • The client servers compose and send an email with a result summary, when outliers are found and corrected.

So, a server-side function to do everything, log some data, and return a result. I've got that working, but it's got the smell of You Don't Know What You're Doing So Just Keep Adding Code Until it Works. I've at least got a better handle on using FORMAT, and I think I understand now that a single function can do many things and that you choose what it returns with RETURN QUERY. For reference, the various bits of code:

    CREATE TABLE IF NOT EXISTS data.outlier_rule (
        id uuid NOT NULL DEFAULT extensions.gen_random_uuid(),
        schema_name text NOT NULL DEFAULT NULL,
        table_name text NOT NULL DEFAULT NULL,
        column_name text NOT NULL DEFAULT NULL,
        threshold integer,
        set_to integer,

    CONSTRAINT outlier_rule_id_pkey
        PRIMARY KEY (schema_name,table_name,column_name)
    );

For tracking the modifications, I've got a second table named outlier_change:

    ------------------------------
    -- Table
    ------------------------------
    DROP TABLE IF EXISTS data.outlier_change CASCADE;

    CREATE TABLE IF NOT EXISTS data.outlier_change (
        id uuid NOT NULL DEFAULT NULL,
        outlier_rule_id uuid NOT NULL DEFAULT NULL,
        value_was integer NOT NULL DEFAULT NULL,
        set_to integer NOT NULL DEFAULT NULL,
        change_count integer NOT NULL DEFAULT 0,
        last_changed_dts timestamptz NOT NULL DEFAULT NOW(),

    CONSTRAINT outlier_change_id_pkey
        PRIMARY KEY (id,outlier_rule_id)
    );

    ALTER TABLE data.outlier_change OWNER TO user_change_structure;

    ------------------------------
    -- Trigger Function
    ------------------------------
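    CREATE OR REPLACE FUNCTION data.on_outlier_change_upsert()
      RETURNS pg_catalog.trigger AS $BODY$
    BEGIN

        NEW.last_changed_dts := NOW();

        -- OLD is only populated for UPDATE. On INSERT, keep the supplied or
        -- default change_count instead of reading from an unassigned OLD.
        IF TG_OP = 'UPDATE' THEN
            NEW.change_count := OLD.change_count + 1;
        END IF;

        RETURN NEW;          -- important!

    END;

    $BODY$
      LANGUAGE plpgsql VOLATILE
      COST 100;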

    ------------------------------
    -- Trigger
    ------------------------------
    CREATE TRIGGER outlier_change_upsert BEFORE INSERT OR UPDATE ON data.outlier_change
    FOR EACH ROW
    EXECUTE PROCEDURE data.on_outlier_change_upsert();

    ------------------------------
    -- Function
    ------------------------------
    DROP FUNCTION IF EXISTS data.outlier_fix ();
    CREATE OR REPLACE FUNCTION data.outlier_fix ()

    RETURNS TABLE (
        schema_name  text,
        table_name   text,
        column_name  text,
        id           uuid,
        value_was    integer,
        set_to       integer,
        change_count integer
    )

    AS $$

    DECLARE
        rule record;
        now_ timestamptz = NOW();  -- transaction start time; matches NOW() in the trigger

    BEGIN

        FOR rule IN SELECT * FROM data.outlier_rule LOOP

            -- Log the outliers first, then overwrite them in the target table.
            EXECUTE FORMAT (
               'INSERT INTO outlier_change (
                            outlier_rule_id,
                            set_to,
                            id,
                            value_was)

                     SELECT %6$L,
                            %5$s,
                            %2$I.id,
                            %2$I.%3$I

                       FROM %1$I.%2$I

                      WHERE %3$I > %4$s

                ON CONFLICT (id,outlier_rule_id) DO UPDATE SET
                            value_was = EXCLUDED.value_was,
                            set_to    = EXCLUDED.set_to

                  RETURNING outlier_rule_id,
                            id,
                            value_was,
                            set_to,
                            change_count;

                     UPDATE %1$I.%2$I
                        SET %3$I = %5$s
                      WHERE %3$I > %4$s;',

                    rule.schema_name,
                    rule.table_name,
                    rule.column_name,
                    rule.threshold,
                    rule.set_to,
                    rule.id);

        END LOOP;

        -- Return every change row stamped during this transaction.
        RETURN QUERY EXECUTE ('
            SELECT outlier_rule.schema_name,
                   outlier_rule.table_name,
                   outlier_rule.column_name,
                   outlier_change.id,
                   outlier_change.value_was,
                   outlier_change.set_to,
                   outlier_change.change_count

              FROM outlier_change
              JOIN outlier_rule ON (outlier_rule.id = outlier_change.outlier_rule_id)

             WHERE last_changed_dts = $1')
           USING now_;

    END;
    $$ LANGUAGE plpgsql;

    ALTER FUNCTION data.outlier_fix() OWNER TO user_bender;
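
A sketch of how the function might be called, under a few assumptions not stated above: that analytic_productivity lives in the data schema and has a uuid id column, and that search_path must include data because the function refers to outlier_change and outlier_rule without a schema prefix. The rule values are illustrative only.

```sql
BEGIN;

-- Assumption: the unqualified table names inside outlier_fix() resolve via search_path.
SET LOCAL search_path TO data, public;

-- Illustrative rule: cap analytic_productivity.points at 1000.
-- Assumes the table is data.analytic_productivity with a uuid id column.
INSERT INTO data.outlier_rule (schema_name, table_name, column_name, threshold, set_to)
VALUES ('data', 'analytic_productivity', 'points', 1000, 1000)
ON CONFLICT DO NOTHING;

-- Apply every rule and get back one row per value that was overwritten.
SELECT * FROM data.outlier_fix();

COMMIT;
```

Because the result is filtered on the transaction's NOW(), the call returns only the rows changed in this transaction.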

3 Comments

  • "I would like to record the modified values in another table" - why not do that before updating them? Just INSERT INTO another_table SELECT * FROM analytic_productivity WHERE points > 1000;, then run your UPDATE statement without caring for OLD values. Sure, you'd need to duplicate the outlier condition, but I doubt that matters. Commented Oct 12, 2019 at 22:46
  • Thanks, I've taken a stab at this and have hit another wall, but I've been learning along the way. Commented Oct 13, 2019 at 4:30
  • Oh, you still need the data back in the calling script, you don't just want to insert them in that table? Then I guess there's no way around @Mehrdad's approach. Commented Oct 13, 2019 at 4:37

1 Answer

You could achieve that with a bit of a hack. You can self join the table in your update query like this:

    UPDATE analytic_productivity NEW
       SET points = 1000
      FROM analytic_productivity OLD
     WHERE NEW.points > 1000
       AND NEW.id = OLD.id
    RETURNING NEW.id,
              NEW.points,
              OLD.points AS old_points;
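
One possible caveat with the plain self-join, as far as I understand it: the joined copy is read from the statement's snapshot without a lock, so under concurrent writes the "old" value it returns may not be the value that was actually overwritten. A sketch of a variant that locks the rows in a subquery and logs the previous values in the same statement follows. The points_audit table and the uuid type of id are assumptions made for illustration.

```sql
-- points_audit is a hypothetical log table used only for this sketch.
CREATE TABLE IF NOT EXISTS points_audit (
    id        uuid    NOT NULL,
    value_was integer NOT NULL,
    set_to    integer NOT NULL
);

WITH updated AS (
    UPDATE analytic_productivity AS cur
       SET points = 1000
      FROM (SELECT id, points
              FROM analytic_productivity
             WHERE points > 1000
               FOR UPDATE) AS prev      -- lock the rows before reading old values
     WHERE cur.id = prev.id
 RETURNING cur.id,
           prev.points AS value_was,
           cur.points  AS set_to
)
INSERT INTO points_audit (id, value_was, set_to)
SELECT id, value_was, set_to
  FROM updated
RETURNING id, value_was, set_to;
```

The final RETURNING hands the logged rows back to the caller, so the old values are both recorded and returned in one round trip.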