I have a function to insert data from one table to another:

    $BODY$
    BEGIN
        INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
        SELECT DISTINCT (c.uid), c.queue_id, c.connected, c.callerid2
        FROM public.calls c
        WHERE c.connected IS NOT NULL;
        RETURN;
    EXCEPTION WHEN unique_violation THEN NULL;
    END;
    $BODY$

And the structure of the destination table:

    CREATE TABLE backups.nc_calls_id
    (
        uid character(30) NOT NULL,
        queue_id integer,
        callerid2 text,
        connected timestamp without time zone,
        id serial NOT NULL,
        CONSTRAINT calls2_pkey PRIMARY KEY (uid)
    )
    WITH (
        OIDS=FALSE
    );

When I first executed this query, everything went OK: 200,000 rows were inserted into the new table, each with a unique id. But now, when I execute it again, no rows are inserted.

  • Have you checked if the values in uid are being repeated? Commented Jun 9, 2013 at 8:46
  • DISTINCT and DISTINCT ON are two different functions. Which one do you need? Use RAISE INFO inside the exception handling to see what is going on. Commented Jun 9, 2013 at 8:52
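For reference, the RAISE INFO suggestion from the second comment could look like the following sketch in a standalone DO block (the message wording is an assumption, not from the original post):

    DO $$
    BEGIN
        INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
        SELECT DISTINCT (c.uid), c.queue_id, c.connected, c.callerid2
        FROM public.calls c
        WHERE c.connected IS NOT NULL;
    EXCEPTION WHEN unique_violation THEN
        -- Report why the statement was abandoned instead of
        -- silently swallowing the error.
        RAISE INFO 'insert aborted: %', SQLERRM;
    END;
    $$;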

1 Answer


From the rather minimalist description given (no PostgreSQL version, no CREATE FUNCTION statement showing the parameters, no other table structure, no function invocation), I'm guessing that you're attempting a merge: insert a row only if it doesn't already exist, skipping rows that do.

What the above function will actually do is skip all rows if any single row already exists: a unique_violation aborts the entire multi-row INSERT, and the exception handler then discards the whole statement's work, not just the offending row.

You need to either use a loop to do the inserts within individual BEGIN ... EXCEPTION blocks (slow), or LOCK the destination table and do an INSERT INTO ... SELECT ... FROM sourcetable WHERE NOT EXISTS (SELECT 1 FROM desttable WHERE desttable.key = sourcetable.key).

The INSERT INTO ... SELECT ... WHERE NOT EXISTS method will perform a lot better, but it will fail if more than one copy runs concurrently or if anything else inserts into the destination table at the same time. LOCKing the destination table before running it makes sure it's safe.
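Applied to the tables in the question, that could look something like this sketch (note DISTINCT ON instead of plain DISTINCT, since uid is the primary key and must appear only once; without an ORDER BY, which source row wins per uid is arbitrary):

    BEGIN;
    -- Block concurrent writers so the NOT EXISTS check stays
    -- valid until this transaction commits; readers are unaffected.
    LOCK TABLE backups.calls2 IN EXCLUSIVE MODE;

    INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
    SELECT DISTINCT ON (c.uid) c.uid, c.queue_id, c.connected, c.callerid2
    FROM public.calls c
    WHERE c.connected IS NOT NULL
      AND NOT EXISTS (
          SELECT 1 FROM backups.calls2 b WHERE b.uid = c.uid
      );

    COMMIT;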

The PL/PgSQL looping BEGIN ... EXCEPTION method sounds nice and safe at first glance. Then you think about what happens when you run two of them at once. One will insert some keys first, the other will insert other keys first, so the values are split between them. That's OK; together they make up the full set. But what if only one of them commits and the other fails for some reason? You'll be left with an oddly sparse, partially inserted result. For that reason it's probably best to lock the destination table with this approach too ... in which case you might as well use the vastly more efficient single-pass INSERT with the subquery-based uniqueness check.
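For comparison, a minimal sketch of that looping approach in a DO block (again picking one source row per uid with DISTINCT ON, which is an assumption about the intended semantics):

    DO $$
    DECLARE
        r record;
    BEGIN
        FOR r IN
            SELECT DISTINCT ON (c.uid) c.uid, c.queue_id, c.connected, c.callerid2
            FROM public.calls c
            WHERE c.connected IS NOT NULL
        LOOP
            BEGIN
                INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
                VALUES (r.uid, r.queue_id, r.connected, r.callerid2);
            EXCEPTION WHEN unique_violation THEN
                -- The inner block's implicit savepoint rolls back only
                -- this one INSERT; the loop moves on to the next row.
                NULL;
            END;
        END LOOP;
    END;
    $$;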


3 Comments

Yes, this is exactly what I was looking for, but I didn't know how to write it in a LOOP.
@infaustus, I think using NOT EXISTS is the better and faster option here. With a loop, your function will take longer and longer to run as your data grows.
@vyegorov Totally agree. SAVEPOINTs (as used under the hood by BEGIN ... EXCEPTION blocks) are a lot cheaper than they used to be, but it's still better not to use them. In addition to the cost of the savepoint, you've got the cost of all those individual inserts and index lookups instead of efficiently batching everything together.
