I have a function to insert data from one table to another:

    $BODY$
    BEGIN
        INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
        SELECT DISTINCT (c.uid), c.queue_id, c.connected, c.callerid2
        FROM public.calls c
        WHERE c.connected IS NOT NULL;
        RETURN;
    EXCEPTION WHEN unique_violation THEN NULL;
    END;
    $BODY$

And the structure of the destination table:

    CREATE TABLE backups.nc_calls_id
    (
        uid character(30) NOT NULL,
        queue_id integer,
        callerid2 text,
        connected timestamp without time zone,
        id serial NOT NULL,
        CONSTRAINT calls2_pkey PRIMARY KEY (uid)
    )
    WITH (
        OIDS=FALSE
    );

When I first executed this query, everything went OK: 200,000 rows were inserted into the new table, each with a unique id. But now, when I execute it again, no rows are inserted.

  • Have you checked if the values in uid are being repeated? Commented Jun 9, 2013 at 8:46
  • DISTINCT and DISTINCT ON are two different functions. Which one do you need? Use RAISE INFO inside the exception handling to see what is going on. Commented Jun 9, 2013 at 8:52
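For reference, the RAISE INFO suggestion from the second comment could look like the following sketch in a standalone DO block (the message wording is an assumption, not from the original post):

    DO $$
    BEGIN
        INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
        SELECT DISTINCT (c.uid), c.queue_id, c.connected, c.callerid2
        FROM public.calls c
        WHERE c.connected IS NOT NULL;
    EXCEPTION WHEN unique_violation THEN
        -- Report why the statement was abandoned instead of
        -- silently swallowing the error.
        RAISE INFO 'insert aborted: %', SQLERRM;
    END;
    $$;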

1 Answer


From the rather minimalist description given (no PostgreSQL version, no CREATE FUNCTION statement showing the parameters, no other table structure, no function invocation), I'm guessing that you're attempting a merge: insert a row only if it doesn't already exist, skipping rows that do.

What the above function will actually do is skip all rows if any single row already exists: a unique_violation aborts the entire multi-row INSERT, and the exception handler then discards the whole statement's work, not just the offending row.

You need to either use a loop to do the inserts within individual BEGIN ... EXCEPTION blocks (slow), or LOCK the destination table and do an INSERT INTO ... SELECT ... FROM sourcetable WHERE NOT EXISTS (SELECT 1 FROM desttable WHERE desttable.key = sourcetable.key).

The INSERT INTO ... SELECT ... WHERE NOT EXISTS method will perform a lot better, but it will fail if more than one copy runs concurrently or if anything else inserts into the destination table at the same time. LOCKing the destination table before running it makes sure it's safe.
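Applied to the tables in the question, that could look something like this sketch (note DISTINCT ON instead of plain DISTINCT, since uid is the primary key and must appear only once; without an ORDER BY, which source row wins per uid is arbitrary):

    BEGIN;
    -- Block concurrent writers so the NOT EXISTS check stays
    -- valid until this transaction commits; readers are unaffected.
    LOCK TABLE backups.calls2 IN EXCLUSIVE MODE;

    INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
    SELECT DISTINCT ON (c.uid) c.uid, c.queue_id, c.connected, c.callerid2
    FROM public.calls c
    WHERE c.connected IS NOT NULL
      AND NOT EXISTS (
          SELECT 1 FROM backups.calls2 b WHERE b.uid = c.uid
      );

    COMMIT;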

The PL/PgSQL looping BEGIN ... EXCEPTION method sounds nice and safe at first glance. Then you think about what happens when you run two of them at once. One will insert some keys first, the other will insert other keys first, so the values are split between them. That's OK; together they make up the full set. But what if only one of them commits and the other fails for some reason? You'll be left with an oddly sparse, partially inserted result. For that reason it's probably best to lock the destination table with this approach too ... in which case you might as well use the vastly more efficient single-pass INSERT with the subquery-based uniqueness check.
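For comparison, a minimal sketch of that looping approach in a DO block (again picking one source row per uid with DISTINCT ON, which is an assumption about the intended semantics):

    DO $$
    DECLARE
        r record;
    BEGIN
        FOR r IN
            SELECT DISTINCT ON (c.uid) c.uid, c.queue_id, c.connected, c.callerid2
            FROM public.calls c
            WHERE c.connected IS NOT NULL
        LOOP
            BEGIN
                INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
                VALUES (r.uid, r.queue_id, r.connected, r.callerid2);
            EXCEPTION WHEN unique_violation THEN
                -- The inner block's implicit savepoint rolls back only
                -- this one INSERT; the loop moves on to the next row.
                NULL;
            END;
        END LOOP;
    END;
    $$;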


3 Comments

Yes, this is exactly what I was looking for, but I didn't know how to write it in a LOOP.
@infaustus, I think using NOT EXISTS is the better and faster option here. With a loop, your function will take longer and longer to run as your data grows.
@vyegorov Totally agree. SAVEPOINTs (as used under the hood by BEGIN ... EXCEPTION blocks) are a lot cheaper than they used to be, but it's still better not to use them. In addition to the cost of the savepoint, you've got the cost of all those individual inserts and index lookups instead of efficiently batching everything together.
