1

I have created a PL/pgSQL function that accepts two column names, a "relation", and two table names. It finds distinct rows in one table and inserts them in to a temporary table, deletes any row with a null value, and sets all values of one column to relation. I have the first part of the process using this function.

create or replace function alt_edger(s text, v text, relation text, tbl text, tbl_src text)
    returns void
    language plpgsql as
$func$
begin
    raise notice 's: %, v: %, tbl: %, tbl_src: %', s,v,tbl,tbl_src;
    execute ('insert into '||tbl||' ("source", "target") select distinct "'||s||'","'||v||'" from '||tbl_src||'');
    execute ('DELETE FROM '||tbl||' WHERE "source" IS null or "target" is null'); 
end
$func$;

It is executed as follows:

-- create a temporary table and execute the function twice
drop table if exists temp_stack;
create temporary table temp_stack("label" text, "source" text, "target" text, "attr" text, "graph" text);
 
select alt_edger('x_x', 'y_y', ':associated_with', 'temp_stack','pg_check_table' );
select alt_edger('Document Number', 'x_x', ':documents', 'temp_stack','pg_check_table' );

select * from temp_stack;

Note that I didn't use relation, yet. The INSERT shall also assign relation, but I can't figure out how to make that happen to get something like:

label source target attr graph
:associated_with 638000 ARAS
:associated_with 202000 JASE
:associated_with 638010 JASE
:associated_with 638000 JASE
:associated_with 202100 JASE
:documents A 638010
:documents A 202000
:documents A 202100
:documents B 638000
:documents A 638000
:documents B 124004
:documents B 202100

My challenges are:

  • How to integrate relation in the INSERT? When I try to use VALUES and comma separation I get an "error near select".
  • How to allow strings starting with ":" in relation? I'm anticipating here, the inclusion of the colon has given me challenges in the past.

How can I do this? Or is there a better approach?

Toy data model:

drop table if exists pg_check_table;
create temporary table pg_check_table("Document Number" text, x_x int, y_y text);
insert into pg_check_table values ('A',202000,'JASE'),
('A',202100,'JASE'),
('A',638010,'JASE'),
('A',Null,'JASE'),
('A',Null,'JASE'),
('A',202100,'JASE'),
('A',638000,'JASE'),
('A',202100,'JASE'),
('B',638000,'JASE'),
('B',202100,null),
('B',638000,'JASE'),
('B',null,'ARAS'),
('B',638000,'ARAS'),
('B',null,'ARAS'),
('B',638000,null),
('B',124004,null);
alter table pg_check_table add row_num serial;
select * from pg_check_table;
2
  • What have you tried, and did you run into any errors ? pls share more details. Commented Mar 8, 2022 at 18:10
  • @Anand: This question shares useful details. Way above average. Commented Mar 9, 2022 at 2:24

1 Answer 1

2
-- DROP FUNCTION alt_edger(_s text, _v text, _relation text, _tbl text, _tbl_src text)
CREATE OR REPLACE FUNCTION alt_edger(_s text, _v text, _relation text, _tbl text, _tbl_src text, OUT row_count int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sql text := format(
       'INSERT INTO pg_temp.%3$I (label, source, target)
        SELECT DISTINCT $1, %1$I, %2$I FROM pg_temp.%4$I
        WHERE (%1$I, %2$I) IS NOT NULL'
      , _s, _v, _tbl, _tbl_src);
BEGIN
   -- RAISE NOTICE '%', _sql;  -- debug
   EXECUTE _sql USING _relation;
   GET DIAGNOSTICS row_count = ROW_COUNT;  -- return number of inserted rows
END
$func$;

db<>fiddle here

Most importantly, use format() to concatenate your dynamic SQL commands safely. And use the format specifier %I for identifiers. This way, SQL injection is not possible and identifiers are double-quoted properly - preserving non-standard names like Document Number. That's where your original failed.

We could concatenate _relation as string to be inserted into label, too. But the preferable way to pass values to EXECUTE is with the USING clause. $1 inside the SQL string passed to EXECUTE is a placeholder for the first USING argument. Not to be confused with $1 referencing function parameters in the context of the function body outside EXECUTE! (You can pass any string, leading colon (:) does not matter, the string is not interpreted when done right.) See:

I replaced the DELETE in your original with a WHERE clause to the SELECT of the INSERT. Don't insert rows in the first place, instead of deleting them again later.

(%1$I, %2$I) IS NOT NULL only qualifies when both values are NOT NULL. About that:

Don't use the prefix "pg_" for your table names. That's what Postgres uses for system tables. Don't mess with those.

I schema-qualify known temporary tables with pg_temp. That's typically optional as the temporary schema comes first in the search_path by default. But that can be changed (maliciously), and then the table name would resolve to any existing regular table of the same name in the search_path. So better safe than sorry. See:

I made the function return the number of inserted rows. That's totally optional!
Since I do that with an OUT parameter, I am allowed to skip the RETURNS clause. See:

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.