0

I use a for loop in a script to compute geometries according to values in a series and insert the results into a table (t1). Is there a way to rewrite the loop below using parallelism? What approach should I use? I found many docs on how PostgreSQL runs queries in parallel, but hardly anything on how I could perform such a task in my case. Is it even possible using SQL? Or should I rely on other languages like Python and execute the SQL queries from there?

DO $$
DECLARE
  lims int[];
  lim int;
BEGIN
  SELECT ARRAY(SELECT generate_series(60, 300, 60)) INTO lims;
  CREATE TABLE t1(
    lim integer,
    the_geom geometry(polygon, 4326)
  );
  FOREACH lim IN ARRAY lims
  LOOP
    WITH conv_hull AS (
      SELECT ST_ConcaveHull(...I use lim here...) AS conv_hull
    )
    INSERT INTO t1 (lim, the_geom) SELECT lim, ST_SetSRID(conv_hull.conv_hull, 4326) FROM conv_hull;
  END LOOP;
END $$;

2 Answers

3

Parallelism in Postgres is something that the query planner does for you to process big, qualifying SQL statements. You cannot launch parallel operations on demand in PL/pgSQL.

You could rewrite your example to replace the PL/pgSQL loop with a set-based solution in plain SQL, like:

BEGIN;

CREATE TABLE t1(
  lim integer
, the_geom geometry(polygon, 4326)
);

WITH conv_hull AS (
   SELECT ST_ConcaveHull(...) AS conv_hull
   )
INSERT INTO t1 (lim, the_geom)
SELECT generate_series(1,10), ST_SetSRID(conv_hull.conv_hull, 4326)
FROM   conv_hull;

COMMIT;

If your server is configured properly, the query gets big enough, and all involved operations allow parallelism, then the planner will schedule multiple worker processes - and actually fire them up if resources are available.

generate_series() is PARALLEL SAFE, for starters. Read details in the manual chapter "Parallel Query".
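
If the planner does not parallelize the statement, the alternative raised in the question, driving the queries from a client language, remains possible: each connection is served by its own backend process, so statements sent over separate connections can run concurrently. A minimal sketch in Python, not a tested solution: `run_for_each_limit` and `insert_hull` are hypothetical names, the psycopg2 driver and the DSN are assumptions, and the SQL text is left for the caller to supply with a `%(lim)s` placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_each_limit(limits, run_one, max_workers=4):
    """Call run_one(lim) for every limit concurrently and return {lim: result}."""
    limits = list(limits)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with limits.
        results = list(pool.map(run_one, limits))
    return dict(zip(limits, results))


def insert_hull(lim, sql, dsn="dbname=mydb"):
    """Run one INSERT for a single limit on its own connection.

    Assumptions: psycopg2 is installed, the DSN is a placeholder, and `sql`
    is the caller's INSERT ... SELECT containing a %(lim)s parameter.
    """
    import psycopg2  # imported lazily so the helper above works without it

    conn = psycopg2.connect(dsn)
    try:
        # "with conn" commits on success and rolls back on error.
        with conn, conn.cursor() as cur:
            cur.execute(sql, {"lim": lim})
        return lim
    finally:
        conn.close()
```

With `t1` created beforehand, a call could look like `run_for_each_limit(range(60, 301, 60), lambda lim: insert_hull(lim, my_sql))`, where `my_sql` stands for the caller's own statement. Whether this is faster than a single set-based statement would have to be measured.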

2 Comments

Thanks, the thing is that I need the lim value in the "WITH conv_hull AS (SELECT ST_ConcaveHull(...) AS conv_hull)" statement. It was not clear in my question, so I edited it...
@Marc: If you move the generate_series() into the FROM part, you can access the generated number like a column.

0

Alright, thanks! I got it working with the full statement:

SELECT ST_ConcaveHull(the_geom, 0.95) FROM (
  SELECT ST_Union(b.the_geom) AS the_geom
  FROM generate_series(60, 300, 60) AS lim, pgr_drivingDistance(
    'SELECT id, source, target, cost, reverse_cost FROM edge_net', vid, lim, true
  ) a, nodes b WHERE a.node = b.vid
  GROUP BY lim
) AS conv_hull

Same result as the for loop now. However, I don't get a significant speed improvement for the computation, which was the whole point of looking for another solution. Probably, parallel processing is not triggered for some reason (although it's configured in the PG conf file)...
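
One way to check whether parallel workers were actually used is to run the statement under `EXPLAIN (ANALYZE)` and look for `Gather` or `Gather Merge` nodes together with their `Workers Planned` and `Workers Launched` lines. A small sketch of that check in Python, assuming the plan has been captured as plain text; `parallel_summary` is a hypothetical helper name.

```python
import re


def parallel_summary(plan_text):
    """Count parallel-related markers in EXPLAIN (ANALYZE) text output."""
    # Gather and Gather Merge are the plan nodes that collect worker output.
    gather_nodes = len(re.findall(r"\bGather(?: Merge)?\b", plan_text))
    planned = sum(int(n) for n in re.findall(r"Workers Planned:\s*(\d+)", plan_text))
    launched = sum(int(n) for n in re.findall(r"Workers Launched:\s*(\d+)", plan_text))
    return {
        "gather_nodes": gather_nodes,
        "workers_planned": planned,
        "workers_launched": launched,
    }
```

If all three values come back as zero for the plan of the statement above, the plan is entirely serial, and the next step would be to check the configuration and the parallel-safety markings of the functions involved.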
