
The example below uses dask delayed functions from within PostgreSQL PL/Python and calls "plpy.execute" to query the database.

It returns an error:

ERROR: spiexceptions.StatementTooComplex: stack depth limit exceeded

Any idea what I'm doing wrong? I'm guessing it has something to do with the delayed function's asynchronous nature and plpy.execute not liking that.

Versions:

  • PostgreSQL 15
  • Python 3.8 (the version embedded in PostgreSQL)

Example:

DO
LANGUAGE plpython3u
$$

    # https://docs.dask.org/en/stable/dataframe-sql.html#delayed-functions
    from dask import delayed
    
    @delayed
    def do_it():
        rv = plpy.execute("select 2 as a") # << max stack depth limit
        return 0

    plpy.info(do_it().compute())

$$;

Traceback:

ERROR:  spiexceptions.StatementTooComplex: stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack depth limit is adequate.
CONTEXT:  Traceback (most recent call last):
  PL/Python anonymous code block, line 10, in <module>
    plpy.info(do_it().compute())
  PL/Python anonymous code block, line 313, in compute
  PL/Python anonymous code block, line 598, in compute
  PL/Python anonymous code block, line 88, in get
  PL/Python anonymous code block, line 510, in get_async
  PL/Python anonymous code block, line 318, in reraise
  PL/Python anonymous code block, line 223, in execute_task
  PL/Python anonymous code block, line 118, in _execute_task
  PL/Python anonymous code block, line 7, in do_it
    rv = plpy.execute("select 2 as a") # << max stack depth limit
PL/Python anonymous code block
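
The `get`, `get_async` and `execute_task` frames in the traceback suggest the task went through one of dask's schedulers, and dask's default scheduler for delayed objects is thread-based. One plausible reading (an assumption, not confirmed in this thread) is that `plpy.execute` was therefore called from a worker thread instead of the backend's main thread, which PostgreSQL's stack depth check may not handle. The sketch below runs outside Postgres and uses the standard library's `ThreadPoolExecutor` as a stand-in for a thread-pool scheduler; `report_thread` is a hypothetical placeholder for the body of `do_it`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def report_thread():
    # Hypothetical stand-in for the body of do_it():
    # returns the id of the thread it is running on.
    return threading.get_ident()


caller_id = threading.get_ident()

# Stand-in for a thread-pool scheduler: the task runs on a worker thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    pooled_id = pool.submit(report_thread).result()

# Stand-in for a synchronous scheduler: the task runs on the calling thread.
inline_id = report_thread()

print("pool task ran on caller thread:", pooled_id == caller_id)
print("inline task ran on caller thread:", inline_id == caller_id)
```

If the threading assumption holds, passing `scheduler="synchronous"` to `compute()` would keep the task on the calling thread. Whether that avoids the error inside PL/Python has not been verified here.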

Updates:

  • added traceback
  • made more minimal
  • Hi @Shadi, I'm just wondering why you would want to use Dask Delayed in Postgres' embedded Python? Code there is supposed to be minimal, right? Commented Feb 13, 2023 at 15:37
  • 1) to perform dask operations on postgres data without leaving the server, 2) yes Commented Feb 13, 2023 at 22:34
  • What I mean is, will you want to parallelize the code inside Postgres Python? Why don't you just write plain sequential Python code? Commented Feb 15, 2023 at 20:15
  • yes, to save the step of moving data out of the postgres server Commented Feb 15, 2023 at 20:45
  • Well, I understand that. My point is: do you really need Dask for a computation embedded in Postgres? This doesn't sound appropriate. Commented Feb 23, 2023 at 10:46
