Using dask delayed function from within postgresql plpython with "plpy.execute"

Asked 2 years, 9 months ago

Modified 2 years, 9 months ago

Viewed 42 times

The example below demonstrates using dask delayed functions (ref) from within postgres plpython while using "plpy.execute" (ref) to query the database.

It returns an error:

ERROR: spiexceptions.StatementTooComplex: stack depth limit exceeded

Any idea what I'm doing wrong? I'm guessing it has something to do with the delayed function's async nature and plpy.execute not liking that.

Versions:

postgresql 15
postgres's embedded python version 3.8

Example:

DO
LANGUAGE plpython3u
$$

    # https://docs.dask.org/en/stable/dataframe-sql.html#delayed-functions
    from dask import delayed
    
    @delayed
    def do_it():
        rv = plpy.execute("select 2 as a") # << max stack depth limit
        return 0

    plpy.info(do_it().compute())

$$;

Traceback:

ERROR:  spiexceptions.StatementTooComplex: stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack depth limit is adequate.
CONTEXT:  Traceback (most recent call last):
  PL/Python anonymous code block, line 10, in <module>
    plpy.info(do_it().compute())
  PL/Python anonymous code block, line 313, in compute
  PL/Python anonymous code block, line 598, in compute
  PL/Python anonymous code block, line 88, in get
  PL/Python anonymous code block, line 510, in get_async
  PL/Python anonymous code block, line 318, in reraise
  PL/Python anonymous code block, line 223, in execute_task
  PL/Python anonymous code block, line 118, in _execute_task
  PL/Python anonymous code block, line 7, in do_it
    rv = plpy.execute("select 2 as a") # << max stack depth limit
PL/Python anonymous code block
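
One way to probe the guess above: the traceback passes through `get` and `get_async` before reaching `do_it`, which suggests dask handed the task to a worker thread instead of running it in the calling thread. The sketch below is not dask and not PL/Python; it uses only the standard library as a stand-in (`ThreadPoolExecutor` plays the role of a thread-based scheduler, and `current_thread_id` is a hypothetical helper) to show the difference between running a function inline and dispatching it to a pool. The link to the error is an assumption: a PostgreSQL backend is generally expected to call into the database from its main thread only, so a call arriving from another thread could plausibly confuse the stack depth check.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_id():
    """Return the identifier of the thread that executes this call."""
    return threading.get_ident()


# Identifier of the thread that is running this script (the "caller").
caller_id = threading.get_ident()

# Inline call: the function body runs on the calling thread.
inline_id = current_thread_id()

# Pooled call: the function body runs on a worker thread owned by the pool,
# which is how a thread-based scheduler would typically execute a task.
with ThreadPoolExecutor(max_workers=1) as pool:
    pooled_id = pool.submit(current_thread_id).result()

print("inline call ran on caller thread:", inline_id == caller_id)
print("pooled call ran on caller thread:", pooled_id == caller_id)
```

If this is indeed the cause, asking dask to run the task on the calling thread, for example with `do_it().compute(scheduler="synchronous")`, may avoid the error. That has not been verified against the setup in this question.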

Updates:

added traceback
made more minimal

edited Feb 10, 2023 at 15:24

asked Feb 9, 2023 at 22:59

Shadi

Hi @Shadi, I'm just wondering why you would want to use Dask Delayed inside Postgres' embedded Python. Code there is supposed to be minimal, right?
– Guillaume EB, Feb 13, 2023 at 15:37

1) to perform dask operations on postgres data without leaving the server, 2) yes
– Shadi, Feb 13, 2023 at 22:34

What I mean is, do you want to parallelize the code inside Postgres Python? Why don't you just write plain sequential Python code?
– Guillaume EB, Feb 15, 2023 at 20:15

yes, to save the step of moving data out of the postgres server
– Shadi, Feb 15, 2023 at 20:45

Well, I understand that. My point is: do you really need Dask for a computation embedded in Postgres? This doesn't sound appropriate.
– Guillaume EB, Feb 23, 2023 at 10:46
