I have a simple update query (on two large tables) that never finishes.

UPDATE transit_edge te1 SET dates_to_add =
(   SELECT ARRAY_AGG(date)
    FROM transit_edge te2 LEFT OUTER JOIN calendar_dates cd2 ON (te2.service_id = cd2.service_id AND cd2.exception_type = 1)
    WHERE te2.transit_edge_id = te1.transit_edge_id
);

If I only run the inner query with a given id, I get the correct result.

SELECT ARRAY_AGG(date) 
FROM transit_edge te2 LEFT OUTER JOIN calendar_dates cd2 ON (te2.service_id = cd2.service_id AND cd2.exception_type = 1) 
WHERE te2.transit_edge_id = 282956

The row counts of both tables are rather high:

select count(*) from transit_edge;
count
---------
9187885

select count(*) from calendar_dates;
count
----------
10025969

I also updated postgresql.conf to allow more memory usage.

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 2GB   
work_mem = 200MB   
checkpoint_segments = 3
max_connections = 100 
maintenance_work_mem = 64MB

I ran the inner query with a limit of 100 and got the following error message:

ERROR:  invalid memory alloc request size 1073741824

Any help is kindly appreciated! Daniel

5 Comments

  • EXPLAIN output for the query that never finishes? Commented Sep 29, 2014 at 8:37
  • Tried that, but I also get no output for explain. Commented Sep 29, 2014 at 8:37
  • No output for EXPLAIN? Without ANALYZE? That suggests it's getting stuck in planning, and that shouldn't happen. Exact PostgreSQL version from SELECT version()? Commented Sep 29, 2014 at 8:39
  • PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit Commented Sep 29, 2014 at 8:57
  • ... and a simple EXPLAIN UPDATE ... hangs indefinitely? Is there anything relevant in pg_locks? Is the postgres backend (identified by running SELECT pg_backend_pid() before running the EXPLAIN UPDATE ...) using 100% CPU? Commented Sep 29, 2014 at 8:59
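
The checks suggested in the last comment could look roughly like this, run from a second session while the EXPLAIN UPDATE is hanging. This is only a sketch: 12345 is a placeholder for whatever pg_backend_pid() returned in the stuck session, and the column names are those of the pg_locks and pg_stat_activity views as they exist in PostgreSQL 9.2 and later.

```sql
-- In the session that will run the EXPLAIN UPDATE, note its backend PID first.
SELECT pg_backend_pid();

-- From a second session: lock requests that are still waiting to be granted.
SELECT locktype, relation::regclass AS relation, mode, granted, pid
FROM pg_locks
WHERE NOT granted;

-- What the suspect backend is currently doing (12345 is a placeholder PID).
SELECT pid, state, query_start, query
FROM pg_stat_activity
WHERE pid = 12345;
```

An ungranted lock on transit_edge would suggest that another session is blocking the statement, rather than the planner being stuck.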

1 Answer

Try using something like:

UPDATE transit_edge te1 SET dates_to_add =
(   SELECT ARRAY_AGG(date)
    FROM calendar_dates cd2
    WHERE te1.service_id = cd2.service_id AND cd2.exception_type = 1
);
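
A set-based variant of the same idea, offered as a sketch rather than a tested fix: aggregate calendar_dates once per service_id and join the result to transit_edge, instead of evaluating a correlated subquery for each of the roughly 9 million rows. The aliases te and agg are made up for the example. Note one behavioural difference: rows with no matching calendar_dates entry are left untouched here, whereas the correlated form sets them to NULL.

```sql
-- Aggregate once per service_id, then join to the table being updated.
-- Assumes calendar_dates has a column named "date", as in the question.
UPDATE transit_edge te
SET dates_to_add = agg.dates
FROM (
    SELECT service_id, ARRAY_AGG(date) AS dates
    FROM calendar_dates
    WHERE exception_type = 1
    GROUP BY service_id
) AS agg
WHERE te.service_id = agg.service_id;
```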

4 Comments

Is there any way to see if something actually happens, or do I have to wait till the complete query is finished?
@DanielGerber You can limit the outer (UPDATE) query to a small subset of IDs (WHERE te1.transit_edge_id IN (1, 2, 3, etc.)) to check the results. If the results are OK, proceed with the full query.
@CraigRinger It may not help if the query is doing something stupid (like a Cartesian product because of a missing join condition).
This query seems to return with results. But is this really the same query with respect to the left outer join?
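
Two points from the comments above, sketched under the assumption that transit_edge_id is unique (the ids 1, 2, 3 are placeholders). First, a trial run on a few rows inside a transaction that is rolled back. Second, the one difference from the original LEFT OUTER JOIN version: for a row with no matching calendar_dates entry, the original yields an array containing a single NULL, whereas the subquery without the join yields NULL, because ARRAY_AGG over zero rows returns NULL.

```sql
BEGIN;

UPDATE transit_edge te1 SET dates_to_add =
(   SELECT ARRAY_AGG(date)
    FROM calendar_dates cd2
    WHERE te1.service_id = cd2.service_id AND cd2.exception_type = 1
)
WHERE te1.transit_edge_id IN (1, 2, 3);  -- placeholder ids

-- Inspect the result before deciding whether to keep it.
SELECT transit_edge_id, service_id, dates_to_add
FROM transit_edge
WHERE transit_edge_id IN (1, 2, 3);

ROLLBACK;  -- or COMMIT once the output looks right

-- The no-match case in isolation (no tables needed):
-- one row holding NULL, as a LEFT OUTER JOIN produces when nothing matches
SELECT ARRAY_AGG(d) FROM (SELECT NULL::date AS d) AS one_null_row;
-- zero rows, as the plain correlated subquery sees when nothing matches
SELECT ARRAY_AGG(d) FROM (SELECT NULL::date AS d WHERE false) AS no_rows;
```

The first standalone SELECT returns {NULL} and the second returns NULL, so the two UPDATE forms differ only for rows without any matching dates.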
