0

I have a problem with an INSERT in PostgreSQL. I have this query:

INSERT INTO track_segments(tid, gdid1, gdid2, distance, speed)
SELECT * FROM (
SELECT DISTINCT ON (pga.gdid) 
pga.tid as ntid,
pga.gdid as gdid1, pgb.gdid as gdid2,
ST_Distance(pga.geopoint, pgb.geopoint) AS segdist, 
(ST_Distance(pga.geopoint, pgb.geopoint) / EXTRACT(EPOCH FROM (pgb.timestamp - pga.timestamp + interval '0.1 second'))) as speed
FROM fl_pure_geodata AS pga
LEFT OUTER JOIN fl_pure_geodata AS pgb ON (pga.timestamp < pgb.timestamp AND pga.tid = pgb.tid) 
ORDER BY pga.gdid ASC) AS sq
WHERE sq.gdid2 IS NOT NULL;

to fill a table with pairwise connected segments of geopoints. When I run the SELECT alone I get the correct pairs, but when I use it in the statement above, some points are paired with the wrong partner or not paired at all. Here's what I mean:

result of SELECT alone:

tid;gdid1;gdid2;distance;speed
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";10;11;34.105058803;31.0045989118182
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";11;12;90.099603143;14.7704267447541
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";12;13;23.331326565;21.2102968772727

result after INSERT with the same SELECT:

tid;gdid1;gdid2;distance;speed
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";10;12;122.574;17.2639603638028
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";11;12;90.0996;14.7704267447541
"0f6fd522-5f1e-49a4-b85e-50f11ef7f908";12;13;23.3313;21.2102968772727

What could be the cause of that? It's exactly the same SELECT statement for the INSERT, so why does it give different results?

3
  • 2
    BTW: the condition WHERE sq.gdid2 IS NOT NULL effectively transforms the LEFT JOIN into a plain (inner) join. Commented Mar 11, 2016 at 16:17
  • 1
    Note 2: the (pga.timestamp < pgb.timestamp AND pga.tid = pgb.tid) join condition produces a partial Cartesian product of N*(N-1)/2 rows per tid. Is that what you want? Commented Mar 11, 2016 at 16:25
  • @joop oh, that's right. I initially wanted to have the nulls displayed, to see if the last point correctly doesn't connect to the next one with different tid. It was kinda late when I wrote that, forgot about the JOINs. Thanks for the correction! Commented Mar 11, 2016 at 17:02

2 Answers

2

DISTINCT ON (pga.gdid) can pick any row from a set of rows with equal pga.gdid. You can get a different result even when executing the same query several times. Add an additional ordering to get consistent results, something like: pga.gdid ASC, pgb.gdid ASC

BTW You may want to order by pga.gdid ASC, pgb.timestamp - pga.timestamp ASC to get the "next" point.
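As a rough sketch of the ordering suggested above, the INSERT from the question could be written like this. It reuses the table and column names from the question. The plain JOIN (instead of LEFT JOIN plus the IS NOT NULL filter) and the final pgb.gdid tie-breaker are assumptions added here so that two candidate rows with the same time difference are still resolved deterministically.

```sql
-- Sketch only: same tables and columns as in the question.
-- DISTINCT ON keeps the FIRST row per pga.gdid according to ORDER BY,
-- so the ORDER BY must fully decide which pgb row comes first.
INSERT INTO track_segments (tid, gdid1, gdid2, distance, speed)
SELECT DISTINCT ON (pga.gdid)
       pga.tid,
       pga.gdid,
       pgb.gdid,
       ST_Distance(pga.geopoint, pgb.geopoint),
       ST_Distance(pga.geopoint, pgb.geopoint)
         / EXTRACT(EPOCH FROM (pgb."timestamp" - pga."timestamp" + interval '0.1 second'))
FROM fl_pure_geodata AS pga
JOIN fl_pure_geodata AS pgb          -- assumption: inner join replaces LEFT JOIN + IS NOT NULL
  ON pga.tid = pgb.tid
 AND pga."timestamp" < pgb."timestamp"
ORDER BY pga.gdid ASC,                          -- must lead, to match DISTINCT ON
         pgb."timestamp" - pga."timestamp" ASC, -- closest later point first
         pgb.gdid ASC;                          -- assumption: extra tie-breaker
```

Note that the self join still produces every later point for each row before DISTINCT ON discards all but one, so this version is expected to be correct but slow on large tracks.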

BTW2 It may be easier to use the lead() or lag() window functions to calculate differences between the current row and the next/previous one. This way you won't need a self join and will likely get better performance.
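A minimal sketch of the lead() approach, assuming "next point" means the next row by timestamp within the same tid. The aliases next_gdid, next_geopoint and next_ts and the window name w are made up for this example, and gdid is used as a secondary sort key only to keep the order stable if two rows share a timestamp.

```sql
-- Sketch only: pairs each point with its successor in the same track,
-- without a self join.
INSERT INTO track_segments (tid, gdid1, gdid2, distance, speed)
SELECT tid,
       gdid,
       next_gdid,
       ST_Distance(geopoint, next_geopoint),
       ST_Distance(geopoint, next_geopoint)
         / EXTRACT(EPOCH FROM (next_ts - "timestamp" + interval '0.1 second'))
FROM (
    SELECT tid, gdid, geopoint, "timestamp",
           lead(gdid)        OVER w AS next_gdid,      -- gdid of the following point
           lead(geopoint)    OVER w AS next_geopoint,  -- its position
           lead("timestamp") OVER w AS next_ts         -- its timestamp
    FROM fl_pure_geodata
    WINDOW w AS (PARTITION BY tid ORDER BY "timestamp", gdid)
) AS sq
WHERE next_gdid IS NOT NULL;  -- the last point of each track has no successor
```

One difference from the self join: the join required a strictly later timestamp, whereas this sketch also pairs consecutive rows that have identical timestamps. Each row is read once per track partition rather than compared against every later row.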


4 Comments

ah, yeah! That makes sense, I hope it works. The old query took 5 seconds, now with the additional order it's already at 2min and counting...
"getting correct result" often takes more time than "getting some result".
@pfannkuchen_gesicht see my edit for a possible way to get better performance.
Just tested with lead()... and holy crap, it only takes 0.3% of the time the original query took.
1

You are ordering your query results only by the column pga.gdid, which has the same value in many of the joined rows, so PostgreSQL is free to return those rows in a different order each time you run the SELECT query.

4 Comments

what do you mean? The gdid are sequential, the first column is the tid, the second and third columns are the gdids
about your edit: I'm not ordering by the pga.tid though.
Sorry, I'm on my phone. I mean that if you order by one column and you have multiple rows with the same value, the order of those rows can be different each time.
ah, I see, I have to order by pgb.gdid as well to make it work properly. Alright thanks!
