3

Hope you all are fine and doing good!

I have a table items with 2 million+ records the structure looks like below:

id (int) | price (decimal) | priority (int)
-------------------------------------------
10001    |  59000.25       |    1
10002    |  73000.91       |    2
10003    |  1000.23        |    1
10004    |  9567.18        |    1

The solution that I am looking for is very straightforward: How do I sort this table on price+priority on ASC|DESC order?

Current and working solution: I am using ORDER BY priority ASC, price ASC. But as far as my knowledge goes, sorting on multiple column is slow and not optimised approach (and I am facing realtime slowness because of this).

Solutions I tried: I've added a temporary column to this table:

id (int) | price (decimal) | priority (int) | new_priority (varchar)
--------------------------------------------------------------------
10001    |  59000.25       |    1           | a59000.25
10002    |  73000.91       |    2           | b73000.91
10003    |  1000.23        |    1           | a1000.23
10004    |  9567.18        |    1           | a9567.18

I've replaced 1 => a, 2 => b, 3 => c up till 10 (max number I have in database)

Now, whenever I am trying below SQLs, none of them is working

SELECT * FROM items
ORDER BY new_priority ASC

SELECT * FROM items
ORDER BY new_priority::bytea

SELECT * FROM items
ORDER BY SUBSTRING(new_priority FROM '^(.*?)( \\d+)?$'),
     COALESCE(SUBSTRING(new_priority FROM ' (\\d+)$')::INTEGER, 0)

Please advise!

Links I referred:

  1. Postgresql sorting mixed alphanumeric data
  2. Alphanumeric Sorting in PostgreSQL
  3. Alphanumeric sorting with PostgreSQL
  4. Alphanumeric case in-sensitive sorting in postgres
9
  • 3
    Create an index ASC on both columns. I bet it will improve it a lot. Commented Jun 6, 2017 at 14:16
  • See multicolumn indexes: postgresql.org/docs/current/static/indexes-multicolumn.html Commented Jun 6, 2017 at 14:19
  • 2
    Well, sorting data is a rather slow process, but why do you want to sort 2mio+ records to start with? Usually you'd work on a subset of data and only sort a reasonable amount of rows. Commented Jun 6, 2017 at 14:19
  • 1
    limit and offset will not help since in order to it work the database still do a full scan. The database need to know the total amount so it can know where to start/stop with the limit/offset clause. About the index it would be something like create index idx_blabla on yourTable (priority ASC, price ASC) And for reference read this: postgresql.org/docs/8.3/static/indexes-ordering.html Commented Jun 6, 2017 at 14:35
  • 1
    "and I am facing realtime slowness because of this" - If you have a slow query, then Edit your question and add the execution plan generated using explain (analyze, verbose). Formatted text please, no screen shots Commented Jun 6, 2017 at 15:02

3 Answers 3

2

How about creating an index? Indexes are a mechanism to enhance the performance of databases. Creating an index may be slow and might last some hours, but you'll notice differences in further queries.

Sign up to request clarification or add additional context in comments.

Comments

0

May be you could do a better solution using NEW_PRIORITY as DECIMAL.

Assuming the value in PRICE is not greater then (say) 999999 and the PRIORITY is not too "large", you could use PRIORITY*1000000+PRICE.

Anyway, I think using index as said in comments could be a better solution.

To create a "composite index", you can use following sintax:

CREATE INDEX index_name ON table_name (priority ASC, price ASC);

3 Comments

I already did brainstorming on this approach and found some loopholes due to which I had to drop this approach. I have various cases (for e.g. where priority is 1 and price is 1000.0 and priority is 2 and price is 5000.0 . Secondly, price can go upto any number but yes priority is not large . Basically, I am calculating priority (based on some rules) before saving it into the database.
nopes @etsa consider this one: price=50000.5 and priority=10 and price=75000.0 and priority=5 and price=100.25 and priority=1. I always expect to see last one first and first one towards the end of search results.
OK, before I red your text fast and I saw your example where you put the letter for priority before the price. I think composite index is a better solution (as I wrote in my post, edited before), but - just for curiosity - to compose a new column and sort it as desired, you should add a number of fixed '0' or spaces (' ') in the new composite column and then priority (eg. 0000050000.500j and 0000075000.000e and 0000000100.250a)
0

How often does the data change ? If it's not often, use CLUSTER. See : https://www.postgresql.org/docs/9.5/static/sql-cluster.html It will basically sort your table in the order of your index. Combine this cluster with the answer from etsa.

The catch is, the CLUSTER doesn't maintain the order. So if you have new row inserted, it will be inserted to the last row.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.