
What are the performance implications in Postgres of using an array to store values, compared to creating another table to store the values with a has-many relationship?

I have one table that needs to be able to store anywhere from about 1-100 different string values, in either an array column or a separate table. These values will need to be frequently searched for exact matches, so lookup performance is critical. Would the array solution be faster, or would it be faster to use joins to look up the values in the separate table?
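For concreteness, here is a rough sketch of the two designs being compared (all table and column names here are hypothetical):

    -- Option A: array column on the parent table
    CREATE TABLE items (
        id   bigserial PRIMARY KEY,
        tags text[] NOT NULL DEFAULT '{}'  -- roughly 1-100 strings per row
    );

    -- Option B: separate child table with a has-many relationship
    -- (reuses the parent table above; in a real Option B schema the
    -- array column would simply be omitted)
    CREATE TABLE item_tags (
        item_id bigint NOT NULL REFERENCES items(id),
        tag     text   NOT NULL,
        PRIMARY KEY (item_id, tag)
    );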

3 Comments
  • Optimization is the last step. Do the right thing first, which is proper normalization. Commented May 22, 2014 at 15:20
  • It depends on many factors, including which indexes, and of what type, you might use on each field, how you'll be querying the data, and many other things. I agree with @ClodoaldoNeto's comment... get your code working, then worry about optimization. Commented May 22, 2014 at 15:39
  • 1
    BTW if you age going to store strings in array, you may want to add a GIN index on this array. Read the postgres documents for deatails on GIN indexes. Commented May 22, 2014 at 16:47
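As a sketch of what that comment suggests, assuming the hypothetical items.tags array column from the question above:

    -- GIN index on the array column (default array operator class)
    CREATE INDEX items_tags_gin ON items USING gin (tags);

    -- Exact-match containment lookup that can use the GIN index
    SELECT *
    FROM   items
    WHERE  tags @> ARRAY['some exact value'];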

2 Answers


These values will need to be frequently searched

Searched how? This is crucial.

  • Prefix pattern match only? Infix/suffix pattern matches too?
  • Fuzzy string search / similarity matching?
  • Stemming and normalization for root words, de-pluralization?
  • Synonym search?
  • Is the data character sequences or natural language text?
  • One language, or multiple different languages?

Hand-waving around "searched" makes any answer that ignores that part pretty much invalid.

so lookup performance is critical. Would the array solution be faster, or would it be faster to use joins to look up the values in the separate table?

Impossible to be strictly sure without proper info on the data you're searching.

Searching text fields is much more flexible, giving you many options you don't have with an array search. It also generally reduces the amount of data that must be read.

In general, I strongly second Clodoaldo: design it right. Optimize later, if you need to.
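A minimal sketch of that normalized design for exact-match lookups, assuming the hypothetical items and item_tags tables from the question above:

    -- Plain b-tree index for exact-match lookups on the child table
    CREATE INDEX item_tags_tag_idx ON item_tags (tag);

    -- Exact-match lookup via a join back to the parent table
    SELECT i.*
    FROM   items i
    JOIN   item_tags t ON t.item_id = i.id
    WHERE  t.tag = 'some exact value';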


1 Comment

Thanks for the answer. The string values will need to be searched for exact matches of Unicode text.

According to the official PostgreSQL reference documentation (https://www.postgresql.org/docs/current/arrays.html#ARRAYS-SEARCHING), searching for specific elements in a table is expected to perform better than searching in an array:

Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.

One possible reason why searching array elements performs worse than searching a table is that arrays are internally stored as strings, as stated here (https://www.postgresql.org/message-id/op.swbsduk5v14azh%40oren-mazors-computer.local):

the array is actually stored as a string by postgres. a string that happens to have lots of brackets in it.

However, I could not corroborate this statement with any official PostgreSQL documentation, and I have no evidence that handling well-structured strings is necessarily less performant than handling tables.
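One way to check which design wins for a particular dataset is to load representative data into both and compare the query plans; a sketch, reusing the hypothetical tables and indexes from above:

    -- Array column with a GIN index
    EXPLAIN ANALYZE
    SELECT * FROM items WHERE tags @> ARRAY['some exact value'];

    -- Child table with a b-tree index, joined back to the parent
    EXPLAIN ANALYZE
    SELECT i.*
    FROM   items i
    JOIN   item_tags t ON t.item_id = i.id
    WHERE  t.tag = 'some exact value';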
