0

I am using PostgreSQL 11, created tsvector with gin index on column search_fields.

Data in table test

id   |           name           |         search_fields          
-------+--------------------------+--------------------------------

19973 | Ongoing 10x consultation | '10x' 'Ongoing' 'consultation'

19974 |  5x marketing | '5x' 'marketing'

19975 | Ongoing 15x consultation | '15x' 'Ongoing' 'consultation'

The default text search config is set as 'pg_catalog.english'.

Below both queries output 0 rows.

select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');

id | name | search_fields 
----+------+---------------
(0 rows)


select id, name, search_fields from test where search_fields @@ to_tsquery('simple','ongoing');

id | name | search_fields 
----+------+---------------
(0 rows)

But when I pass string as '10x' or 'consultation' it returns the correct output.

Any idea, why it is not searching for 'ongoing' words?

Afterwards, I have created the triggers using function tsvector_update_trigger() and update the search_fields and also set default_text_search_config to 'pg_catalog.simple' in postgresql.conf file, then I updated the search_fields with search_fields and it output as

select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');
 id |              name               |              search_fields              
----+---------------------------------+-----------------------------------------
  19973  | Ongoing 10x consultation | '10x':2 'consultation':3 'ongoing':1

This time when I ran query passing 'ongoing' string it output as per the expected result.

select id, name, search_fields from test where search_fields @@ to_tsquery('ongoing');

id   |           name           |         search_fields          
-------+--------------------------+--------------------------------

19973 | Ongoing 10x consultation | '10x':2 'consultation':3 'ongoing':1

19975 | Ongoing 15x consultation | '15x':2 'consultation':3 'ongoing':1

As per above experiment, setting trigger and default_text_search_config to 'pg_catalog.simple' help to achieve the result.

Now, I don't know what is the reason why it didn't work with default_text_search_config to 'pg_catalog.english'.

Is trigger always required when tsvector is used?

Any help in understanding the difference between both would be appreciated.

Thanks, Nishit

2
  • What data type was search_fields originally? Doesn't look like ts_vector to me. Commented Sep 4, 2019 at 16:09
  • What are the code snippets in your question supposed to achieve? I only see added noise. Commented Sep 4, 2019 at 17:29

1 Answer 1

1

You don't describe how you create your search_fields initially. It was not constructed correctly. Since we don't know what you did, we don't know what you did wrong. If you rebuild it correctly, then it will start working. When you changed default_text_search_config to 'simple', you appear to have correctly repopulated the search_fields, which is why it worked. If you change back to 'english' and correctly repopulate the search_fields then it will also work.

You don't always need a trigger. A trigger is one way. Another way is to just manually update the tsvector column every time you update the text column. My usual favorite way is not to store the tsvector at all, and just derive it on the fly:

select id, name, search_fields from test where 
    to_tsvector('english',name) @@ to_tsquery('english','ongoing');

If you want to do it this way, you need to specify the configuration, not rely on default_text_search_config, otherwise the expressional gin index will not be used. Also, this way is not a good idea if you want to use phrase searching, as the rechecking will be slow.

Sign up to request clarification or add additional context in comments.

4 Comments

This is our insert query Insert into test(name, search_fields) values('ongoing', 'ongoing consultation'::tsvector) It output as name | search_fields ------------+-------------------- Ongoing | 'ongoing' Is this the correct way to type cast or should we use the following query Insert into test(name, search_fields) values('ongoing', to_tsvector('ongoing consultation')) It output as name | search_fields -------------------------------- Ongoing | 'ongoing':1 Which one is the correct way? Thanks
You need to use to_tsvector. It converts english text (or whatever your configuration is) to tsvector, and so does stemming ('ongoing' to 'ongo') and stop words. Casting to tsvector with '::' means you are supposed to have already done that stuff yourself, which you hadn't done.
Thanks. Got it. Is there a way to convert casted :: tsvector values to normalized tsvector form with the position? For eg. values stored in search_fields as '10x' 'Ongoing' 'consultation' convert to '10x':2 'consultation':3 'ongoing':1
I tried this query and it worked update test set search_fields=to_tsvector(search_fields::text);. I hope this is the correct way. Thanks for all your help.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.