15

Below is the column response from my Postgres table. I want to extract the status from all the rows in my Postgres database. The status could be of varying sizes like SUCCESS as well so I do not want to use the substring function. Is there a way to do it?

<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>

so my table structure is like this

   Column    |            Type             |                        Modifiers                         

-------------+-----------------------------+----------------------------------------------------------

 id          | bigint                      | not null default nextval('events_id_seq'::regclass)
 hostname    | text                        | not null
 time        | timestamp without time zone | not null
 trn_type    | text                        | 
 db_ret_code | text                        | 
 request     | text                        | 
 response    | text                        | 
 wait_time   | text                        | 

And I want to extract status from each and every request. How do i do this?

Below is a sample row. And assume the table name abc_events

id          | 1870667
hostname    | abcd.local
time        | 2013-04-16 00:00:23.861
trn_type    | A
request     | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
response    | <?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status><responseType>COUNTRY_MISSING</responseType><country_info>USA</country_info><phone_country_code>1234</phone_country_code></response>
1
  • request has the xml request and response has the xml response. And the xml that I have shown above is an example of the response. Essentially I am trying to extract the status tag from each and every xml response row. Commented Apr 16, 2013 at 19:49

1 Answer 1

28

Use the xpath() function:

WITH x(col) AS (SELECT '<?xml version="1.0" ?><response><status>ERROR_MISSING_DATA</status></response>'::xml)
SELECT xpath('./status/text()', col) AS status
FROM   x

/text() strips the surrounding <status> tag.
Returns an array of xml - with a single element in this case:

status
xml[]
-------
{ERROR_MISSING_DATA}

Applied to your table

In response to your question update, this can simply be:

SELECT id, xpath('./status/text()', response::xml) AS status
FROM   tbl;

If you are certain there is only a single status tag per row, you can simply extract the first item from the array:

SELECT id, (xpath('./status/text()', response::xml))[1] AS status
FROM   tbl;

If there can be multiple status items:

SELECT id, unnest(xpath('./status/text()', response::xml)) AS status
FROM   tbl;

Gets you 1-n rows per id.

Cast to xml

Since you defined your columns to be of type text (instead of xml, you need to cast to xml explicitly. The function xpath() expects the 2nd parameters of type xml. An untyped string constant is coerced to xml automatically, but a text column is not. You need to cast explicitly.

This works without explicit cast:

  SELECT xpath('./status/text()'
      ,'<?xml version="1.0" ?><response><status>SUCCESS</status></response>')

A CTE like in my first example needs a type for every column in the "common table expression". If I had not cast to a specific type, the type unknown would have been used - which is not the same thing as an untyped string. Obviously, there is no direct conversion implemented between unknown and xml. You'd have to cast to text first: unknown_type_col::text::xml. Better to cast to ::xml right away.

This has been tightened with PostgreSQL 9.1 (I think). Older versions were more permissive.

Either way, with any of these methods the string has to be valid xml or the cast (implicit or explicit) will raise an exception.

Sign up to request clarification or add additional context in comments.

9 Comments

Do you need the ::xml? I was just doing SELECT xpath('...', '<raw>xml</raw>'); and it seems to work.
I edited my question. Essentially what I want is to extract value of a tag from the column that has the xml request/response.
I followed it but I am getting this error LINE 1: select unnest(xpath('./status/text()', request)) from abc_events ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. It is pointing to the xpath function.
@Phrogz: I added a chapter on the topic of casting, since my initial comment wasn't completely correct. A cast is actually needed with a CTE in this case ...
@ronak: I added a bit to my answer. Note the addendum about casting to xml. Also note I had the wrong cast at first. Must be ::xml.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.