1

I have the following list:

lst = ['SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)', 
'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 19188, NULL), NULL, NULL)',
'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9972, 18282, NULL), NULL, NULL)',
'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9977, 19201, NULL), NULL, NULL)',
'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9989, 18635, NULL), NULL, NULL)']

I would like to extract only the pair of numbers in parentheses after MDSYS.SDO_POINT_TYPE. How do I do that?

What I have tried so far:

op=[]
for i in lst:
    x = (i[46:56])
    y = str('('+x+')')
    op.append(y)

However, the numbers are not always at positions 46-56. How can I make this work regardless of position?

Desired output:

['(9971, 1884)',
 '(9971, 1918)',
 '(9972, 1828)',
 '(9977, 1920)',
 '(9989, 1863)']

3 Answers

2

You can use regular expressions:

>>> import re
>>> [re.findall(r"MDSYS\.SDO_POINT_TYPE\((\d+, \d+)", s)[0] for s in lst]
['9971, 18847', '9971, 19188', '9972, 18282', '9977, 19201', '9989, 18635']

1 Comment

Indeed, though if the desired output includes parentheses then your output will need to be post-processed. Fairly trivially done with [f'({pair})' for pair in list_of_pairs] or whatever.
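A minimal sketch of that post-processing, combining the regex from the answer with the f-string wrap suggested in the comment. The names `pairs` and `wrapped` are only illustrative, and the list is shortened to two of the question's strings.

```python
import re

lst = [
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 19188, NULL), NULL, NULL)',
]

# Raw string so that \( and \d reach the regex engine unchanged
pattern = re.compile(r"MDSYS\.SDO_POINT_TYPE\((\d+, \d+)")

# Extract each "x, y" pair, then wrap it in parentheses
pairs = [pattern.findall(s)[0] for s in lst]
wrapped = [f"({pair})" for pair in pairs]
print(wrapped)
```

Note that `findall(...)[0]` raises an IndexError for a string that contains no match, so this assumes every element has the expected shape.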
1

I simply use split to break each string apart and then recombine the pieces into a new string:

lst = ['SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)', 
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 19188, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9972, 18282, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9977, 19201, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9989, 18635, NULL), NULL, NULL)']

new_lst = []
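for st in lst:
    # Keep only the text after the opening parenthesis of SDO_POINT_TYPE
    _, points = st.split('MDSYS.SDO_POINT_TYPE(')
    # The first two comma-separated fields are the coordinates
    f_num, s_num, *_ = points.split(',')
    new_lst.append(f"({f_num.strip()}, {s_num.strip()})")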

print(new_lst)
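One caveat with unpacking the result of split: it raises a ValueError when the marker is missing or occurs more than once. Below is a hedged sketch that uses str.partition instead, which always returns three parts. The helper name `extract_point` and the choice to return None for strings without a point are assumptions for illustration, not part of the answer above.

```python
def extract_point(st, marker='MDSYS.SDO_POINT_TYPE('):
    """Return '(x, y)' for the first point in st, or None if the marker is absent."""
    # partition never raises; `found` is '' when the marker does not occur
    _, found, points = st.partition(marker)
    if not found:
        return None
    fields = points.split(',')
    if len(fields) < 2:
        return None
    # Strip whitespace so the output spacing does not depend on the input
    return f"({fields[0].strip()}, {fields[1].strip()})"


samples = [
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, NULL, NULL, NULL)',  # hypothetical row without a point
]
print([extract_point(s) for s in samples])
```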


0

If the numbers and the NULL between the parentheses can appear at different positions, you can first use a pattern to capture everything between the parentheses in a group.

Then you can find the digits in the group 1 value.

\bMDSYS\.SDO_POINT_TYPE\(([^()]+)\)
  • \bMDSYS\.SDO_POINT_TYPE\( Match the literal text MDSYS.SDO_POINT_TYPE( starting at a word boundary
  • ([^()]+) Capture everything between the parentheses in group 1
  • \) Match the closing )
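To make the two steps concrete, here is a small sketch of what group 1 holds before the digits are pulled out. The second sample string, with NULL placed first, is made up for illustration to show that the position inside the parentheses does not matter; the names `samples` and `captured` are likewise only illustrative.

```python
import re

pattern = re.compile(r"\bMDSYS\.SDO_POINT_TYPE\(([^()]+)\)")

samples = [
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)',
    'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(NULL, 9972, 18282), NULL, NULL)',  # made-up ordering
]

captured = []
for s in samples:
    m = pattern.search(s)
    if m:
        inner = m.group(1)                  # everything between the parentheses
        digits = re.findall(r"\d+", inner)  # only the runs of digits
        captured.append((inner, digits))
        print(repr(inner), "->", digits)
```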

See a Python demo and a Regex demo

Note that in the desired output, the last digit of the second value is missing.

import re

lst = ['SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 18847, NULL), NULL, NULL)',
       'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9971, 19188, NULL), NULL, NULL)',
       'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9972, 18282, NULL), NULL, NULL)',
       'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9977, 19201, NULL), NULL, NULL)',
       'SDO_GEOMETRY(2001, NULL, MDSYS.SDO_POINT_TYPE(9989, 18635, NULL), NULL, NULL)']

op = []
for s in lst:
    m = re.search(r"\bMDSYS\.SDO_POINT_TYPE\(([^()]+)\)", s)
    if m:
        op.append("({})".format(", ".join(re.findall(r"\d+", m.group(1)))))

print(op)

Output

['(9971, 18847)', '(9971, 19188)', '(9972, 18282)', '(9977, 19201)', '(9989, 18635)']
