1

I am trying to extract all occurrences of a substring within a string using Python Regex. This is what I have tried:

import re
line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
m = re.findall(r'\d+x.*?[a-zA-Z]', line)
print(m)

The output I am getting is ['10x35c', '30x35c']

The output I am trying to achieve is ["10'x20'", '10x35cm', '30x35cm']

2
  • You may use this regex: \d+'?x\d+'?(?:[a-zA-Z]+)? Commented Jan 12, 2021 at 16:59
  • Try to explain in English what you are trying to extract. Commented Jan 12, 2021 at 16:59

3 Answers

1

You may use this regex:

r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?"

Code:

>>> import re
>>> line = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
>>> print (re.findall(r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?", line))
["10'x20'", '10x35cm', '30x35cm']

RegEx Details:

  • \d+: Match 1+ digits
  • ['\"]?: Match optional ' or "
  • x: Match letter x
  • \d+: Match 1+ digits
  • ['\"]?: Match optional ' or "
  • (?:\s*[a-zA-Z]+)?: Match an optional unit: optional whitespace followed by 1+ letters
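
To make the difference concrete, here is a minimal sketch that runs the pattern from the question and the pattern from this answer against the same line. The names `original` and `fixed` are illustrative only and do not come from the post.

```python
import re

line = ("The dimensions of the first rectangle: 10'x20', "
        "second rectangle: 10x35cm, third rectangle: 30x35cm")

# Pattern from the question: it needs "x" directly after the digits, so
# "10'x20'" never matches, and the lazy .*? stops at the first letter,
# which cuts "cm" down to "c".
original = re.compile(r"\d+x.*?[a-zA-Z]")

# Pattern from this answer: an optional quote after each number and an
# optional unit made of letters.
fixed = re.compile(r"\d+['\"]?x\d+['\"]?(?:\s*[a-zA-Z]+)?")

print(original.findall(line))  # ['10x35c', '30x35c']
print(fixed.findall(line))     # ["10'x20'", '10x35cm', '30x35cm']
```

One thing to watch: because of the `\s*`, any word that follows the numbers after whitespace is treated as a unit, so "10x20 and" would match in full. If units always follow the number directly, dropping `\s*` avoids that.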

1

You can do this without regex using split:

In [1089]: m = [i.split(':')[1].strip() for i in line.split(',')]

In [1090]: m
Out[1090]: ["10'x20'", '10x35cm', '30x35cm']
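
The one-liner above assumes every comma-separated chunk contains a colon; a chunk without one would raise an IndexError. A slightly more defensive sketch of the same idea follows. The function name `extract_by_split` is hypothetical, chosen only for this example.

```python
line = ("The dimensions of the first rectangle: 10'x20', "
        "second rectangle: 10x35cm, third rectangle: 30x35cm")

def extract_by_split(text):
    """Return the text after the first ':' in each comma-separated chunk."""
    results = []
    for chunk in text.split(','):
        # partition never raises; sep is '' when the chunk has no ':'
        _, sep, value = chunk.partition(':')
        if sep:
            results.append(value.strip())
    return results

print(extract_by_split(line))  # ["10'x20'", '10x35cm', '30x35cm']
```

This still depends on the "label: value, label: value" layout of the input, so it is only a fit when the text is that regular.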

1 Comment

Thank you for your kind effort, but I wanted to use regex.
0

Use

import re
string = "The dimensions of the first rectangle: 10'x20', second rectangle: 10x35cm, third rectangle: 30x35cm"
print(re.findall(r"""\d+'?x\d+'?(?: *[a-z]+)?""", string, re.I))

Results: ["10'x20'", '10x35cm', '30x35cm']

re.I enables case-insensitive matching, so [a-z] also matches uppercase letters.

Explanation:

--------------------------------------------------------------------------------
  \d+                      digits (0-9), 1 or more times (as many as
                           possible)
--------------------------------------------------------------------------------
  '?                       a single quote ('), optional
--------------------------------------------------------------------------------
  x                        'x'
--------------------------------------------------------------------------------
  \d+                      digits (0-9), 1 or more times (as many as
                           possible)
--------------------------------------------------------------------------------
  '?                       a single quote ('), optional
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (the whole group
                           is optional):
--------------------------------------------------------------------------------
     *                       a space, 0 or more times (as many as
                             possible)
--------------------------------------------------------------------------------
    [a-z]+                   any character from 'a' to 'z', 1 or more
                             times (as many as possible); with re.I
                             this also covers 'A' to 'Z'
--------------------------------------------------------------------------------
  )?                       end of grouping
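
If the individual numbers and the unit are needed rather than the whole match, one option is to add capturing groups to the same pattern. This is a sketch under that assumption, not part of the answer above; the names `pattern`, `width`, `height` and `unit` are illustrative.

```python
import re

line = ("The dimensions of the first rectangle: 10'x20', "
        "second rectangle: 10x35cm, third rectangle: 30x35cm")

# Same shape as the pattern above, with three capturing groups added:
# first number, second number, optional unit.
pattern = re.compile(r"(\d+)'?x(\d+)'?(?: *([a-z]+))?", re.I)

for m in pattern.finditer(line):
    width, height, unit = m.groups()  # unit is None when no letters follow
    print(m.group(0), int(width), int(height), unit)
```

With capturing groups in the pattern, re.findall returns tuples of the groups instead of the full matched text, which is why this sketch uses re.finditer and m.group(0).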
