Replacing a string with another using regex in Python

Question

I am trying to replace a selected span of text with a single word taken from that span, using regex. I tried re.sub(), but it seems to treat the second argument (the replacement text) as a plain string, not as a regex.

Here is my string:

I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> on Tuesday night i go to Youth <ERR targ=club> clob </ERR> .

And here is my code:

# The regex of the form <ERR targ=...> .. </ERR>
select_text_regex = r"<ERR[^<]+<\/ERR>"

# The regex of the correct word that will replace the selected text of the form <ERR targ=...> .. </ERR>
correct_word_regex = r"targ=([^>]+)>"
line = re.sub(select_text_regex, correct_word_regex, line.rstrip())

I get:

I go to Bridgebrook i go out targ=([^>]+)> on Tuesday night i go to
Youth targ=([^>]+)> .

My goal is:

I go to Bridgebrook i go out sometimes on Tuesday night i go to
Youth club .

Does Python support using part of the matched text as the replacement in a regex substitution?

whp · Accepted Answer · 2018-03-21 00:01:19Z

1

Here's another solution. I also rewrote the regex with "non-greedy" quantifiers, putting ? after *, because I find that more readable.

The group referenced by r"\1" is an unnamed group, created with parentheses. I also used re.compile as a style preference to reduce the number of arguments:

import re
line = "I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> on Tuesday night i go to Youth <ERR targ=club> clob </ERR> ."
select_text_regex = re.compile(r"<ERR targ=(.*?)>.*?<\/ERR>")
line = select_text_regex.sub(r"\1", line)  # sub() returns a new string
print(line)

Named group alternative:

import re
line = "I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> on Tuesday night i go to Youth <ERR targ=club> clob </ERR> ."
select_text_regex = re.compile(r"<ERR targ=(?P<to_replace>.*?)>.*?<\/ERR>")
line = select_text_regex.sub(r"\g<to_replace>", line)  # sub() returns a new string
print(line)

You can find some docs on group referencing here:

https://docs.python.org/3/library/re.html#regular-expression-syntax
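
To see why the non-greedy ? matters here, the following sketch compares .* with .*? on the sample line from the question. The variable names greedy and lazy are only illustrative and do not come from the answer.

```python
import re

line = ("I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> "
        "on Tuesday night i go to Youth <ERR targ=club> clob </ERR> .")

# Greedy: .* grabs as much as possible, so a single match can run from the
# first <ERR ...> tag all the way to the last </ERR>.
greedy = re.sub(r"<ERR targ=(.*)>.*</ERR>", r"\1", line)

# Non-greedy: .*? stops at the earliest point that lets the match succeed,
# so each <ERR ...> ... </ERR> element is matched on its own.
lazy = re.sub(r"<ERR targ=(.*?)>.*?</ERR>", r"\1", line)

print(greedy)  # still contains leftover markup
print(lazy)    # I go to Bridgebrook i go out sometimes on Tuesday night i go to Youth club .
```

With the greedy pattern the two elements are swallowed by one match, so the captured group contains markup rather than a single word.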

edited Mar 21, 2018 at 0:01

answered Mar 20, 2018 at 23:50


4 Comments

whp Over a year ago

Keep in mind that this is just a regex solution! A lot of people will recommend that you use a proper parser and a library like beautifulsoup if your use case can be more complicated: crummy.com/software/BeautifulSoup/bs4/doc

Hazem Alabiad Over a year ago

Is it possible to convert the replaced text to uppercase with re.sub()?

whp Over a year ago

This solution works - replace r"\1" with a function: stackoverflow.com/questions/8934477/…

Hazem Alabiad Over a year ago

I used your second implementation: line = select_text_regex.sub(r"\g<to_replace>\1",lambda m: m.group('first').upper(), line) It says: TypeError: 'str' object cannot be interpreted as an integer
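
The TypeError above most likely comes from the argument order: a compiled pattern's sub() takes (repl, string, count), so passing both a template string and a lambda pushes line into the count slot. The pattern also has no group named 'first'. A minimal sketch of the function-based replacement, assuming the named-group pattern from the answer (the helper name upper_target is made up for illustration):

```python
import re

line = ("I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> "
        "on Tuesday night i go to Youth <ERR targ=club> clob </ERR> .")

select_text_regex = re.compile(r"<ERR targ=(?P<to_replace>.*?)>.*?</ERR>")


def upper_target(match):
    """Return the captured target word in uppercase (hypothetical helper)."""
    return match.group("to_replace").upper()


# Pass the function INSTEAD of a template string: sub(repl, string).
result = select_text_regex.sub(upper_target, line)
print(result)  # I go to Bridgebrook i go out SOMETIMES on Tuesday night i go to Youth CLUB .
```

The function is called once per match and its return value is used as the replacement text, so no r"\g<to_replace>" template is needed.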

jasonharper · Accepted Answer · 2018-03-20 23:42:02Z

0

You would need to match the target word in the pattern, as a capturing group - you can't start an entirely new search in the replacement string!

Not tested, but this should do the job:

Replace r"<ERR targ=(.*?)>.*?</ERR>"

With r"\1"
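
As a quick check of the suggestion above, this sketch runs both the original attempt from the question and the capturing-group version on the sample line. The names attempt and fixed are illustrative only.

```python
import re

line = ("I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> "
        "on Tuesday night i go to Youth <ERR targ=club> clob </ERR> .")

# Original attempt: the replacement is a template, not a pattern, so the
# regex-looking text is inserted literally.
attempt = re.sub(r"<ERR[^<]+</ERR>", r"targ=([^>]+)>", line)

# Suggested fix: capture the target word in the pattern and refer to it
# from the replacement with \1.
fixed = re.sub(r"<ERR targ=(.*?)>.*?</ERR>", r"\1", line)

print(attempt)  # I go to Bridgebrook i go out targ=([^>]+)> on Tuesday night i go to Youth targ=([^>]+)> .
print(fixed)    # I go to Bridgebrook i go out sometimes on Tuesday night i go to Youth club .
```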

answered Mar 20, 2018 at 23:42


Ira Casper · Accepted Answer · 2018-03-20 23:43:13Z

0

What you're looking for is regex capture groups. Instead of selecting text with one regex and then trying to replace it with another regex, put the part you want to keep inside parentheses in the search pattern, then get it back in the replacement with \1 (the number is the position of the group in the pattern).

import re

line = "I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> on Tuesday night i go to Youth <ERR targ=club> clob </ERR> ."

select_text_regex = r"<ERR targ=([^<]+)>[^<]+<\/ERR>"  # Capture the target word in group 1.
correct_word_regex = r"\1"  # Refer back to group 1 in the replacement.

line = re.sub(select_text_regex, correct_word_regex, line.rstrip())

print(line)
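
Group numbers simply count opening parentheses from left to right, so a pattern can capture more than one piece. The sketch below is a variation that is not part of the answer: it assumes you also want to keep the misspelled text, captures it as group 2, and uses an invented "(was: ...)" format purely for illustration.

```python
import re

line = ("I go to Bridgebrook i go out <ERR targ=sometimes> some times </ERR> "
        "on Tuesday night i go to Youth <ERR targ=club> clob </ERR> .")

# Group 1: the corrected word; group 2: the original text, with the
# surrounding whitespace left outside the group.
pattern = r"<ERR targ=([^>]+)>\s*(.*?)\s*</ERR>"

annotated = re.sub(pattern, r"\1 (was: \2)", line)
print(annotated)
# I go to Bridgebrook i go out sometimes (was: some times) on Tuesday night i go to Youth club (was: clob) .
```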

answered Mar 20, 2018 at 23:43
