How to replace substring between two other substrings in python?

Question

I have a corpus of text documents, some of which will have a sequence of substrings. The first and last substrings are consistent, and mark the beginning and the end of the parts I want to replace. But, I would also like to delete/replace all substrings that exist between these first and last positions.

origSent = 'This is the sentence I am intending to edit'

Using the above as an example, how would I go about using 'the' as the start substring, and 'intending' as the end substring, deleting both in addition to the words that exist between them to make the following:

newSent = 'This is to edit'

You'll need to be a lot clearer on the rules for defining these substrings. If 'the' and 'intending' are always the defining words, then this is trivial via str.split(), of course.
– Chris_Rands, Commented Oct 30, 2019 at 16:00

Tim Biegeleisen · Accepted Answer · 2019-10-30 16:04:01Z

1

You could use regex replacement here:

import re

origSent = 'This is the sentence I am intending to edit'
newSent = re.sub(r'\bthe((?!\bthe\b).)*\bintending\b', '', origSent)
print(newSent)

This prints:

This is  to edit

The "secret sauce" in the regex pattern is the tempered dot:

((?!\bthe\b).)*

This consumes any content that does not cross another occurrence of the word "the". It prevents the match from starting at an earlier "the" than the one closest to "intending", which we don't want.
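
As a rough sketch of how the tempered dot behaves when "the" appears more than once, the pattern can be wrapped in a small function. The name remove_between and its parameters are hypothetical, not part of the answer above. The sketch also adds a \b after the start word so that words like "then" are not treated as the start marker, and it collapses the doubled space that the removal leaves behind.

```python
import re

def remove_between(text, start, end):
    """Delete `start`, `end`, and everything between them (hypothetical helper)."""
    s, e = re.escape(start), re.escape(end)
    # Tempered dot: take one character at a time, but never step onto a
    # position where another whole-word occurrence of `start` begins.
    pattern = rf'\b{s}\b(?:(?!\b{s}\b).)*\b{e}\b'
    stripped = re.sub(pattern, '', text)
    # Collapse the run of spaces left where the span was removed.
    return re.sub(r'\s{2,}', ' ', stripped).strip()

print(remove_between('This is the sentence I am intending to edit', 'the', 'intending'))
print(remove_between('Keep the first part, drop the middle intending to stay', 'the', 'intending'))
```

In the second call only the later "the" starts the match, so "the first part" survives. Note that `.` does not match newlines by default, so a span that crosses lines would need flags=re.DOTALL passed to re.sub.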



Bill Chen · Accepted Answer · 2019-10-30 20:40:24Z

1

I would do this:

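origSent = 'This is the sentence I am intending to edit'
s_list = origSent.split()  # split on whitespace into a list of words
newSent = ' '.join(s_list[:s_list.index('the')] + s_list[s_list.index('intending')+1:])
print(newSent)  # This is to edit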

Hope this helps.
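
One caveat with the split approach is that list.index raises ValueError when a marker is absent, which matters if only some documents contain the markers. A minimal sketch that guards against this is below; the function name drop_span is hypothetical, and returning the sentence unchanged on a missing marker is an assumption about the desired behavior.

```python
def drop_span(sentence, start, end):
    """Remove `start`, `end`, and the words between them (hypothetical helper)."""
    words = sentence.split()
    try:
        i = words.index(start)
        # Look for the end marker only after the start marker.
        j = words.index(end, i + 1)
    except ValueError:
        # A marker is missing (or out of order): leave the sentence as it is.
        return sentence
    return ' '.join(words[:i] + words[j + 1:])

print(drop_span('This is the sentence I am intending to edit', 'the', 'intending'))
print(drop_span('nothing to remove here', 'the', 'intending'))
```

Unlike the regex answer, this starts at the first 'the' in the sentence, and it only matches markers that stand alone as whitespace-separated words (so 'the,' would not count).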

edited Oct 30, 2019 at 20:40

answered Oct 30, 2019 at 16:01


2 Comments

MJB Over a year ago

I think you missed an = sign in that second line. Should it say "newSent = ' '.join . . ."?

Bill Chen Over a year ago

Corrected the answer
