Regex Search and Replace in Python

Question

I am looking to do a Regex conditional search.

If a carriage return (\r) is followed by an uppercase letter, I want to replace it with a space (' '). If it is followed by anything else, such as a lowercase letter, I just want to remove it. Is there a way to do that using regex in Python?

Sample Input:

BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort

Output:

BCP- Engineering Systems Support

The data is in a dataframe. I am currently using the df.replace() function to replace "\r" with a space (" "), but I would like the replacement to be conditional.

Below is my code:

df_replace = df.replace(to_replace=r"\r", value = " ", regex=True)

I have basically just tried replacing \r with nothing, but I am not sure how to implement a conditional replace.
– Abdulquadir Shaikh, Commented Aug 27, 2019 at 16:30
What is an "uppercase character" for you? If the ASCII range A-Z is good enough for your case, that's easy, but if you want to handle any Unicode uppercase character, that's harder in standard Python regex.
– Tomalak, Commented Aug 27, 2019 at 16:34
Edit your question and show your attempt. It will make it easier for us to help you.
– Paolo, Commented Aug 27, 2019 at 16:34

Sajeer Noohukannu · Accepted Answer · 2019-08-27 20:02:24Z

2

I am not familiar with Python, but the regex you will need is as follows (perhaps someone with Python experience can edit to customize this code):

This negative lookahead finds every \r that is not followed by an uppercase letter, so replace these matches with an empty string:

\r(?![A-Z])

This finds every \r that is not followed by a lowercase letter (after the first step, only the ones before an uppercase letter remain), so replace these matches with a space:

\r(?![a-z])

EDIT

Okay, here's one solution in Python I was able to put together for you:

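import re

myString = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"

# Remove every \r that is not followed by an uppercase letter
myString = re.sub(r"\r(?![A-Z])", "", myString)
# Every remaining \r precedes an uppercase letter, so a simple string replace is enough
myString = myString.replace("\r", " ")
print(myString)  # BCP- Engineering Systems Support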
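As an alternative to the two-step approach, here is a minimal single-pass sketch that also covers the Unicode case raised in Tomalak's comment under the question. It is not part of the original answer; the function name fix_carriage_returns is made up for illustration, and it relies on str.isupper() rather than the [A-Z] character class.

```python
import re


def fix_carriage_returns(text):
    # Decide per match: a space if the next character is uppercase
    # (str.isupper() is Unicode-aware), otherwise an empty string.
    def replacement(match):
        next_char = match.string[match.end():match.end() + 1]
        return " " if next_char.isupper() else ""

    return re.sub(r"\r", replacement, text)


sample = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"
print(fix_carriage_returns(sample))  # BCP- Engineering Systems Support
```

A trailing \r has no following character, so it is simply removed.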

edited Aug 27, 2019 at 20:02

Sajeer Noohukannu

answered Aug 27, 2019 at 16:34

Brigadeiro

Will this work on a dataframe, e.g. df_replace = df.replace(to_replace=r"\r", value = "@", regex=True)? Can I substitute your search pattern here?
– Abdulquadir Shaikh, Commented over a year ago

Abdulquadir Shaikh · Accepted Answer · 2019-08-27 18:50:05Z

0

I was able to find a solution for this:

df_replace2 =  df.replace(to_replace = r"(\r)(?![A-Z])", value = "", regex=True)
df_replace3 = df_replace2.replace(to_replace = r"(\r)(?![a-z])", value = " ", regex=True)
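To see the two-step replace run end to end, here is a small self-contained sketch. The column name "team" and the sample rows are assumptions made up for illustration; the patterns are the same lookaheads as above, written without the extra capture group.

```python
import pandas as pd

# Hypothetical sample data: one row from the question, one row with \r before a digit
df = pd.DataFrame(
    {"team": ["BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort", "Room\r42"]}
)

# Step 1: drop every \r that is not followed by an uppercase ASCII letter
step1 = df.replace(to_replace=r"\r(?![A-Z])", value="", regex=True)
# Step 2: the remaining \r all precede an uppercase letter; turn them into spaces
cleaned = step1.replace(to_replace=r"\r(?![a-z])", value=" ", regex=True)

print(cleaned["team"].tolist())
```

With regex=True, df.replace applies the pattern to every string cell in the dataframe, so no per-column loop is needed.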

Thanks @Brigadeiro for guiding me to the solution.

answered Aug 27, 2019 at 18:50

Abdulquadir Shaikh
