Remove part of a string using regex in Python

Question

I have a text file that contains information in the format below.

2018/03/21-17:08:48.638553  508     7FF4A8F3D704     snononsonfvnosnovoosr
2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom
2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia
2018/03/21-17:08:50.980519  16K     7FE9BD1AF707     user: number is 93823004
2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700
2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0
2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:

Expected Result

I want to remove everything up to and including the third column (for example, 7FF4A8F3D704 in the first line). The result should look like this:

snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:

Solution

I can remove the timestamp (for example "2018/03/21-17:08:48.638553") with the code below, but I want to replace the whole leading part with ''.

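import re
# Matches the leading timestamp; the dot before the microseconds is escaped
# so that it only matches a literal '.'
Regex_list = [r'\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{6}']
for p in Regex_list:
    # file is assumed to hold the contents of the text file as a string
    text = re.sub(p, ' ', file)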

SpghttCd · Accepted Answer · 2019-01-27 00:44:59Z

1

If this is the exact structure of your text file, why don't you simply cut off the first n uninteresting characters?

for line in txt.splitlines():
    print(line[53:])


#snononsonfvnosnovoosr
#ahelooa afoaona woom
#qojfcmqcacaeia
#user: number is 93823004
#user 7fb31ecfa700
#Exit Status = 0x0
#std:ZMD:
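
The slice relies on every prefix being exactly 53 characters wide. Here is a minimal sketch of that assumption, rebuilding two of the question's rows with explicit padding (26-character timestamp, right-aligned size column, 12-character id, five-space gutters). The helper name `strip_fixed_prefix` and the constant `PREFIX_WIDTH` are made up for illustration.

```python
PREFIX_WIDTH = 53  # timestamp + size + id columns, as laid out in the question's sample


def strip_fixed_prefix(text, width=PREFIX_WIDTH):
    """Return each line of text with its first `width` characters removed."""
    return [line[width:] for line in text.splitlines()]


# Rebuild two sample rows with explicit padding so the column widths are visible.
rows = [
    ("2018/03/21-17:08:48.638553", "508", "7FF4A8F3D704", "snononsonfvnosnovoosr"),
    ("2018/03/21-17:08:50.980519", "16K", "7FE9BD1AF707", "user: number is 93823004"),
]
sample = "\n".join(f"{ts} {size:>4}     {tid}     {msg}" for ts, size, tid, msg in rows)

for message in strip_fixed_prefix(sample):
    print(message)
```

If any column ever grows wider than in the sample, the cut lands in the wrong place, so this only suits files whose layout is truly fixed.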

edited Jan 27, 2019 at 0:44

answered Jan 27, 2019 at 0:39


CertainPerformance · Accepted Answer · 2019-01-27 00:25:30Z

0

Because it looks like the first 3 column values won't ever have any spaces in them, match \S+\s+ to get a column value and its associated space-padding to the right, and repeat it 3 times:

output = re.sub(r'(?m)^(?:\S+\s+){3}', '', text)

https://regex101.com/r/YHXTJs/1
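
A minimal sketch of this substitution run against two of the question's rows. It passes `re.MULTILINE` instead of the inline `(?m)` flag and uses `[ \t]+` instead of `\s+`. That second change is a variation of mine, not part of the answer, made so that a match cannot run across a line break. `strip_columns` and `sample` are illustrative names.

```python
import re

# Three runs of non-space characters, each followed by spaces or tabs, at line start.
# [ \t]+ (rather than \s+) is an assumption here: it keeps each match on one line.
PREFIX = re.compile(r"^(?:\S+[ \t]+){3}", re.MULTILINE)


def strip_columns(text):
    """Remove the first three whitespace-separated columns from every line."""
    return PREFIX.sub("", text)


sample = (
    "2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia\n"
    "2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0\n"
)

print(strip_columns(sample))
```

Because the pattern is anchored with `^`, spaces inside the remaining message (as in "Exit Status = 0x0") are left alone.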

answered Jan 27, 2019 at 0:25


Chris Charley · Accepted Answer · 2019-01-27 01:13:33Z

0

Another approach uses re.split(), splitting on one or more whitespace characters and limiting the number of splits to 3, so that everything after the third field stays together in the last element. This assumes that there are no spaces in the first three fields.

for data in L.splitlines():
    # maxsplit is passed by keyword; passing it positionally is deprecated since Python 3.13
    print(re.split(r'\s+', data, maxsplit=3)[-1])

Output:

snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:
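
For comparison, a regex-free sketch: `str.split` with no separator also splits on runs of whitespace and accepts a maximum number of splits. This is a swapped-in technique, not the method used in this answer, and it relies on the same assumption that the first three fields contain no spaces. `last_field` is a hypothetical helper name.

```python
def last_field(line, skip=3):
    """Return what follows the first `skip` whitespace-separated fields."""
    parts = line.split(None, skip)  # sep=None splits on runs of whitespace
    # Lines with fewer than `skip` + 1 fields have no message part.
    return parts[skip] if len(parts) > skip else ""


sample = """\
2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom
2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:
"""

for line in sample.splitlines():
    print(last_field(line))
```

Unlike indexing with `[-1]`, the length check returns an empty string for a short or blank line rather than one of the leading columns.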

answered Jan 27, 2019 at 1:13
