
I have a text file that contains information in the format below.

2018/03/21-17:08:48.638553  508     7FF4A8F3D704     snononsonfvnosnovoosr
2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom
2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia
2018/03/21-17:08:50.980519  16K     7FE9BD1AF707     user: number is 93823004
2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700
2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0
2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:

Expected Result

I want to remove everything up to and including the third column (for example, 7FF4A8F3D704 in the first line). The result should look like this:

snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:

Solution

I can remove the timestamp (for example "2018/03/21-17:08:48.638553") with the code below, but I want to replace the whole prefix with ''.

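import re

# `file` holds the text read from the log file.
# The dot before the microseconds is escaped so it only matches a literal '.'.
Regex_list = [r'\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{6}']
text = file
for p in Regex_list:
    text = re.sub(p, ' ', text)  # feed the previous result back in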

3 Answers

If every line has exactly this fixed-width layout, why not simply cut off the first 53 characters of each line?

for line in txt.splitlines():
    print(line[53:])


#snononsonfvnosnovoosr
#ahelooa afoaona woom
#qojfcmqcacaeia
#user: number is 93823004
#user 7fb31ecfa700
#Exit Status = 0x0
#std:ZMD:
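
The offset of 53 follows from the column widths in the sample: 26 characters of timestamp, a space, a 4-character size field, 5 spaces, a 12-character hex ID, and 5 more spaces. Here is a minimal sketch of the slicing approach, assuming every line in the file uses exactly these widths. The names `SAMPLE`, `PREFIX_WIDTH`, and `strip_fixed_prefix` are made up for illustration.

```python
# Three lines copied from the question's sample data.
SAMPLE = (
    "2018/03/21-17:08:48.638553  508     7FF4A8F3D704     snononsonfvnosnovoosr\n"
    "2018/03/21-17:08:48.985053 346K     7FE9D2D51706     ahelooa afoaona woom\n"
    "2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0\n"
)

# Assumed fixed layout: 26 (timestamp) + 1 + 4 (size) + 5 + 12 (hex ID) + 5.
PREFIX_WIDTH = 26 + 1 + 4 + 5 + 12 + 5  # 53


def strip_fixed_prefix(text, width=PREFIX_WIDTH):
    """Return each line of `text` with its first `width` characters removed."""
    return [line[width:] for line in text.splitlines()]


messages = strip_fixed_prefix(SAMPLE)
for message in messages:
    print(message)
```

This breaks silently if any column is wider than in the sample (for instance a longer size value), so it is only suitable when the layout is guaranteed to be fixed.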

Because it looks like the first 3 column values won't ever have any spaces in them, match \S+\s+ to get a column value and its associated space-padding to the right, and repeat it 3 times:

output = re.sub(r'(?m)^(?:\S+\s+){3}', '', input)

https://regex101.com/r/YHXTJs/1
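
As a runnable sketch of the same idea, the snippet below applies the substitution to a few lines from the question. It differs from the one-liner in one respect, which is an assumption on my part rather than something the answer requires: it uses `[ \t]+` instead of `\s+`, so that a line with fewer than three columns cannot cause the match to spill over into the next line. The names `SAMPLE` and `LEADING_COLUMNS` are illustrative.

```python
import re

# Three lines copied from the question's sample data.
SAMPLE = (
    "2018/03/21-17:08:50.980519  16K     7FE9BD1AF707     user: number is 93823004\n"
    "2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700\n"
    "2018/03/21-17:08:51.066968    1     7FE9BDC91700     std:ZMD:\n"
)

# At the start of each line (re.MULTILINE), match three repetitions of
# "a run of non-space characters followed by spaces or tabs".
# [ \t]+ cannot match a newline, so the match never crosses a line boundary.
LEADING_COLUMNS = re.compile(r'^(?:\S+[ \t]+){3}', re.MULTILINE)

cleaned = LEADING_COLUMNS.sub('', SAMPLE)
print(cleaned)
```

Spaces inside the remaining message (such as in "user: number is 93823004") are untouched, because the pattern stops after exactly three columns.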


Another approach uses re.split(), splitting on one or more whitespace characters and limiting the number of splits to 3. This assumes that there are no spaces in the first three fields.

for data in L.splitlines():
    print(re.split(r'\s+', data, 3)[-1])

Output:

snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:
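
A possible variant, not part of the answer above, is the built-in `str.split` with `maxsplit`, which also splits on runs of whitespace when no separator is given. The sketch below additionally guards against lines with fewer than four fields, where taking `[-1]` would return one of the leading columns instead of a message. The helper name `message_part` is made up for illustration.

```python
def message_part(line, leading_fields=3):
    """Return the text after the first `leading_fields` whitespace-separated columns.

    Returns an empty string if the line has no text beyond those columns.
    """
    # With sep=None, split() treats any run of whitespace as one separator.
    parts = line.split(None, leading_fields)
    return parts[leading_fields] if len(parts) > leading_fields else ''


# Three lines copied from the question's sample data.
lines = [
    "2018/03/21-17:08:50.486601 1.5M     7FE9D3D41706     qojfcmqcacaeia",
    "2018/03/21-17:08:50.981908 1389     7FE9BDC2B707     user 7fb31ecfa700",
    "2018/03/21-17:08:51.066967    0     7FE9BDC91700     Exit Status = 0x0",
]

messages = [message_part(line) for line in lines]
for message in messages:
    print(message)
```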
