Context:

After running the following command on my server:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 > analisis.txt

I get a text file with thousands of lines like this example:

loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423

(X represents incrementing numbers)

Problem:

The output is full of data I don't need. The only parts I need from each line are the MSISDN and the IMSI, comma-separated, like this:

5281181XXXXX,3341100102036XX

Steps I tried:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022| grep -o -P '(?<=SubId-).*?(?=, REQ)' > analisis1.txt

This gave me the first part of the solution

5281181XXXXX

However, when I tried to get the second string located between '334110' and "-"

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022| grep -o -P '(?<=SubId-).*?(?=, REQ)' | grep -o -P '(?<=334110).*?(?=-)' > analisis1.txt

it doesn't work.

Any input will be appreciated.

1 Answer

To match both 5281181XXXXX and the second string, located between '334110' and '-', you can use a pattern like:

\b(?:SubId-|334110)\K[^,\s-]+

The pattern matches:

  • \b A word boundary to prevent a partial word match
  • (?: Non-capturing group that holds the two alternatives
    • SubId- Match literally
    • | Or
    • 334110 Match literally
  • ) Close the non capture group
  • \K Forget what is matched so far
  • [^,\s-]+ Match 1+ occurrences of any char except a comma, a whitespace char, or -

See the matches in this regex demo.

That will match:

5281181XXXXX
0102036XX
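To try the pattern without touching the server logs, here is a minimal sketch that runs it against the sample line from the question. It assumes GNU grep built with PCRE support (the -P option); the function name extract_ids is only an illustrative helper, not part of the original commands.

```shell
# Sample line copied from the question (X stands for masked digits).
line='loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423'

# Hypothetical helper: print every match of the pattern, one per line.
# -o prints only the matched text, -P enables Perl-compatible regexes (needed for \K).
extract_ids() {
  grep -oP '\b(?:SubId-|334110)\K[^,\s-]+'
}

printf '%s\n' "$line" | extract_ids
```

Because \K discards the literal 334110, the second match is only the part of that string that follows 334110.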

The command could look like:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 | grep -oP '\b(?:SubId-|334110)\K[^,\s-]+' > analisis1.txt

2 Comments

Thank you for the excellent answer. I wonder how I could concatenate the output of the regex to get both the 5281181XXXXX and 3341100102036XX portions on one line, comma-separated. I saw examples using scripts, but maybe there is a more elegant way inside the regex black magic book.
@FelipeLaRotta From what I can see, for example on this page, you could pipe the output to | tr '\n' ',' so I think the command would be:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 | grep -oP '\b(?:SubId-|334110)\K[^,\s-]+' | tr '\n' ',' > analisis1.txt

Or you can pipe it to awk:

| awk '{print}' ORS=', '
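Note that tr '\n' ',' joins every match into one long line, and the grep pattern drops the leading 334110 from the second value. If the goal is one MSISDN,IMSI pair per log line, as asked in the question, a single sed substitution with two capture groups is one possible alternative. This is only a sketch: it assumes every matching line has the same layout as the sample (the IMSI sits after the last ';' of the SId field and ends at the next '-', followed by ', RATTYPE'), and the function name msisdn_imsi is hypothetical.

```shell
# Sample line copied from the question (X stands for masked digits).
line='loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423'

# Hypothetical helper: print "MSISDN,IMSI" for each input line that fits the layout.
# Group 1: text after "SubId-" up to the next comma.
# Group 2: text after the last ";" up to the next "-".
# -n with the trailing p prints only lines where the substitution succeeded.
msisdn_imsi() {
  sed -nE 's/.*SubId-([^,]+),.*;([^;-]+)-[^,]*, RATTYPE.*/\1,\2/p'
}

printf '%s\n' "$line" | msisdn_imsi

# On the server this could replace the grep -oP stage, for example:
#   zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 | msisdn_imsi > analisis1.txt
```

Lines that do not fit the assumed layout are silently skipped, so it is worth comparing the line count of the result with the number of lines zgrep returns.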
