I have specific log messages and I would like to parse them into groups. I would also like the same regex to handle a more detailed variant of the log line.

My logs:

18:48:24:284 => [DEBUG] [xxx.yyy.zzz] [8] Message1
18:48:24:671 => [INFO] [uuu.www.aaa] [8] Method: 'ReturnType MethodName(MethodParameter)'. Line: ~30. Message2

I have written the following regex:

(?<timestamp>\d+:\d+:\d+:\d+.*)\s+=>\s+\[(?<level>\w+)\]\s+\[(?<emmiter>.*)\]\s+\[(?<thread>\d+)\]\s+(?<message>.*)

It parses these messages into specific groups:

timestamp: 18:48:24:284
level: DEBUG
emmiter: xxx.yyy.zzz
thread: 8
message: Message1

timestamp: 18:48:24:671
level: INFO
emmiter: uuu.www.aaa
thread: 8
message: Method: 'ReturnType MethodName(MethodParameter)'. Line: ~30. Message2

Now I would like to add two more groups, filled only when they exist in the line: method and line.

So, I would like to get results like this:

timestamp: 18:48:24:284
level: DEBUG
emmiter: xxx.yyy.zzz
thread: 8
method:
line: 
message: Message1

timestamp: 18:48:24:671
level: INFO
emmiter: uuu.www.aaa
thread: 8
method: ReturnType MethodName(MethodParameter)
line: ~30
message: Message2

Can you please help me with that? Everything I do results in parsing only Line1 or only Line2 properly, but I would like to parse them both with one regex.

  • Which language are you running? Where are your attempts? Commented Jun 28, 2015 at 17:24
  • Which development environment / language are you using? Please add appropriate tags to your question! Also, please provide the regex you have already written! Commented Jun 28, 2015 at 17:28
  • I'm using it in an external application. I assume it's Java. I have updated the post with the current state of my regex. Commented Jun 28, 2015 at 17:29
  • OK! What application is that? Commented Jun 28, 2015 at 17:30
  • @SQLPolice I'm writing regex parser for LogMX application Commented Jun 28, 2015 at 17:31

1 Answer

I can suggest the following regular expression:

(?<timestamp>\d+:\d+:\d+:\d+.*)\s+=>\s+\[(?<level>\w+)\]\s+\[(?<emmiter>.*)\]\s+\[(?<thread>\d+)\](?:\s+Method:\s'(?<method>[^']*)'\s*\.)?(?:\s*Line:\s*(?<line>.+)\.)?\s*(?<message>.*)
                                                                                                                     ^^^^^^              ^                  ^^^^       ^

See demo here

I added two optional named groups, method and line. Each is wrapped in a non-capturing group with a ? quantifier, (?:...)?, so the regex still matches when that part of the log line is absent.
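To illustrate how the optional groups behave, here is a minimal Java sketch that applies the regex above to the two sample lines from the question. It assumes the host application uses java.util.regex (the question only guesses that it is Java); the class name OptionalGroupsDemo and the helper group(...) are made up for this example. In Java, a named group inside a skipped optional part is returned as null rather than as an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupsDemo {
    // The regex from this answer, split across string literals for readability.
    static final Pattern LOG_LINE = Pattern.compile(
        "(?<timestamp>\\d+:\\d+:\\d+:\\d+.*)\\s+=>\\s+\\[(?<level>\\w+)\\]"
        + "\\s+\\[(?<emmiter>.*)\\]\\s+\\[(?<thread>\\d+)\\]"
        + "(?:\\s+Method:\\s'(?<method>[^']*)'\\s*\\.)?"   // optional method part
        + "(?:\\s*Line:\\s*(?<line>.+)\\.)?"                // optional line part
        + "\\s*(?<message>.*)");

    // Hypothetical helper: returns the named group, or null when the whole
    // line does not match or the optional group was skipped.
    static String group(String logLine, String name) {
        Matcher m = LOG_LINE.matcher(logLine);
        return m.matches() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "18:48:24:284 => [DEBUG] [xxx.yyy.zzz] [8] Message1",
            "18:48:24:671 => [INFO] [uuu.www.aaa] [8] Method: "
                + "'ReturnType MethodName(MethodParameter)'. Line: ~30. Message2"
        };
        for (String s : samples) {
            // For the first sample, method and line print as "null".
            System.out.println("method=" + group(s, "method")
                + " | line=" + group(s, "line")
                + " | message=" + group(s, "message"));
        }
    }
}
```

If the consuming code needs empty strings instead of null, that conversion has to happen after matching; the regex itself cannot produce it.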

I suggest using (?<method>[^']*) to capture the method name as every character other than ', and Line:\s*(?<line>.+)\. to capture the line. The latter is deliberately greedy because I am not sure what text you might have there. You can replace the (?<line>.+) part with a more restrictive pattern (I thought of ~?\d+, but I have no idea whether you may have colons or anything else there).
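The difference between the greedy (?<line>.+) and the stricter ~?\d+ only shows up when the message itself contains a period. The sketch below compares the two on a made-up log line (the trailing "Message2. Done" is not from the question and is only an assumption for illustration). It again assumes java.util.regex; the names StricterLineDemo, HEAD, GREEDY, STRICT and group(...) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StricterLineDemo {
    // Everything up to and including the optional method part, as in this answer.
    static final String HEAD =
        "(?<timestamp>\\d+:\\d+:\\d+:\\d+.*)\\s+=>\\s+\\[(?<level>\\w+)\\]"
        + "\\s+\\[(?<emmiter>.*)\\]\\s+\\[(?<thread>\\d+)\\]"
        + "(?:\\s+Method:\\s'(?<method>[^']*)'\\s*\\.)?";

    // Original line part: .+ runs up to the LAST period in the text.
    static final Pattern GREEDY = Pattern.compile(
        HEAD + "(?:\\s*Line:\\s*(?<line>.+)\\.)?\\s*(?<message>.*)");

    // Restricted variant: an optional ~ followed by digits only.
    static final Pattern STRICT = Pattern.compile(
        HEAD + "(?:\\s*Line:\\s*(?<line>~?\\d+)\\.)?\\s*(?<message>.*)");

    // Returns the named group, or null if there is no match or the group was skipped.
    static String group(Pattern p, String logLine, String name) {
        Matcher m = p.matcher(logLine);
        return m.matches() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        // Made-up input: the message contains a period of its own.
        String s = "18:48:24:671 => [INFO] [uuu.www.aaa] [8] Method: "
            + "'ReturnType MethodName(MethodParameter)'. Line: ~30. Message2. Done";
        System.out.println("greedy line: " + group(GREEDY, s, "line"));   // ~30. Message2
        System.out.println("strict line: " + group(STRICT, s, "line"));   // ~30
        System.out.println("strict message: " + group(STRICT, s, "message")); // Message2. Done
    }
}
```

The strict variant is only safe if line values really are an optional ~ followed by digits. If a line value has any other shape, the optional part is skipped and the whole "Line: ..." text ends up in message.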

1 Comment

Please do not hesitate to ask for clarifications in case you have any doubts about this. I posted some considerations above in the answer.
