I am opening this question because my original question seems to need a new direction (see my original question).
I would like to create a regular expression that can extract the STATIC MESSAGE and the DYNAMIC MESSAGE from the following types of log entries:
/long/file/name/with.dots.and.extension:Jan 01 12:00:00 TYPE Static Message;Dynamic Message
/long/file/name/with.dots.and.extension:Jan 01 12:00:00 MODULE.NAME TYPE THREAD.OR.CONNECTION.INFORMATION Static Message;Dynamic Message
One log entry type has a simple structure:
file:date TYPE STATIC;DYNAMIC
The other is harder to parse with a regex:
file:date MODULE.NAME TYPE CONNECTION.OR.THREAD STATIC;DYNAMIC
where MODULE.NAME and CONNECTION.OR.THREAD are either both present or both absent.
My regular expression so far, which works on the first type of log entry, is:
(?:.*?):(?:\w{3} \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})(?:\s+?)(?:[\S|\.]*?(?:\s*?))?(?:(?:TYPE1)|(?:TYPE2)|(?:TYPE3))(?:\s+?)(?:\S+?(?:\s+?))?(.+){1}(?:;(.+)){1}
But whenever I get to the second type of entry, I also get CONNECTION.OR.THREAD as part of my first capturing group.
I am hoping there is a way to use lookahead or lookbehind so that I can capture STATIC and DYNAMIC and skip the CONNECTION.OR.THREAD part whenever a MODULE.NAME is present.
I hope this question is clear; please refer to my original question if anything is unclear. Thank you.
EDIT: For clarification, every line of the log is different from the others. Each line starts with a file path, then a :, then the date in the format MMM DD HH:MM:SS. After that it gets tricky: either there is a MODULE.NAME (which varies), followed by the TYPE (which also varies), followed by CONNECTION.OR.THREAD (which varies), or there is just the TYPE. After that comes the STATIC MESSAGE, then a ;, then the DYNAMIC MESSAGE. Both the static and the dynamic message vary. I use the term STATIC only because an error can be, for instance, "unable to connect to server; server1.com", where the static part of the error is "unable to connect to server" and the dynamic part is "server1.com".
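To make the goal concrete, here is a minimal sketch in Python (an assumption, since no language is named above). It does not use lookahead or lookbehind; instead it swaps in a plain alternation, so that MODULE.NAME and CONNECTION.OR.THREAD can only be matched together or not at all. It assumes that TYPE comes from a known set (TYPE1, TYPE2 and TYPE3 are placeholders), that MODULE.NAME and CONNECTION.OR.THREAD contain no whitespace, and that the static message contains no ;. The names TYPES, DATE, LOG_LINE and extract are hypothetical.

```python
import re

# Placeholder TYPE values (assumption): replace with the real set of types.
TYPES = r"(?:TYPE1|TYPE2|TYPE3)"
# Date in the form MMM DD HH:MM:SS, as in the pattern from the question.
DATE = r"\w{3} \d{1,2} \d{1,2}:\d{1,2}:\d{1,2}"

LOG_LINE = re.compile(
    rf"^.*?:{DATE}\s+"               # file path, colon, date
    rf"(?:{TYPES}\s+"                # short form: TYPE only
    rf"|\S+\s+{TYPES}\s+\S+\s+)"     # long form: MODULE.NAME TYPE CONNECTION.OR.THREAD
    r"([^;]+);\s*(.+)$"              # group 1: static message, group 2: dynamic message
)

def extract(line):
    """Return (static, dynamic) for a matching log line, or None otherwise."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return m.group(1).strip(), m.group(2).strip()

short = "/long/file/name/with.dots.and.extension:Jan 01 12:00:00 TYPE1 Static Message;Dynamic Message"
long_ = ("/long/file/name/with.dots.and.extension:Jan 01 12:00:00 "
         "MODULE.NAME TYPE2 THREAD.OR.CONNECTION.INFORMATION Static Message;Dynamic Message")
print(extract(short))
print(extract(long_))
```

Because the two forms are separate branches of one alternation, the optional tokens cannot be matched independently of each other, which is the property the pattern in the question lacks. If the real module or thread fields can contain spaces, this sketch would need a different delimiter rule.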