I am trying to grab values from a log file using Python's regular expressions, and in the process I use a few if statements. The section of the code that grabs the values is as follows:

# Opening the log file for reading
with open(logFile, 'r') as logfile_read:
    for line in logfile_read:
        line = line.rstrip()

        # To extract Time or iteration
        if 'Time' in line:
            iteration_time = re.findall(r'^Time\s+=\s+(.*)', line)

        # To extract local, global and cumulative values
        if 'local' in line:
            local_global_cumu = re.search(r'sum\s+local\s+=\s+(.*),\s+global\s+=\s+(.*),\s+cumulative\s+=\s+(.*)', line)
            if local_global_cumu:
                contLocal_0_value = local_global_cumu.group(1)
                contGlobal_0_value = local_global_cumu.group(2)
                contCumulative_0_value = local_global_cumu.group(3)
            for t in iteration_time:
                contLocal.write("%s\t%s\n" %(t, contLocal_0_value))
                contGlobal.write("%s\t%s\n" %(t, contGlobal_0_value))
                contCumulative.write("%s\t%s\n" %(t, contCumulative_0_value))

        # To extract execution and cpu time
        if 'ExecutionTime' in line:
            execution_cpu_time = re.search(r'^ExecutionTime\s+=\s+(.*)\s+s\s+ClockTime\s+=\s+(.*)\s+s', line)
            if execution_cpu_time:
                execution_time_0_value = execution_cpu_time.group(1)
                cpu_time_0_value = execution_cpu_time.group(2)
            for t in iteration_time:
                print t

In the second if statement, I am able to get the values of t. However, in the subsequent if statement, when I try to print t, nothing is printed. I am not sure where I have gone wrong.

  • If is not a loop. And you're using if incorrectly. Commented Jan 20, 2015 at 2:25
  • I'd guess that iteration_time is empty, or the body of if ('ExecutionTime' or 'ClockTime') in line: is never executed. Commented Jan 20, 2015 at 2:27
  • Thanks @Ashwini Chaudhary. Can you give me some hints please? Commented Jan 20, 2015 at 2:28
  • if ('ExecutionTime' or 'ClockTime') in line: is working correctly. I tested this by printing both execution_time_0_value and cpu_time_0_value and they are successful. Thanks @Rawing Commented Jan 20, 2015 at 2:30
  • Another thing I noticed: if I insert print iteration_time immediately before the second if statement, I get all the values; however, if I do the same immediately after the if statement, I get an empty list. Commented Jan 20, 2015 at 2:37

1 Answer

The following checks whether "Time" is a substring of the line, then attempts to find all matches on a line that begins with "Time"...

if 'Time' in line:
    iteration_time = re.findall(r'^Time\s+=\s+(.*)', line)

The following also contains the word "Time":

if 'ExecutionTime' in line:
    execution_cpu_time = re.search(r'^ExecutionTime\s+=\s+(.*)\s+s\s+ClockTime\s+=\s+(.*)\s+s', line)

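By the time the code loops over iteration_time, the list is already empty. The first if has just run on the same line (because "Time" is a substring of "ExecutionTime"), and since the regex requires the line to start with "Time", re.findall returned an empty list, which overwrote the previous value.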

Let's just pretend you have a single line, starting with "ExecutionTime", and let's walk through it...

  • if 'Time' in line is true, so the re.findall runs and returns all matches for the line that starts with 'Time'... This will be empty because the line doesn't start with 'Time' - so iteration_time = []
  • if 'ExecutionTime' in line is true, and the line does start with 'ExecutionTime', when you do the for t in iteration_time - it won't loop, because the above has set it to be empty!
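
The walk-through above can be reproduced in a few lines. This is only a minimal sketch: the two sample lines are made up for illustration and are not taken from the asker's log file, and the results list is a hypothetical helper used to record what re.findall returns each time the first if fires.

```python
import re

# Hypothetical sample lines, invented for illustration only
sample_lines = [
    "Time = 0.5",
    "ExecutionTime = 1.2 s  ClockTime = 2 s",
]

results = []
for line in sample_lines:
    # Substring test: true for BOTH lines, since "ExecutionTime" contains "Time"
    if 'Time' in line:
        # Anchored regex: only matches when the line actually starts with "Time"
        iteration_time = re.findall(r'^Time\s+=\s+(.*)', line)
        results.append(iteration_time)

print(results)  # [['0.5'], []]
```

The second entry is the empty list that replaces the earlier value of iteration_time just before the ExecutionTime block tries to loop over it.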

5 Comments

Thanks @Jon Clements. In my log file, Time and ExecutionTime are on two different lines, and the regex uses ^, which only matches at the start of a line. Both Time and ExecutionTime are at the start of two different lines.
Yes, but "Time" is in "ExecutionTime" which means that first if actually runs... the findall will return nothing because the line doesn't start with "Time" it starts with "ExecutionTime"... get it? So when the if 'ExecutionTime' block runs, iteration_time will be emptied by the first if...
Thanks @JonClements. However, I am still a bit confused. Look at my typical log file here: stackoverflow.com/questions/28017121/…
@Deepak it doesn't matter how your data looks, it's just logic... I've tried to spell it out with an update to the answer - if you don't get that, there's not much more I can do I'm afraid :)
@Deepak anyway - probably a quick fix is to change your line from if 'Time' in line to be if line.startswith('Time')
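
For reference, the quick fix suggested in the last comment might look roughly like the sketch below. It assumes the same regexes as the question; parse_lines, pairs, and the sample lines are hypothetical names and data introduced here for illustration, and the results are collected in a list rather than printed or written to the asker's output files.

```python
import re

def parse_lines(lines):
    """Pair each ExecutionTime/ClockTime value with the most recent Time value."""
    iteration_time = []
    pairs = []
    for line in lines:
        line = line.rstrip()

        # startswith() no longer fires on lines that merely contain "Time"
        if line.startswith('Time'):
            iteration_time = re.findall(r'^Time\s+=\s+(.*)', line)

        if line.startswith('ExecutionTime'):
            match = re.search(
                r'^ExecutionTime\s+=\s+(.*)\s+s\s+ClockTime\s+=\s+(.*)\s+s', line)
            if match:
                for t in iteration_time:
                    pairs.append((t, match.group(1), match.group(2)))
    return pairs

# Hypothetical sample input, invented for illustration only
sample = ["Time = 0.5\n", "ExecutionTime = 1.2 s  ClockTime = 2 s\n"]
print(parse_lines(sample))  # [('0.5', '1.2', '2')]
```

With the substring test replaced, iteration_time keeps its value from the Time line, so the loop in the ExecutionTime block has something to iterate over.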
