I'm a beginner with Python, and I'm trying to learn through Google and some books... I'm working on a specific project and doing OK with it so far...

The first part of my program takes an input text file and scans each line for certain data. It then writes the line back out to a new file if it doesn't satisfy the search criteria...

What I've done is ugly as hell, and it's also very slow... When I run it on a Raspberry Pi, this part alone takes 4 seconds (the input file is just over 1700 lines of text).

Here's my effort:

    with open('mirror2.txt', mode='r') as fo:
        lines = fo.readlines()
        with open('temp/data.txt', mode='w') as of:
            for line in lines:
                date = 0
                page = 0
                dash = 0
                empty = 0
                if "Date" in line: date += 1
                if "Page" in line: page += 1
                if "----" in line: dash += 1
                if line == "\n":   empty += 1
                sum = date + page + dash + empty
                if sum == 0:
                    of.write(line)
                else:()

I'm embarrassed to show that in public, but I'd love to see a 'pythonic' way to do it more elegantly (and quicker!)

Anyone help?

  • You don't want to use sum as a variable name; Python already has it as a built-in. How about total instead? Commented Aug 7, 2012 at 23:46
  • Is it possible for Date, Page and ---- to be on the same line? If not, you could use elif, so that the remaining ifs are not tested once one of them is true. Also, what's with the else:() part? Commented Aug 7, 2012 at 23:47
  • No, they'll be on different lines, basically I need to scan for those strings, and throw out the lines with the strings in... Removing whitespace at the same time. Commented Aug 7, 2012 at 23:52
  • This question is better suited for codereview.stackexchange.com Commented Aug 7, 2012 at 23:54
  • Actually @Levon... I thought the 'if' statement needed a corresponding 'else' statement, so that says 'else do nothing'... Is that not necessary? Commented Aug 8, 2012 at 0:45
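
The points raised in the comments above can be sketched roughly as follows. This is only an illustration, assuming the markers never need to be counted separately: an `if` works fine without an `else`, and the four checks collapse into a single test. The names `MARKERS` and `should_skip` and the sample lines are made up for the example.

```python
# Hypothetical helper names; the marker strings come from the question.
MARKERS = ("Date", "Page", "----")


def should_skip(line):
    """Return True for blank lines and lines containing any marker."""
    # any() stops at the first marker found, much like an elif chain would.
    return line == "\n" or any(marker in line for marker in MARKERS)


# Made-up sample data standing in for the input file.
sample = ["Date: 07/08/2012\n", "\n", "some data\n", "-----\n", "Page 3\n"]

# No else branch is needed: lines that fail the test are simply not kept.
kept = [line for line in sample if not should_skip(line)]
```

This also avoids the variable named `sum`, so the built-in stays usable.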

2 Answers

To answer your question, here's a pythonic way of doing this:

    import re

    expr = re.compile(r'(?:Date|Page|----|^$)')
    with open('mirror2.txt', mode='r') as infile:
        with open('data.txt', mode='w') as outfile:
            for line in infile:
                if not expr.search(line.strip()):
                    outfile.write(line)
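
To see what the pattern does, here is a small sketch run on made-up sample lines (the sample text and the name `keep_line` are assumptions, not part of the question). `(?:...)` groups the alternatives without capturing, `|` means "or", and `^$` matches only an empty string, which is what a blank line becomes after `strip()`.

```python
import re

expr = re.compile(r'(?:Date|Page|----|^$)')


def keep_line(line):
    """Return True if the line should be written to the output file."""
    # search() looks for any of the alternatives anywhere in the stripped line.
    return expr.search(line.strip()) is None


# Made-up sample lines: a date header, a whitespace-only line,
# ordinary data, and a dashed separator.
samples = ["Date: 2012-08-07\n", "   \n", "widget 42\n", "------\n"]
results = [keep_line(s) for s in samples]
```

One difference from the code in the question: because of `strip()`, lines containing only spaces or tabs are dropped as well, not just lines equal to `"\n"`.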

1 Comment

Thanks, that's much nicer to look at! I'll use that, and read up on regex so I can actually understand what it does! I tested it and it didn't yield any speed improvement, but it looks much more efficient than my ugly kludge above!

The Pythonic way to read a file line by line, adapted to your case, would be:

    with open('mirror2.txt', mode='r') as fo:
        for line in fo:
            pass  # the rest of the per-line filtering goes here

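Iterating over the file object reads one line at a time, whereas readlines() first loads the whole file into a list. If this speeds up your program noticeably, it would suggest that the Python interpreter is not doing a very good job of managing memory on ARM processors.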
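
A rough sketch of the idea, using in-memory files so it runs anywhere (`filter_lines` and the sample text are hypothetical, not from the question): iterating over the file object hands over one line at a time, so the whole file never has to sit in a list.

```python
import io


def filter_lines(infile, outfile, markers=("Date", "Page", "----")):
    """Copy lines from infile to outfile, skipping blank or marked lines."""
    written = 0
    for line in infile:  # reads lazily, one line per iteration
        if line == "\n" or any(m in line for m in markers):
            continue
        outfile.write(line)
        written += 1
    return written


# io.StringIO stands in for the real files returned by open().
source = io.StringIO("Page 1\nalpha\n\n----\nbeta\n")
target = io.StringIO()
count = filter_lines(source, target)
```

With real files, the same function could be called with the two objects returned by `open('mirror2.txt')` and `open('temp/data.txt', mode='w')`.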

The rest has already been mentioned in comments.

1 Comment

Thanks for that, I didn't know I could address the file directly like that... This gave a bit of speed improvement, but it still seems slow to me. Went from around 4s to around 3s so 25% isn't bad! Maybe the Raspberry Pi hardware is the bottleneck, I'm writing the code on a 64-bit win7 laptop, so it runs instantly there...
