
I am reading a text file with more than 10,000 lines.

results_file = open("Region_11_1_micron_o", 'r')

I would like to skip to the line in the file after a particular string "charts" which occurs at around line no. 7000 (different for different files). Is there a way to conveniently do that without having to read each single line of the file?


3 Answers


If you know the precise line number, you can use Python's linecache module to read that particular line. You don't need to open the file yourself.

import linecache

line = linecache.getline("test.txt", 3)
print(line)

Output:

chart

If you want to start reading from that line, you can use islice.

from itertools import islice

with open('test.txt','r') as f:
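    for line in islice(f, 2, None):  # islice counts from 0: skip 2 lines, start at line 3
        print(line, end='')  # each line already ends with a newline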

Output:

chart
dang!
It
Works

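If you don't know the precise line number and want to start after the line containing that particular string, use a nested for loop over the same file object. The inner loop picks up exactly where the outer loop stopped, because both consume the same iterator.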

with open('test.txt','r') as f:
    for line in f:
        if "chart" in line:
            for line in f:
                # Process every line after the match here
                print(line, end='')  # each line already ends with a newline

Output:

dang!
It
Works

test.txt contains:

hello
world!
chart
dang!
It
Works

I don't think you can skip directly to a particular line number. To do that, you must already have gone through the file and stored the lines (or their positions) in some form. In any case, you need to traverse the file at least once.
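
To illustrate what "going through the file once and storing something" could look like, here is a minimal sketch that makes a single pass to record the byte offset at which each line starts; after that, any line can be reached with seek. The helper names build_line_index and read_from_line are made up for this example, the files are opened in binary mode so that the offsets are plain byte positions, and UTF-8 is assumed when decoding.

```python
import os
import tempfile


def build_line_index(path):
    """Return a list where entry i is the byte offset at which line i (0-based) starts."""
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for raw_line in f:
            offsets.append(position)
            position += len(raw_line)
    return offsets


def read_from_line(path, offsets, line_number):
    """Yield decoded lines starting at the 0-based line_number, using a prebuilt index."""
    with open(path, "rb") as f:
        f.seek(offsets[line_number])  # jump straight to the start of that line
        for raw_line in f:
            yield raw_line.decode("utf-8")  # assumption: the file is UTF-8


# Demo on a temporary file with the same contents as test.txt above.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"hello\nworld!\nchart\ndang!\nIt\nWorks\n")
    demo_path = tmp.name

index = build_line_index(demo_path)
lines_after_chart = [line.rstrip("\n") for line in read_from_line(demo_path, index, 3)]
print(lines_after_chart)
os.remove(demo_path)
```

The index only pays off if you read the same file more than once; for a single read, the nested loop is simpler.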


5 Comments

linecache internally reads the whole file into memory, so it contradicts the OP's requirement: 'Is there a way to conveniently do that without having to read each single line of the file'.
@erhesto Yes, but I think if you want to go somewhere, you need to have the data somewhere, right? Take a list, for example: how would you go to a particular line when you don't have the data stored somewhere? Correct me if I am wrong.
Well, I totally agree with you! I'd just add to your answer that it might be hard to find a deterministic algorithm that accomplishes this task without reading the file at least once. Of course, it is possible in some cases (for example, if every line has a predefined number of characters, so that we know exactly where the line breaks are), but not in general.
Thank you. The thing is, I do not always know the exact line number. I have to look for a certain string in the text file and start with the next line.
@DPdl In that case you will have to go line by line. I shall update my answer. But if you have a rough idea of the line number, you can probably make it faster by skipping some of the lines, as shown in my answer.

You can use itertools.dropwhile to consume the lines up to the point you want.

from itertools import dropwhile, islice

with open(fname) as fin:
    start_at = dropwhile(lambda L: 'Abstract' not in L.split(), fin)
    for line in islice(start_at, 1, None):
        print(line, end='')  # Python 3 print; each line already ends with a newline
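
Adapted to the marker from the question, the same dropwhile/islice idea can be wrapped in a small helper. This is only a sketch: the name lines_after and its plain substring test are my assumptions for illustration, whereas the snippet above matches a whole whitespace-separated word.

```python
from itertools import dropwhile, islice


def lines_after(lines, marker):
    """Return an iterator over the lines that follow the first line containing marker."""
    # dropwhile discards lines until one contains the marker,
    from_marker = dropwhile(lambda line: marker not in line, lines)
    # and islice(..., 1, None) then skips the marker line itself.
    return islice(from_marker, 1, None)


# A list stands in for the open file here; a file object works the same way.
sample = ["hello\n", "world!\n", "charts\n", "dang!\n", "It\n", "Works\n"]
result = [line.rstrip("\n") for line in lines_after(sample, "charts")]
print(result)
```

If the marker never occurs, the iterator is simply empty, so the calling loop does nothing.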


If the line lengths in your text file are roughly uniform, you could try seeking directly into the file, to a position a little before where you expect the string, and searching from there:

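from os import stat

size = stat(your_file).st_size
start = int(0.65 * size)  # guess: the marker lies somewhere past 65% of the file
with open(your_file, 'rb') as f:  # binary mode: seeking to an arbitrary offset is only well defined for bytes
    f.seek(start)
    buff = f.read()
marker = b'\nchart\n'
n = buff.index(marker)  # raises ValueError if the marker lies before the guessed offset
buff = buff[n + len(marker):]  # bytes after the marker line; decode them if you need text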
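
If the guess overshoots the marker, index raises ValueError. One way to cope, sketched here under my own assumptions, is to search from the guessed offset first and fall back to the start of the file when nothing is found. The function name bytes_after_marker and its fraction parameter are hypothetical, and an in-memory io.BytesIO stands in for a file opened with 'rb'.

```python
import io


def bytes_after_marker(f, marker=b"\nchart\n", fraction=0.65):
    """Return the bytes following marker in the binary file object f, or None if absent.

    The search starts at roughly `fraction` of the file size and falls back to
    the beginning of the file if the marker is not found from there.
    """
    size = f.seek(0, io.SEEK_END)  # seek returns the new absolute position
    for start in (int(fraction * size), 0):
        f.seek(start)
        buff = f.read()
        n = buff.find(marker)  # find returns -1 instead of raising
        if n != -1:
            return buff[n + len(marker):]
    return None


demo = io.BytesIO(b"hello\nworld!\nchart\ndang!\nIt\nWorks\n")
late_guess = bytes_after_marker(demo)                  # 65% is past the marker, fallback is used
early_guess = bytes_after_marker(demo, fraction=0.3)   # 30% is before the marker
print(late_guess, early_guess)
```

The fallback costs a full read in the worst case, so this is no slower than reading from the start when the guess is wrong.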
