I have some very large files (more than 100 million lines),
and I need to read their last line.
As a Linux user, I use 'tail' for that in a shell script.
Is there a fast way to read the last line of a file in Python?
Perhaps using 'seek', but I'm not familiar with it.
The best I have come up with is this:
from subprocess import run as srun
file = "/my_file"
proc = srun(['/usr/bin/tail', '-1', file], capture_output=True)
last_line = proc.stdout
All the other Pythonic code I tried is slower than calling the external /usr/bin/tail.
I also read these threads, which do not meet my needs:
How to implement a pythonic equivalent of tail -F?
Head and tail in one line
They do not fit because I want fast execution and need to avoid excessive memory use.
Edit: I tried what I understood from the comments and …
I get strange behavior:
>>> with open("./Python/nombres_premiers", "r") as f:
...     a = f.seek(0,2)
...     l = ""
...     for i in range(a-2,0,-1):
...         f.seek(i)
...         l = f.readline() + l
...         if l[0]=="\n":
...             break
...
1023648626
1023648625
1023648624
1023648623
1023648622
1023648621
1023648620
1023648619
1023648618
1023648617
1023648616
>>> l
'\n2001098251\n001098251\n01098251\n1098251\n098251\n98251\n8251\n251\n51\n1\n'
>>> with open("./Python/nombres_premiers", "r") as f:
...     a = f.seek(0,2)
...     l = ""
...     for i in range(a-2,0,-1):
...         f.seek(i)
...         l = f.readline()
...         if l[0]=="\n":
...             break
...
1023648626
1023648625
1023648624
1023648623
1023648622
1023648621
1023648620
1023648619
1023648618
1023648617
1023648616
>>> l
'\n'
How can I get l = 2001098251?
os.seek is your friend: that's the same facility that tail itself uses.

seek() with the correct arguments to skip directly to the end and so are just as efficient as tail itself. Just ignore any answer that doesn't refer to os.seek and os.SEEK_END.

f.seek(0,2), which returns an integer (the offset of the very end of the file). How do I get the last line? I don't know its length.

seek(-2, os.SEEK_END) and then something like while f.read(1) != b'\n': f.seek(-2, os.SEEK_CUR) to get to the beginning of the last line.
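
The last comment's idea can be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the function name read_last_line is made up for illustration, and it assumes the file uses "\n" line endings. It opens the file in binary mode, because text-mode files in Python 3 do not allow non-zero seeks relative to the end or the current position, then walks backwards one byte at a time from the end until it finds the newline that precedes the last line.

```python
import os


def read_last_line(path):
    """Return the last line of the file at `path` as bytes, without its trailing newline.

    Hypothetical helper for illustration; assumes "\n" line endings.
    """
    # Binary mode: text-mode files reject non-zero relative seeks.
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)  # offset of the end of the file
        if size == 0:
            return b""

        # Start on the final byte; if it is the newline that terminates
        # the last line, step over it so it does not stop the search.
        pos = size - 1
        f.seek(pos)
        if f.read(1) == b"\n":
            pos -= 1

        # Walk backwards until the newline before the last line,
        # or until the start of the file is passed (single-line file).
        while pos >= 0:
            f.seek(pos)
            if f.read(1) == b"\n":
                break
            pos -= 1

        # pos is now the preceding newline (or -1), so the line starts at pos + 1.
        f.seek(pos + 1)
        return f.readline().rstrip(b"\n")
```

The result is a bytes object, so something like read_last_line("./Python/nombres_premiers").decode() would be needed to get a str. Only the tail end of the file is read, so memory use should not depend on the file size. The earlier attempts accumulated suffixes because f.readline() reads from the current offset to the end of the line each time, and the text-mode file was being sought to arbitrary offsets.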