My code reads numbers from a large text file, normalises the spacing on each line, and places the values into a 2-dimensional list. The data feeds a job scheduler that I'm building.
# reading in workload data
def getworkload():
    work = []
    strings = []
    with open("workload.txt") as f:
        read_data = f.read()
    jobs = read_data.split("\n")
    for j in jobs:
        # collapse any run of whitespace into a single space
        strings.append(" ".join(j.split()))
    for i in strings:
        # convert each cleaned-up line into a list of floats
        work.append([float(s) for s in i.split(" ")])
    return work

print(getworkload())
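For clarity, the whole parse is equivalent to a single pass over the file. The sketch below shows the same logic without the intermediate strings list; the name getworkload_single_pass is just mine, and it assumes workload.txt is whitespace-delimited with one job per line, as in the sample:

def getworkload_single_pass():
    work = []
    with open("workload.txt") as f:
        for line in f:
            fields = line.split()  # split on any run of whitespace
            if fields:             # skip blank lines
                work.append([float(s) for s in fields])
    return work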
The text file is over 2000 lines long and looks like this:
1 0 1835117 330855 640 5886 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
2 0 2265800 251924 640 3124 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
3 1 3114175 -1 640 -1 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
4 1813487 7481 -1 128 -1 20250 -1 -1 -1 5 3 1 5 8 -1 -1 -1
5 1814044 0 122 512 1.13 1181 -1 -1 -1 1 1 1 1 9 -1 -1 -1
6 1814374 1 51 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
7 1814511 0 55 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
8 1814695 1 51 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
9 1815198 0 75 512 2.14 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
10 1815617 0 115 512 1.87 1181 -1 -1 -1 1 1 1 1 9 -1 -1 -1
…
It takes two and a half minutes to run, but it does print the returned data correctly. How can it be optimised?