I'm using Python 3.6 and I have long lists of id numbers that I'd like to cache in files and load into memory only when needed. Ideally, I'd like them to appear as list-type variables that load from file when first accessed, as follows.
""" contents of a library file, canned_lists.py """
list_of_ids = []
def event_driven_on_access_of_list_of_ids():
    # don't allow access to the empty list yet
    global list_of_ids
    with open("the_id_file.csv", "r") as f:
        list_of_ids = f.readlines()
    # now the list is ready, it can be accessed
""" contents of a calling file, script.py """
import canned_lists
for id in canned_lists.list_of_ids: # at this point, list_of_ids should populate from a file
    print("do something with the ids")
Alternative option A would be to use a function rather than a variable. This works and is really not bad, but aesthetically, I'd like to import a list and use a list rather than a function.
""" contents of a library file, canned_lists.py """
def get_list_of_ids():
    with open("the_file.csv", "r") as f:
        return f.readlines()
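If option A is used, the file need not be re-read on every call. A minimal sketch, assuming `functools.lru_cache` is acceptable for caching; the `ID_FILE` constant and the temporary demo file are hypothetical stand-ins for the real csv, and the ids are returned as a tuple so the cached value cannot be modified by callers:

```python
import functools
import os
import tempfile

# Hypothetical stand-in for the real id file, created only for this demo.
ID_FILE = os.path.join(tempfile.mkdtemp(), "the_file.csv")
with open(ID_FILE, "w") as f:
    f.write("123\n124\n")


@functools.lru_cache(maxsize=None)
def get_list_of_ids():
    """Read the ids on the first call; later calls reuse the cached tuple."""
    with open(ID_FILE, "r") as f:
        return tuple(line.strip() for line in f)


first = get_list_of_ids()   # reads the file
second = get_list_of_ids()  # served from the cache, no file access
```

The calling code still uses a function call, but the cost of reading the file is paid only once per process.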
Alternative option B would be to just store the data in code. The lists can hold nearly 20,000 ids, so option B is clumsy and hard to manage. It also causes PyCharm to warn that my file size exceeds the configured limit for Code Insight features. I could probably raise the limit, but it seems more reasonable to just move the data out of the code.
""" contents of a library file, canned_lists.py """
list_of_ids = [123, 124, 125, 126, ]
Does Python have a way to change a variable 'on access' to support the top option? Or does anyone have a better idea? I'll probably implement alternative A as it's perfectly functional, but I'm eager to learn from more advanced Pythonistas.
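For reference, here is a rough sketch of what the top option could look like in Python 3.6, assuming a read-only, list-like object is good enough. The class name `LazyIdList` and the temporary demo file are hypothetical, not part of any library; the class relies on `collections.abc.Sequence` to supply iteration, `in`, `index` and `count` from `__getitem__` and `__len__`:

```python
import os
import tempfile
from collections.abc import Sequence


class LazyIdList(Sequence):
    """Hypothetical read-only sequence that reads its file on first access."""

    def __init__(self, path):
        self._path = path
        self._items = None  # nothing is loaded at import time

    def _load(self):
        # Read the file once, on the first access, then reuse the result.
        if self._items is None:
            with open(self._path, "r") as f:
                self._items = [line.strip() for line in f if line.strip()]
        return self._items

    def __getitem__(self, index):
        return self._load()[index]

    def __len__(self):
        return len(self._load())


# Demo with a throwaway file standing in for the real id file.
demo_path = os.path.join(tempfile.mkdtemp(), "the_id_file.csv")
with open(demo_path, "w") as f:
    f.write("123\n124\n125\n")

list_of_ids = LazyIdList(demo_path)
loaded_before = list_of_ids._items is not None  # file not read yet
ids_seen = [i for i in list_of_ids]             # iteration triggers the read
loaded_after = list_of_ids._items is not None
```

In a library file, `list_of_ids = LazyIdList("the_id_file.csv")` could then be imported and iterated like a list. On Python 3.7 and later, a module-level `__getattr__` (PEP 562) is another way to compute a module attribute on first access, but that is not available in 3.6.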