I built an API wrapper module in Python with aiohttp that allows me to significantly speed up the process of making multiple GET requests and retrieving data. Every data response is turned into a pandas DataFrame.
Using asyncio I do something that looks like this:
import asyncio

from custom_module import CustomAioClient

id_list = ["123", "456"]


async def main():
    client = CustomAioClient()

    tasks = []
    for id in id_list:
        task = asyncio.ensure_future(client.get_latest_value(id=id))
        tasks.append(task)

    responses = await asyncio.gather(*tasks, return_exceptions=True)

    # Close the session
    await client.close_session()
    return responses


if __name__ == "__main__":
    asyncio.run(main())
This returns a list of pandas DataFrames, one time series per id in id_list, which I want to save as CSV files. I am a bit confused about how to proceed here.
Obviously I could just iterate over the list and save each DataFrame one by one, but this seems highly inefficient to me. Is there a way to improve things here?
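For reference, continuing the first snippet, the straightforward version I have in mind is roughly this (assuming main() returns the gathered DataFrames in the same order as id_list; the C:/ path is just where I happen to save things):

# Save each DataFrame only after all of them have been downloaded.
responses = asyncio.run(main())
for id, df in zip(id_list, responses):
    df.to_csv(f"C:/{id}.csv")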
Edit
I did the following to save the files, and it is much faster than iterating over the URLs one by one, getting the data, and saving it. I doubt whether this fully makes use of the asynchronous functionality, though.
import asyncio

from custom_module import CustomAioClient


async def fetch(client: CustomAioClient, id: str):
    df = await client.get_latest_value(id=id)
    df.to_csv(f"C:/{id}.csv")
    print(df)


async def main():
    client = CustomAioClient()
    id_list = ["123", "456"]

    tasks = []
    for id in id_list:
        task = asyncio.ensure_future(fetch(client=client, id=id))
        tasks.append(task)

    responses = await asyncio.gather(*tasks, return_exceptions=True)

    # Close the session
    await client.close_session()


if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    loop.run_until_complete(main())
Would it be better to add something like client.get_and_save(id) to the wrapper, so that, well, the getting-and-saving is done within that same async task?
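For what it's worth, my rough idea for such a method is sketched below; get_and_save itself, its save_dir argument, and the use of asyncio.to_thread are my own assumptions rather than anything CustomAioClient provides today. The thought behind asyncio.to_thread (Python 3.9+) is that to_csv is blocking disk I/O, so pushing it into a worker thread should keep the event loop free to drive the other downloads:

import asyncio

from custom_module import CustomAioClient


async def get_and_save(client: CustomAioClient, id: str, save_dir: str = "C:/"):
    # Hypothetical helper: fetch the DataFrame via the existing wrapper method...
    df = await client.get_latest_value(id=id)
    # ...then offload the blocking to_csv call to a worker thread so the event
    # loop is not stalled while the file is written to disk.
    await asyncio.to_thread(df.to_csv, f"{save_dir}{id}.csv")
    return df

Scheduling get_and_save(client=client, id=id) in main() instead of fetch() would then keep the download and the write inside one task per id. Is that the idiomatic way to handle the saving, or is there a better pattern?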