I am querying an API that lets you request up to n items in a single API call. So I am breaking up the list of items I am querying into "sublists" of n items each, passing each one to a function which returns the API data, and then concatenating the data into a DataFrame.
But when I loop through the "sublists", the final DataFrame only contains the last "sublist", rather than every "sublist". So instead of:
       netIncome  sharesOutstanding
BRK.B         20                 40
V             50                 60
MSFT          30                 10
ORCL          12                 24
AMZN          33                 55
GOOGL         66                 88
I get:
       netIncome  sharesOutstanding
AMZN          33                 55
GOOGL         66                 88
Here is the full code, so can someone tell me what I'm doing wrong?
import os
from iexfinance.stocks import Stock
import pandas as pd

# Set IEX Finance API Token (Public Sandbox Version)
os.environ['IEX_API_VERSION'] = 'iexcloud-sandbox'
os.environ['IEX_TOKEN'] = 'XXXXXX'

def fetch_company_info(group):
    """Function to query API data"""
    batch = Stock(group, output_format='pandas')

    # Get income from last 4 quarters, sum it, and store to temp Dataframe
    df_income = batch.get_income_statement(period="quarter", last='4')
    df_income = df_income.T.sum(level=0)
    income_ttm = df_income.loc[:, ['netIncome']]

    # Get number of shares, and store to temp Dataframe
    df_shares = batch.get_key_stats(period="quarter")
    shares_outstanding = df_shares.loc['sharesOutstanding']

    return income_ttm, shares_outstanding

# Full list to query via API
tickers = ['BRK.B', 'V', 'MSFT', 'ORCL', 'AMZN', 'GOOGL']

# Chunk ticker list into n# of lists
n = 2
batch_tickers = [tickers[i * n:(i + 1) * n] for i in range((len(tickers) + n - 1) // n)]

# Loop through each chunk of tickers
for group in batch_tickers:
    company_info = fetch_company_info(group)

output_df = pd.concat(company_info, axis=1, sort='true')
print(output_df)
Since company_info is overwritten in every loop and the previous result is not saved anywhere, your dataframe will only contain the results from the last loop. You should start with an empty DataFrame and then append each loop's results to it.

The tickers list is going to contain thousands of items, so I'd like to make it as performant as possible.

... company_info. How's that even possible?
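One way to keep every batch, sketched below under some assumptions: collect each batch's result in a plain Python list and call pd.concat once after the loop, rather than growing a DataFrame inside the loop. Repeated appends copy the accumulated data each time, and DataFrame.append was removed in pandas 2.0, so a single concat at the end tends to scale better for thousands of tickers. The fetch_company_info_stub function and FAKE_DATA dictionary are hypothetical stand-ins (no API call is made); the numbers are simply the values from the expected output in the question, and the real function is assumed to return a one-column DataFrame plus a Series indexed by ticker.

```python
import pandas as pd

# Stand-in values copied from the expected output in the question.
# (netIncome, sharesOutstanding) per ticker; not real API data.
FAKE_DATA = {
    "BRK.B": (20, 40),
    "V": (50, 60),
    "MSFT": (30, 10),
    "ORCL": (12, 24),
    "AMZN": (33, 55),
    "GOOGL": (66, 88),
}


def fetch_company_info_stub(group):
    """Hypothetical replacement for fetch_company_info (no network access)."""
    income_ttm = pd.DataFrame(
        {"netIncome": [FAKE_DATA[t][0] for t in group]}, index=group
    )
    shares_outstanding = pd.Series(
        [FAKE_DATA[t][1] for t in group], index=group, name="sharesOutstanding"
    )
    return income_ttm, shares_outstanding


tickers = ["BRK.B", "V", "MSFT", "ORCL", "AMZN", "GOOGL"]

# Split the ticker list into sublists of n tickers each.
n = 2
batch_tickers = [tickers[i:i + n] for i in range(0, len(tickers), n)]

frames = []
for group in batch_tickers:
    income_ttm, shares_outstanding = fetch_company_info_stub(group)
    # Join the two pieces for this batch side by side (one row per ticker).
    frames.append(pd.concat([income_ttm, shares_outstanding], axis=1))

# Stack all batches vertically with a single concat call.
output_df = pd.concat(frames)
print(output_df)
```

With the real API function swapped back in, the loop body stays the same; only the stub and the fake data go away.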