ValueError: cannot reindex from a duplicate axis Pandas

Question

I have a list of time series, each generated from a fund_id:

def get_adj_nav(self, fund_id):
    df_nav = read_frame(
        super(__class__, self).filter(fund__id=fund_id, nav__gt=0).exclude(fund__account_class=0).order_by(
            'valuation_period_end_date'), coerce_float=True,
        fieldnames=['income_payable', 'valuation_period_end_date', 'nav', 'outstanding_shares_par'],
        index_col='valuation_period_end_date')
    df_dvd, skip = self.get_dvd(fund_id=fund_id)
    df_nav_adj = calculate_adjusted_prices(
        df_nav.join(df_dvd).fillna(0).rename_axis({'payout_per_share': 'dividend'}, axis=1), column='nav')
    return df_nav_adj

def json_total_return_table(request, fund_account_id):
    ts_list = []
    for fund_id in Fund.objects.get_fund_series(fund_account_id=fund_account_id):
        if NAV.objects.filter(fund__id=fund_id, income_payable__lt=0).exists():
            ts = NAV.objects.get_adj_nav(fund_id)['adj_nav']
            ts.name = Fund.objects.get(id=fund_id).account_class_description
            ts_list.append(ts.copy())
            print(ts)
    df_adj_nav = pd.concat(ts_list, axis=1) # ====> Throws error
    cols_to_datetime(df_adj_nav, 'index')
    df_adj_nav = ffn.core.calc_stats(df_adj_nav.dropna()).to_csv(sep=',')

Here is an example of what one of the time series looks like:

valuation_period_end_date
2013-09-03    17.234000
2013-09-04    17.277000
2013-09-05    17.363000
2013-09-06    17.326900
2013-09-09    17.400800
2013-09-10    17.473000
2013-09-11    17.486800
2013-09-12    17.371600
....
Name: CLASS I, Length: 984, dtype: float64

Another timeseries:

valuation_period_end_date
2013-09-03    17.564700
2013-09-04    17.608500
2013-09-05    17.696100
2013-09-06    17.659300
2013-09-09    17.734700
2013-09-10    17.808300
2013-09-11    17.823100
2013-09-12    17.704900
....
Name: CLASS F, Length: 984, dtype: float64

The time series have different lengths, and I am wondering whether that is the reason for the error I am getting: cannot reindex from a duplicate axis. I am new to pandas, so any advice would be appreciated.

Thanks

EDIT: Also the indexes aren't supposed to be unique.
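A quick way to tell whether repeated dates, rather than differing lengths, are involved is to inspect each series' index before concatenating. The following is a minimal sketch on made-up data; the values and the repeated date are assumptions for illustration, not taken from the real NAV table.

```python
import pandas as pd

# Hypothetical series with one date repeated (2013-09-04 appears twice).
ts = pd.Series(
    [17.234, 17.277, 17.300, 17.363],
    index=pd.to_datetime(
        ["2013-09-03", "2013-09-04", "2013-09-04", "2013-09-05"]
    ),
    name="CLASS I",
)
ts.index.name = "valuation_period_end_date"

# True when at least one index label occurs more than once.
has_duplicates = not ts.index.is_unique

# keep=False marks every occurrence of a repeated label, not just the later ones.
duplicated_rows = ts[ts.index.duplicated(keep=False)]

print(has_duplicates)
print(duplicated_rows)
```

If `has_duplicates` is true for any series in `ts_list`, aligning the series side by side with `axis=1` is where pandas is likely to complain, since it cannot decide which of the repeated rows to line up.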

Alexander · Accepted Answer · 2017-08-24 21:59:31Z


Perhaps something like this would work. I've added the fund_id to the dataframe and reindexed it to the valuation_period_end_date and fund_id.

# Replaces the `ts = ...` line in the loop (four lines above the error).
ts = (
    NAV.objects.get_adj_nav(fund_id)['adj_nav']
    .to_frame()
    .assign(fund_id=fund_id)
    .reset_index()
    .set_index(['valuation_period_end_date', 'fund_id']))

Then concatenate with axis=0, group on the date and fund_id (assuming there is only one unique value per date and fund_id, you can take the first value), and unstack fund_id to pivot it into columns:

df_adj_nav = (
    pd.concat(ts_list, axis=0)
    .groupby(['valuation_period_end_date', 'fund_id'])
    .first()  # already a DataFrame here, so no to_frame() is needed
    .unstack('fund_id'))
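The two steps above can be tried end to end on made-up data. This is only a sketch: `make_ts` is a hypothetical stand-in for the Django query, the fund IDs and values are invented, and selecting the `adj_nav` column before unstacking is a small addition here that keeps the resulting column labels flat.

```python
import pandas as pd

def make_ts(values, dates, fund_id):
    """Hypothetical helper: build a frame indexed by (date, fund_id)."""
    s = pd.Series(values, index=pd.to_datetime(dates), name="adj_nav")
    s.index.name = "valuation_period_end_date"
    return (
        s.to_frame()
        .assign(fund_id=fund_id)
        .reset_index()
        .set_index(["valuation_period_end_date", "fund_id"])
    )

# Fund 80 has a repeated date; fund 81 covers a different set of dates.
ts_list = [
    make_ts([17.234, 17.277, 17.277],
            ["2013-09-03", "2013-09-04", "2013-09-04"], 80),
    make_ts([17.5647, 17.6961],
            ["2013-09-03", "2013-09-05"], 81),
]

df_adj_nav = (
    pd.concat(ts_list, axis=0)                      # stack rows, no alignment needed
    .groupby(level=["valuation_period_end_date", "fund_id"])
    .first()["adj_nav"]                             # one value per (date, fund_id)
    .unstack("fund_id")                             # pivot fund_id into columns
)
print(df_adj_nav)
```

Because the rows are stacked first and deduplicated by the groupby, the repeated date no longer has to be aligned across series; dates missing for a fund simply show up as NaN after the unstack.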

edited Aug 24, 2017 at 21:59

answered Aug 24, 2017 at 21:48

Alexander


8 Comments

anderish Over a year ago

Get an error: 'Series' object has no attribute 'assign'

Alexander Over a year ago

see edit above. Use to_frame() to convert to dataframe from series.

anderish Over a year ago

This seems to have worked, but now I just want to change the column names. Currently we have fund_id 80 81 82 83, but I want CLASS A CLASS C CLASS F SERIES 1. I tried something like this: ts.name = Fund.objects.get(id=fund_id).account_class_description but it didn't work.

anderish Over a year ago

The reason I want to do this is that I want to convert the table to CSV format and I get this error: TypeError: sequence item 1: expected str instance, tuple found. I assume this is because of the (fund_id 80 81 82 83) column labels.

Alexander Over a year ago

You can change the name when you do the to_frame command. to_frame("new name") should work.
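Another option, sketched here under assumptions, is to flatten the column labels after unstacking and then map fund IDs to class names. The tuple in the TypeError plausibly comes from the two-level column labels such as `('adj_nav', 80)` that unstacking a DataFrame produces. The `class_names` mapping and the sample values below are hypothetical; in the real code the names would come from `account_class_description`.

```python
import pandas as pd

# Hypothetical mapping from fund_id to class description.
class_names = {80: "CLASS A", 81: "CLASS C"}

# Stand-in for the unstacked result: columns are (column name, fund_id) tuples.
columns = pd.MultiIndex.from_product(
    [["adj_nav"], [80, 81]], names=[None, "fund_id"]
)
df_adj_nav = pd.DataFrame(
    [[17.234, 17.5647], [17.277, 17.6085]],
    index=pd.to_datetime(["2013-09-03", "2013-09-04"]),
    columns=columns,
)

# Keep only the fund_id level so each label is a plain value, then rename.
flat = df_adj_nav.copy()
flat.columns = flat.columns.get_level_values("fund_id")
flat = flat.rename(columns=class_names)

csv_text = flat.to_csv()
print(csv_text)
```

With string column labels instead of tuples, code that joins the labels into a header line should no longer hit the "expected str instance, tuple found" error.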
