I have a pandas DataFrame A that looks like this:

    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-04    46992.22
    2008-01-07    46491.28
    2008-01-08    45347.72
    2008-01-09    45681.68
    2008-01-10    46430.5

The date column is the index. I also have a NumPy array B of the same length whose elements are -1, 0 and 1. What is the cleanest way to split DataFrame A into 3 DataFrames so that rows with equal corresponding B elements are grouped together? For example, if B = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), then the DataFrame should be split into:

    X
    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-10    46430.5

    Y
    2008-01-04    46992.22
    2008-01-07    46491.28

    Z
    2008-01-08    45347.72
    2008-01-09    45681.68

1 Answer

This is easy with pandas groupby. You can keep the rows grouped, so the data is not duplicated, or you can assign each group to its own variable:

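    import io

    import numpy as np
    import pandas as pd

    data = """    2007-12-31    50230.62
        2008-01-02    48646.84
        2008-01-03    48748.04
        2008-01-04    46992.22
        2008-01-07    46491.28
        2008-01-08    45347.72
        2008-01-09    45681.68
        2008-01-10    46430.5"""

    # Whitespace-separated columns, no header row; the date column becomes the index.
    df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None,
                     index_col=0, parse_dates=True)
    B = np.array([0, 0, 0, 1, 1, -1, -1, 0])

    # Attach B as a column so each row carries its label.
    df['B'] = B

    # Group by the scalar column name, so the group keys are plain scalars.
    df_groups = df.groupby('B')

    # X, Y and Z as in the question: B == 0, B == 1 and B == -1.
    x = df_groups.get_group(0)
    y = df_groups.get_group(1)
    z = df_groups.get_group(-1)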

The keys 0, -1 and 1 passed to get_group are the group names, taken from the distinct values in B.
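
If you would rather not add B as a column, here is a minimal sketch of two alternatives: boolean masks on the array, or passing the array straight to groupby. It rebuilds the question's data by hand, and the column name "value" and the variable name parts are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question, with the dates as the index.
dates = pd.to_datetime([
    "2007-12-31", "2008-01-02", "2008-01-03", "2008-01-04",
    "2008-01-07", "2008-01-08", "2008-01-09", "2008-01-10",
])
values = [50230.62, 48646.84, 48748.04, 46992.22,
          46491.28, 45347.72, 45681.68, 46430.5]
A = pd.DataFrame({"value": values}, index=dates)  # "value" is a hypothetical column name
B = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Option 1: one boolean mask per label; B must have the same length as A.
X = A[B == 0]
Y = A[B == 1]
Z = A[B == -1]

# Option 2: group by the array itself (no extra column) and collect
# the groups into a dict keyed by the B value.
parts = {label: group for label, group in A.groupby(B)}
```

With the dict, parts[0], parts[1] and parts[-1] hold the same rows as X, Y and Z. The masks are simpler for three known labels, while the dict scales better if B can take more values.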
