I have a DataFrame, produced by a pivot table, with the following columns:

(best buy, count)                                       753  non-null values
(best buy, mean)                                        753  non-null values
(best buy, min)                                         753  non-null values
(best buy, max)                                         753  non-null values
(best buy, std)                                         750  non-null values
(amazon, count)                                         662  non-null values
(amazon, mean)                                          662  non-null values
(amazon, min)                                           662  non-null values
(amazon, max)                                           662  non-null values
(amazon, std)                                           661  non-null values

If I write this to a CSV file, I end up with something that looks like this (truncated):

        (best buy, count) (best buy, mean) (best buy, max)
laptop         5                 10               12
tv             10                23               34

and so on and so forth.

Is there a way to manipulate the DataFrame so that the resulting CSV looks like this instead?

             best buy          best buy         best buy
              count             mean             max
laptop         5                 10               12
tv             10                23               34

1 Answer

You can pass tupleize_cols=False to DataFrame.to_csv() so that each level of the column MultiIndex is written on its own header row:

In [59]: import numpy as np; from numpy.random import poisson; from pandas import DataFrame

In [60]: df = DataFrame(poisson(50, size=(10, 2)), columns=['laptop', 'tv'])

In [61]: df
Out[61]:
   laptop  tv
0      48  57
1      48  45
2      48  49
3      61  47
4      49  47
5      45  65
6      49  40
7      58  39
8      46  65
9      43  53

In [62]: df['store'] = np.random.choice(['best_buy', 'amazon'], len(df))

In [63]: df
Out[63]:
   laptop  tv     store
0      48  57  best_buy
1      48  45  best_buy
2      48  49  best_buy
3      61  47  best_buy
4      49  47    amazon
5      45  65    amazon
6      49  40    amazon
7      58  39  best_buy
8      46  65    amazon
9      43  53  best_buy

In [64]: res = df.groupby('store').agg(['mean', 'std', 'min', 'max']).T

In [65]: res
Out[65]:
store        amazon  best_buy
laptop mean  47.250    51.000
       std    2.062     6.928
       min   45.000    43.000
       max   49.000    61.000
tv     mean  54.250    48.333
       std   12.738     6.282
       min   40.000    39.000
       max   65.000    57.000

In [66]: u = res.unstack()

In [67]: u
Out[67]:
store   amazon                    best_buy
          mean     std  min  max      mean    std  min  max
laptop   47.25   2.062   45   49    51.000  6.928   43   61
tv       54.25  12.738   40   65    48.333  6.282   39   57

In [68]: u.to_csv('the_csv.csv', tupleize_cols=False, sep='\t')

In [69]: cat the_csv.csv
store   amazon  amazon  amazon  amazon  best_buy        best_buy        best_buy        best_buy
        mean    std     min     max     mean    std     min     max

laptop  47.25   2.0615528128088303      45.0    49.0    51.0    6.928203230275509       43.0    61.0
tv      54.25   12.737739202856996      40.0    65.0    48.333333333333336      6.282250127674532       39.0    57.0
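
A note for newer pandas releases: as far as I know, tupleize_cols was deprecated in 0.21 and removed in 1.0, and to_csv() now writes one header row per column level by default. Below is a minimal sketch under that assumption. The frame is retyped from the values shown in Out[63], io.StringIO is only a stand-in for a real file, and the read_csv round trip is an addition for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Same data as Out[63] above, typed in by hand so the result is reproducible
df = pd.DataFrame({
    "laptop": [48, 48, 48, 61, 49, 45, 49, 58, 46, 43],
    "tv": [57, 45, 49, 47, 47, 65, 40, 39, 65, 53],
    "store": ["best_buy"] * 4 + ["amazon"] * 3 + ["best_buy", "amazon", "best_buy"],
})

# Same reshaping as In [64] and In [66]: store on the outer column level,
# statistic on the inner one
u = df.groupby("store").agg(["mean", "std", "min", "max"]).T.unstack()

# No tupleize_cols argument: each column level gets its own header row
buf = io.StringIO()  # stand-in for a real file path
u.to_csv(buf, sep="\t")
csv_text = buf.getvalue()
print(csv_text)

# Reading it back: header=[0, 1] tells read_csv that two rows form the header
back = pd.read_csv(io.StringIO(csv_text), sep="\t", header=[0, 1], index_col=0)
print(back.columns.nlevels)
```

If the file is meant for a spreadsheet rather than for reading back into pandas, the two header rows are usually what you want; for a round trip, header=[0, 1] restores the column MultiIndex.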