
I have a DataFrame with more than 200 columns, and I want to run value_counts() on each column. The code below works, but when I write the result to a CSV file, only the value counts of the last column are saved. I want all of them.

import pandas as pd
df = pd.read_csv("hcp.csv")
for col in df:
    df2 = df[col].value_counts()
    print(df2)
df2.to_csv("new_hcp.csv")

print(df2) shows the value counts for every column, but the CSV does not. I would be grateful for any help.

  • You overwrite df2 on each iteration. That is why it only keeps the last value counts. Commented Sep 1, 2022 at 12:30
  • Only the last result is written because df2 is overwritten on each iteration of the loop, so you end up with just the final value. Create an empty DataFrame, append each column's counts to it inside the loop, then write that DataFrame out. Commented Sep 1, 2022 at 12:31

2 Answers


You can apply value_counts to every column with DataFrame.apply, then reshape the result into one long table with a row per column and value:

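import pandas as pd
df = pd.read_csv("hcp.csv")
# Count values per column, then reshape to one row per (column, value) pair.
df2 = (
    df.apply(pd.Series.value_counts)
    .unstack().to_frame()     # long format indexed by (column, value)
    .dropna().reset_index()   # drop values that never occur in a column
    .rename(columns={'level_0': 'col_name', 'level_1': 'value_name', 0: 'count'})
)
df2.to_csv("new_hcp.csv", index=False)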
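To see what shape this produces, here is a minimal sketch on a tiny in-memory DataFrame. The sample data and the `city` and `grade` column names are made up for illustration and stand in for hcp.csv. It is a slight variant of the chain above, not the same code: it names the index levels explicitly with rename_axis and casts the counts back to integers, since the intermediate NaN values turn them into floats.

```python
import pandas as pd

# Hypothetical sample data standing in for hcp.csv
df = pd.DataFrame({"city": ["NY", "LA", "NY"], "grade": ["a", "b", "b"]})

counts = (
    df.apply(pd.Series.value_counts)   # one column of counts per original column
    .unstack()                         # Series indexed by (column, value)
    .dropna()                          # drop values absent from a given column
    .astype(int)                       # NaN padding made the counts floats
    .rename_axis(["col_name", "value_name"])
    .reset_index(name="count")
)
print(counts)
```

Each row of `counts` is one (column, value) pair with its count, so a single `counts.to_csv("new_hcp.csv", index=False)` call writes every column's counts.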


You are overwriting df2 on each iteration. Create an empty list outside the loop, append each column's value_counts() result to it, then build a DataFrame from that list and write it out.

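import pandas as pd
df = pd.read_csv("hcp.csv")
value_counts_list = []
for col in df:
    # rename(col) labels each result with its source column name
    counts = df[col].value_counts().rename(col)
    print(counts)
    value_counts_list.append(counts)
# One row per original column, one CSV column per distinct value
pd.DataFrame(value_counts_list).to_csv("new_hcp.csv")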
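With 200+ columns, the wide layout above can get very sparse, because every distinct value in any column becomes its own CSV column. As an alternative, here is a rough sketch that keeps the per-column loop but stacks the results into a long table with pd.concat. The sample data and the `col_name`, `value_name` and `count` labels are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for hcp.csv
df = pd.DataFrame({"city": ["NY", "LA", "NY"], "grade": ["a", "b", "b"]})

# Collect one value_counts Series per column, keyed by column name
per_column = {col: df[col].value_counts() for col in df}

long_counts = (
    pd.concat(per_column)                      # dict keys become the outer index level
    .rename_axis(["col_name", "value_name"])   # name both index levels
    .reset_index(name="count")                 # back to ordinary columns
)
print(long_counts)

# To write the result: long_counts.to_csv("new_hcp.csv", index=False)
```

Because nothing is reassigned inside the loop, every column's counts survive to the final write.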
