I have a data set in Excel; a sample is given below. Each row contains a variable number of items, one item per column. The data has no headers.

a b a d

g z f d a

e

dd gg dd g f r t

I want to create a table like the one below: for each row, it should count how many times each item occurs in that row. I don't know a priori how many distinct items the data contains.

row#  a  b  d  g  z  f  e  dd  gg  r  t
   1  2  1  1  0  0  0  0   0   0  0  0
   2  1  0  1  1  1  1  0   0   0  0  0
   3  0  0  0  0  0  0  1   0   0  0  0
   4  0  0  0  1  0  1  0   2   1  1  1

I am not an expert in Python, and any assistance is very much appreciated.


1 Answer


Use get_dummies + sum:

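import pandas as pd

df = pd.read_csv(file, names=range(100)).stack()  # names=range(100) pads ragged rows (assumes at most 100 items per row)
df.str.get_dummies().groupby(level=0).sum()  # per-row totals; sum(level=0) was removed in pandas 2.0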

   a  b  d  dd  e  f  g  gg  r  t  z
0  2  1  1   0  0  0  0   0  0  0  0
1  1  0  1   0  0  1  1   0  0  0  1
2  0  0  0   0  1  0  0   0  0  0  0
3  0  0  0   2  0  1  1   1  1  1  0
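
Note that the columns above come out sorted alphabetically, while the table in the question lists items in the order they first appear. If that ordering matters, the same counts can be built by hand. The following is only a minimal sketch using the standard library's collections.Counter; the function name count_items_per_row and the in-memory rows list are made up for illustration and are not part of the answer above.

```python
from collections import Counter


def count_items_per_row(rows):
    """Return (columns, table): columns in first-seen order, one list of counts per row."""
    columns = []   # distinct items, in order of first appearance
    seen = set()   # fast membership test for items already in columns
    counters = []  # one Counter per input row
    for row in rows:
        # Skip empty cells (None or blank strings) left over from ragged rows.
        items = [str(x).strip() for x in row if x is not None and str(x).strip()]
        counters.append(Counter(items))
        for item in items:
            if item not in seen:
                seen.add(item)
                columns.append(item)
    # A Counter returns 0 for items it has never seen, so absent items need no special case.
    table = [[counter[col] for col in columns] for counter in counters]
    return columns, table


# Sample data from the question, typed in by hand (hypothetical stand-in for the Excel sheet).
rows = [
    ["a", "b", "a", "d"],
    ["g", "z", "f", "d", "a"],
    ["e"],
    ["dd", "gg", "dd", "g", "f", "r", "t"],
]

columns, table = count_items_per_row(rows)
print("row#", *columns)
for i, counts in enumerate(table, start=1):
    print(i, *counts)
```

The rows could be loaded from the spreadsheet in whatever way is convenient. If they come from pandas, missing cells (NaN) would need to be dropped before counting, which this sketch does not do.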

2 Comments

Thanks for the reply. Unfortunately, the code is giving me an error. The error is as follows "AttributeError: 'list' object has no attribute 'stack'".
@M.Nair I was missing a closing bracket. It should’ve thrown a syntax error. Check my edit.
