
I want to write code that outputs, for each distinct value in a, the number of times that value occurs. Then I want to put the result in a pandas DataFrame and print it. The sums code below does not work. How can I make it work and get the expected output?

import numpy as np
import pandas as pd 

a = np.array([12,12,12,3,43,43,43,22,1,3,3,43])
uniques = np.unique(a)
sums = np.sum(uniques[:-1]==a[:-1])

Expected Output:

Value    Repetition Count
1        1
3        3
12       3
22       1
43       4
  • Sort the column, then use groupby().count(). (Commented Sep 27, 2021 at 18:24)

3 Answers


Define a DataFrame df from the array a. Then use .groupby() + .size() to count how many times each unique value occurs:

a = np.array([12,12,12,3,43,43,43,22,1,3,3,43])
df = pd.DataFrame({'Value': a})

df.groupby('Value').size().reset_index(name='Repetition Count')

Result:

   Value  Repetition Count
0      1                 1
1      3                 3
2     12                 3
3     22                 1
4     43                 4
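
For comparison with the NumPy attempt in the question: uniques[:-1] and a[:-1] have different lengths (4 and 11), so they cannot be compared element by element. A minimal NumPy-only sketch, assuming it is acceptable to count first and build the DataFrame afterwards, uses the return_counts option of np.unique. The name df_counts is only illustrative:

```python
import numpy as np
import pandas as pd

a = np.array([12, 12, 12, 3, 43, 43, 43, 22, 1, 3, 3, 43])

# return_counts=True makes np.unique return the sorted distinct values
# together with the number of times each one occurs in a.
values, counts = np.unique(a, return_counts=True)

# df_counts is an illustrative name, not part of the original answer.
df_counts = pd.DataFrame({'Value': values, 'Repetition Count': counts})
print(df_counts.to_string(index=False))
```

This should give the same table as the groupby version, just without going through a DataFrame for the counting step.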

Edit

If you also want each count as a percentage of the total, you can use:

(df.groupby('Value', as_index=False)
   .agg(**{'Repetition Count': ('Value', 'size'), 
           'Percent': ('Value', lambda x: round(x.size/len(a) *100, 2))})
)

Result:

   Value  Repetition Count  Percent
0      1                 1     8.33
1      3                 3    25.00
2     12                 3    25.00
3     22                 1     8.33
4     43                 4    33.33

Or use .value_counts() with normalize=True, which returns only the percentages, ordered by frequency rather than by value:

pd.Series(a).value_counts(normalize=True).mul(100)

Result:

43    33.333333
12    25.000000
3     25.000000
22     8.333333
1      8.333333
dtype: float64
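
Since value_counts(normalize=True) drops the raw counts, here is a rough sketch of one way to get counts and percentages side by side with value_counts alone. The names s and summary are hypothetical, and rounding to two decimals is an assumption made to match the table above:

```python
import numpy as np
import pandas as pd

a = np.array([12, 12, 12, 3, 43, 43, 43, 22, 1, 3, 3, 43])
s = pd.Series(a)

# Sort both results by value so their rows line up.
counts = s.value_counts().sort_index()
percent = s.value_counts(normalize=True).sort_index().mul(100).round(2)

# summary is an illustrative name; the columns mirror the table above.
summary = pd.DataFrame({
    'Value': counts.index,
    'Repetition Count': counts.to_numpy(),
    'Percent': percent.to_numpy(),
})
print(summary)
```

Sorting both Series by index before combining them matters here, because value_counts() orders by frequency and ties may not come out in the same order in both calls.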

4 Comments

Besides the repetition count, can I also add a percentage? For example, the value 43 occurs 4 times out of 12, so it is 33.3%.
@georgehere OK, give me a moment.
Thanks, I appreciate it.
@georgehere See my edit with the percentages.

You can use groupby:

>>> pd.Series(a).groupby(a).count()
1     1
3     3
12    3
22    1
43    4
dtype: int64

Or value_counts():

>>> pd.Series(a).value_counts().sort_index()
1     1
3     3
12    3
22    1
43    4
dtype: int64
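
Both results above are Series with the values as the index. If you need the two-column layout from the question, one possible sketch is to name the index and reset it. The column names are taken from the expected output, and table is just an illustrative name:

```python
import numpy as np
import pandas as pd

a = np.array([12, 12, 12, 3, 43, 43, 43, 22, 1, 3, 3, 43])

counts = pd.Series(a).value_counts().sort_index()

# rename_axis names the index, and reset_index turns it into a regular
# column while giving the counts column the requested header.
table = (counts.rename_axis('Value')
               .reset_index(name='Repetition Count'))
print(table.to_string(index=False))
```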



The easiest way is to build a pandas DataFrame from the NumPy array and then use value_counts().

df = pd.DataFrame(data=a, columns=['col1'])

print(df.col1.value_counts())
43    4
12    3
3     3
22    1
1     1
