How to run a loop within a Pandas dataframe to append a column?

Question

I have a dataframe that is as follows:

    MID        POSITION
1   22596394   R8
2   22596394   R8
3   22596394   R8
4   22591549   R6
5   22591549   R6
6   22591549   R6

I also have a second dataframe, which should hold the output after running some code. It should look like this:

Position     Usage
R1             0  
R2             0 
R3             0
R4             0
R5             0
R6             1
R7             0 
R8             1
L1             0
L2             0
L3             0 
...           
L8             0

I would like to fill out the Usage column according to the logic below:

Wherever MID changes, note the corresponding POSITION and fill in the matching Usage row in the output dataframe. For example, in the dataframe above, the Usage rows for R8 and R6 should be filled with 1 and the Usage rows for all other positions with 0. Similarly, if MID changes twice for the same position, say R6, the R6 Usage row should be filled with 2, and so on. What would be the best way to do this? Thanks in advance!

I've updated the output dataframe. To make it clearer, let's say the MID changed 2 times while the position was still R6. Then the Usage row corresponding to R6 should be filled with 2, and so on. Thanks!
– Ruffy26, Commented Oct 20, 2016 at 8:13
Hmmm, but 'MID' does not change in R6 or in R8. It is the same value 3 times.
– jezrael, Commented Oct 20, 2016 at 8:19
Sorry, I wasn't able to make myself clear. Let's say instead that MID should be unique and the position is noted. For example, in the table above, the Usage of R6 and R8 is 1 because each has only one unique MID. Hope that makes it clear.
– Ruffy26, Commented Oct 20, 2016 at 8:21

jezrael · Accepted Answer · 2016-10-20 10:39:27Z

I think you need nunique and then reindex:

print (df1.groupby('POSITION')['MID'].nunique())
POSITION
R6    1
R8    1
Name: MID, dtype: int64

print (df1.groupby('POSITION')['MID']
          .nunique()
          .reindex(df2.set_index('Position').index, fill_value=0)
          .rename('Usage')
          .reset_index())
   Position  Usage
0        R1      0
1        R2      0
2        R3      0
3        R4      0
4        R5      0
5        R6      1
6        R7      0
7        R8      1
8        L1      0
9        L2      0
10       L3      0

Explanation:

To get the number of unique values per group, group by the POSITION column and then aggregate the MID column with nunique. This gives a new Series whose index holds R6 and R8. Next, the remaining values from the Position column of df2 need to be added. If those values are unique, one possible solution is to create an index from the Position column with set_index and then reindex the Series computed from df1 by the index of df2. Positions that are missing from df1 would become NaN, so they are replaced by 0 (parameter fill_value=0). Finally, turn the index into a column: first rename the Series with rename, then call reset_index to get a tidy DataFrame.
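The steps above can be reproduced end to end with a minimal sketch. The frames df1 and df2 are rebuilt by hand from the tables in the question, which is an assumption about how the real data is stored. The rows the question elides with "..." are left out rather than guessed, so df2 stops at L3 as in the printed output. The last line shows one possible alternative, using map, for filling an existing Usage column in place.

```python
import pandas as pd

# Input frame, rebuilt by hand from the table in the question.
df1 = pd.DataFrame(
    {
        "MID": [22596394] * 3 + [22591549] * 3,
        "POSITION": ["R8"] * 3 + ["R6"] * 3,
    },
    index=range(1, 7),
)

# Output frame: one row per position, Usage initialised to 0.
# Only the positions printed in the answer are listed here.
positions = [f"R{i}" for i in range(1, 9)] + ["L1", "L2", "L3"]
df2 = pd.DataFrame({"Position": positions, "Usage": 0})

# Step 1: number of distinct MID values per POSITION (index: R6, R8).
counts = df1.groupby("POSITION")["MID"].nunique()

# Step 2: align the counts to every position listed in df2.
# Positions that never occur in df1 get fill_value=0 instead of NaN.
usage = counts.reindex(df2.set_index("Position").index, fill_value=0)

# Step 3: name the values "Usage" and turn the index back into a column.
result = usage.rename("Usage").reset_index()
print(result)

# Alternative: fill the existing Usage column of df2 in place.
# map looks each Position up in the index of counts; misses become NaN.
df2["Usage"] = df2["Position"].map(counts).fillna(0).astype(int)
```

Note that Position is used for the index because reindex aligns on labels: the position names are what the two frames have in common, while Usage is the value being computed.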

edited Oct 20, 2016 at 10:39

answered Oct 20, 2016 at 8:25

6 Comments

Ruffy26 Over a year ago

Shouldn't it be Usage instead of Position in (df2.set_index('Position').index, fill_value=0) given that I want to fill the Usage column?

Ruffy26 Over a year ago

Works as usual. Thanks again!

jezrael Over a year ago

Glad I could help! Have a nice day!

juanpa.arrivillaga Over a year ago

I don't think you want nunique, but rather something more like (df.POSITION[1:][~(df.MID.shift(1) == df.MID)[1:]]), given your description. You want the corresponding POSITION when MID changes... at least that is what you described at first, but then you said something about uniqueness...

juanpa.arrivillaga Over a year ago

Rather, something to the effect of:

(df.POSITION[1:][~(df.MID.shift(1) == df.MID)[1:]]).value_counts().reindex(['R1','R2','R3','R4','R5','R6'], fill_value=0)

or use the clever index trick jezrael used in this answer.
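To make the difference between the two readings concrete, here is a small sketch of the change-counting variant from the comment above next to nunique. The frame is hypothetical data, not taken from the question: the MID at R6 alternates between two values, so it changes three times but has only two distinct values. The comparison is written with ne and iloc, which is assumed to be equivalent to the ~(... == ...) form in the comment.

```python
import pandas as pd

# Hypothetical data: at R6 the MID alternates 111, 222, 111, 222.
df = pd.DataFrame(
    {
        "MID": [111, 222, 111, 222, 333],
        "POSITION": ["R6", "R6", "R6", "R6", "R8"],
    }
)

# True wherever MID differs from the previous row. The first row has
# no predecessor, so it is dropped, as in the comment.
changed = df["MID"].ne(df["MID"].shift()).iloc[1:]

# Reading 1: count how often MID changes at each position.
change_counts = (
    df["POSITION"].iloc[1:][changed]
    .value_counts()
    .reindex(["R6", "R8"], fill_value=0)
)

# Reading 2: count distinct MID values at each position.
unique_counts = df.groupby("POSITION")["MID"].nunique()

print(change_counts["R6"], unique_counts["R6"])  # prints: 3 2
```

Which of the two is right depends on whether a returning MID should count again. The asker's later clarification about unique MIDs points to nunique.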
