Select specific columns

Question

I have a scientific dataframe:

     radius      date     spin  atom
0    12,50       YYYY/MM   0     he
1    11,23       YYYY/MM   2     c
2    45,2        YYYY/MM   1     z
3    11,1        YYYY/MM   1     p

For each row, I want to select all rows whose radius differs from that row's radius by less than some threshold, for example 5.

I've defined a function to do the calculation (kept simple, it's just an example):

def diff_radius(a, b):
    return a-b

Is it possible, for each row, to find the rows that satisfy the condition by calling an external function?

I tried the following, but it doesn't work:

for i in range(df.shape[0]):
     ....
     df_in_radius=df.apply(lambda x : diff_radius(df[i]['radius'],x['radius']))

Can you help me?

Could you clarify what the radius is compared with when you say the difference is under 5? — Frenchy
– Frenchy, Commented Mar 1, 2019 at 9:08
Sorry, I made a mistake: df_in_radius=df.apply(lambda x : diff_radius(df[i]['radius'],x['radius']) < 5). For each row, I want to build a dataframe (with the same columns) containing the rows whose radius differs from the radius of loc[i] by less than 5. — oxthon
– oxthon, Commented Mar 1, 2019 at 9:14
Sorry, I have a global dataframe. For each row (call it "i"), I want to select, from the same dataframe, the rows whose radius differs from the radius of "i" by less than 5. This is done in a loop, where "i" runs from 0 to the length of the dataframe. — oxthon
– oxthon, Commented Mar 1, 2019 at 9:28
Please edit your question to include all of your starting assumptions, because nobody understands it. For example, we have to guess that you have a global dataframe. — Frenchy
– Frenchy, Commented Mar 1, 2019 at 9:38

bumblebee · Accepted Answer · 2019-03-01 10:09:13Z

I am assuming that each value in the radius column is a tuple. You can define the diff_radius function like this:

def diff_radius(x):
    a, b = x
    return a-b

Then you can use the loc method in pandas to select the rows that match the condition of a radius difference less than 5.

df.loc[df.radius.apply(diff_radius) < 5]
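
As a minimal sketch of how that selection behaves, the snippet below builds a small dataframe whose radius values are (a, b) tuples, which is the assumption this answer makes. The data and the names mask and selected are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical data: each radius is an (a, b) tuple, as this answer assumes
df = pd.DataFrame({
    "radius": [(12, 50), (11, 23), (45, 2), (11, 1)],
    "atom": ["he", "c", "z", "p"],
})

def diff_radius(x):
    # unpack the tuple and return the difference of its two parts
    a, b = x
    return a - b

# apply() yields one difference per row; comparing with 5 gives a boolean mask
mask = df.radius.apply(diff_radius) < 5

# loc keeps only the rows where the mask is True
selected = df.loc[mask]
```

The mask is a boolean Series aligned with the dataframe's index, so it can be reused or combined with other conditions before it is passed to loc.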

Edit #1

If the radius column holds strings, split each value on the comma and cast the parts to integers. That logic goes in the diff_radius function. For strings:

def diff_radius(x):
    x_split = x.split(',')
    a,b = int(x_split[0]), int(x_split[-1])
    return a-b
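
If the comma is actually a decimal separator (so that "12,50" means 12.5, as the asker's later comment suggests), a different conversion may be more appropriate. The sketch below is one possible approach under that assumption, using hypothetical sample data copied from the question's table: replace the comma with a point and cast the column to float.

```python
import pandas as pd

# Hypothetical sample: radius stored as strings with a decimal comma
df = pd.DataFrame({
    "radius": ["12,50", "11,23", "45,2", "11,1"],
    "atom": ["he", "c", "z", "p"],
})

# Swap the decimal comma for a point, then convert the column to floats
df["radius"] = df["radius"].str.replace(",", ".", regex=False).astype(float)
```

When the data comes from a CSV file, pd.read_csv also accepts a decimal="," argument, which performs the same conversion while reading.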

edited Mar 1, 2019 at 10:09

answered Mar 1, 2019 at 9:24


2 Comments

oxthon Over a year ago

The radius column isn't a tuple. Each value is the radius of my element. I typed a comma instead of a decimal point, sorry.

bumblebee Over a year ago

@oxthon I have updated my answer for string. Please check

oxthon · Accepted Answer · 2019-03-01 12:21:25Z

I misspoke.

My dataframe is:

     radius of my atom      date     spin  atom
0    12.50                  YYYY/MM   0     he
1    11.23                  YYYY/MM   2     c
2    45.2                   YYYY/MM   1     z
3    11.1                   YYYY/MM   1     p

I run a loop that, for each row, applies a special calculation to all the rows that satisfy the condition. Example:

def diff_radius(current_radius, x):
    # difference between the current row's radius and x (a single value or a whole column)
    return current_radius - x

import pandas as pd

df=pd.read_csv(csvfile,delimiter=";",names=('radius','date','spin','atom'))
# for each row of the original dataframe
for i in range(df.shape[0]):

      # first build a new temporary dataframe with the rows
      # whose radius differs from df.iloc[i]['radius'] by less than 5 (current loop level)
      df_tmp=df[diff_radius(df.iloc[i]['radius'],df['radius']) <5]
      ....
      # start of the special calculation, using df_tmp, which contains all the rows
      # within 5 of the current row **(i)**

I thank you sincerely for your answers
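
For readers who want something runnable, here is a minimal sketch of the loop described above. It assumes that "difference less than 5" means the absolute difference, which the original code does not state. The helper rows_within and the dictionary neighbours are hypothetical names, and the data is copied from the table above instead of being read from a CSV file.

```python
import pandas as pd

# Data copied from the table above; 'date' keeps the placeholder shown there
df = pd.DataFrame({
    "radius": [12.50, 11.23, 45.2, 11.1],
    "date": ["YYYY/MM"] * 4,
    "spin": [0, 2, 1, 1],
    "atom": ["he", "c", "z", "p"],
})

def rows_within(frame, i, threshold=5):
    """Hypothetical helper: rows whose radius is within `threshold` of row i."""
    current = frame.iloc[i]["radius"]
    # Assumption: compare the absolute difference, so the sign does not matter
    mask = (frame["radius"] - current).abs() < threshold
    return frame[mask]

# One temporary dataframe per row, keyed by the row position
neighbours = {i: rows_within(df, i) for i in range(len(df))}
```

Each result includes row i itself, since its difference is zero. If that is not wanted, the row can be dropped from the result before the special calculation.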
