1

I'm looking for better ways to replace values inside a column with respect to certain rules.

My table looks like this:

data    NB
1Y  1Yf
3Y  3Yf
4Y  4Yf
1M  1Mf
3M  3Mf
1Y  1Yf
3Y  3Yf
5Y  4Yf

Here's my code. It works, but I'm looking for other ways to do it:

def test(ls):
    n=0
    while n<len(ls):
        if ls[n]=='1M':
            ls[n]=0.083
            n=n+1
        elif ls[n]=='3M':
            ls[n]=0.25
            n=n+1
        elif ls[n]=='1Y':
            ls[n]=1
            n=n+1
        elif ls[n]=='3Y':
            ls[n]=3
            n=n+1
        elif ls[n]=='4Y':
            ls[n]=4
            n=n+1
        else:
            ls[n]='error'
            n=n+1
test(df['data'])

1
  • 1
    Can you post your table directly here so it is easier to look at? Commented Aug 19, 2019 at 18:33

4 Answers

3

Using map

df['data'] = df['data'].map({'1M': 0.083, '3M': 0.25, '1Y': 1, '3Y': 3, '4Y': 4}).fillna('error')

Using np.select

df['data'] = np.select([df.data.eq('1M'), df.data.eq('3M'), df.data.eq('1Y'), df.data.eq('3Y'), df.data.eq('4Y')],
                       [0.083, 0.25, 1, 3, 4],
                        default='error')
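
One caveat with the np.select call above: mixing numeric choices with a string default forces NumPy to look for a common dtype, so the result cannot stay numeric. If you would rather keep a float column, a minimal sketch (assuming NaN is an acceptable marker for unmatched rows, and using an illustrative DataFrame built from the question's data) could look like this:

```python
import numpy as np
import pandas as pd

# Illustrative data copied from the question's 'data' column.
df = pd.DataFrame({'data': ['1Y', '3Y', '4Y', '1M', '3M', '1Y', '3Y', '5Y']})

labels = ['1M', '3M', '1Y', '3Y', '4Y']
values = [0.083, 0.25, 1, 3, 4]

# One boolean condition per label, in the same order as `values`.
conditions = [df['data'].eq(label) for label in labels]

# default=np.nan keeps the result a float array; unmatched rows become NaN.
years = np.select(conditions, values, default=np.nan)
```

You can then find the rows that did not match with np.isnan(years) instead of comparing against the string 'error'.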

A more general approach is to let NumPy's timedelta64 annualize the values for you, so no lookup table is needed:

df['data'].map(lambda x: np.timedelta64(int(x[:-1]), x[-1]) / np.timedelta64(1, 'Y'))

Demonstration of how it works:

>>> np.timedelta64('3', 'M')/np.timedelta64('1', 'Y')
0.25

>>> np.timedelta64('1', 'M')/np.timedelta64('1', 'Y')
0.08333333333333333
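
To make the timedelta idea easy to try, here is a self-contained sketch. The helper name tenor_to_years and the sample DataFrame are mine, not from the question, and I am assuming every value is a whole number followed by a single unit letter (M or Y):

```python
import numpy as np
import pandas as pd

def tenor_to_years(tenor):
    """Convert a string such as '3M' or '1Y' to a fraction of a year."""
    count, unit = int(tenor[:-1]), tenor[-1]
    # Dividing two timedelta64 values gives a plain float ratio.
    return float(np.timedelta64(count, unit) / np.timedelta64(1, 'Y'))

# Illustrative data copied from the question's 'data' column.
df = pd.DataFrame({'data': ['1Y', '3Y', '4Y', '1M', '3M', '1Y', '3Y', '5Y']})

years = df['data'].map(tenor_to_years)
```

Note that this version has no 'error' branch: '5Y' simply becomes 5.0, and a value with a unit NumPy does not recognize raises an exception. If you need the cap from the question, you have to add it yourself.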

1

You have two options here, one explicit, and one more general. The first option is using map to explicitly define your relationships and then filling null values with your else clause.

>>> d = {'1M': 0.083, '3M': 0.25, '1Y': 1, '3Y': 3, '4Y': 4 }
>>> df['data'].map(d).fillna('error')
0        1
1        3
2        4
3    0.083
4     0.25
5        1
6        3
7    error
Name: data, dtype: object

However, it seems like you have a fairly well-defined rule here: if the letter in the value is Y, you want the number preceding it, and if the letter is M, you want that number divided by 12.

You can generalize this rule so that no explicit dictionary is needed.


i = df['data'].str.extract(r'(\d+)')[0].astype(int)
j = df['data'].str.endswith('Y')
k = df['data'].str.endswith('M')

conditions = [
    (i < 5) & j,
    (i < 5) & k
]

pd.Series(np.select(conditions, [i, i/12], 'error'))

0                      1
1                      3
2                      4
3    0.08333333333333333
4                   0.25
5                      1
6                      3
7                  error
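
If you want the result to stay numeric, a variation on the same rule is sketched below. It assumes NaN is acceptable in place of 'error', keeps the "less than 5" cap from the code above, and uses an illustrative DataFrame built from the question's data:

```python
import pandas as pd

# Illustrative data copied from the question's 'data' column.
df = pd.DataFrame({'data': ['1Y', '3Y', '4Y', '1M', '3M', '1Y', '3Y', '5Y']})

# Split each value into its number and its unit letter;
# rows that do not match the pattern become NaN in both columns.
parts = df['data'].str.extract(r'^(\d+)([MY])$')
count = parts[0].astype(float)

# Years divide by 1, months divide by 12.
divisor = parts[1].map({'Y': 1, 'M': 12}).astype(float)

# Keep only counts below 5; everything else becomes NaN.
years = (count / divisor).where(count < 5)
```

Here the last row ('5Y') ends up as NaN, and the column keeps a float dtype, so you can still do arithmetic on it.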


0

You can directly put a condition on columns and use that to replace values.

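d = {'1M': 0.083, '3M': 0.25, '1Y': 1, '3Y': 3, '4Y': 4}
# Python 3: use items() instead of iteritems(); .loc avoids chained assignment
for key, value in d.items():
    df.loc[df['data'] == key, 'data'] = value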


0

Have a look at the replace method. You can chain several replace calls, one per rule. Pass inplace=True if you want the change made in place rather than returned as a new object. Its signature is:


df.replace(
        to_replace=None,
        value=None,
        inplace=False,
        limit=None,
        regex=False, 
        method='pad',
        axis=None)
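
As a concrete illustration, replace also accepts a dictionary, which avoids chaining one call per rule. This is only a sketch: the mapping comes from the question, while the DataFrame and the variable names are made up for the example:

```python
import pandas as pd

# Illustrative data copied from the question's 'data' column.
df = pd.DataFrame({'data': ['1Y', '3Y', '4Y', '1M', '3M', '1Y', '3Y', '5Y']})

mapping = {'1M': 0.083, '3M': 0.25, '1Y': 1, '3Y': 3, '4Y': 4}

# replace() leaves values that are not in the mapping untouched.
replaced = df['data'].replace(mapping)

# Optionally flag everything that was not in the mapping.
flagged = replaced.where(df['data'].isin(list(mapping)), 'error')
```

Unlike map, replace keeps unmatched values such as '5Y' as they are, so the extra where step is only needed if you want the 'error' marker from the question.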

