0

I have a fairly straightforward problem, and there must be a simple way to solve it. Consider the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame()
start = pd.Timestamp('2013-08-14T00:00')
end = pd.Timestamp('2013-08-15T00:00')
# one timestamp per second over a full day (86,401 points, both ends included)
t = np.linspace(start.value, end.value, 60*60*24+1)
df['Timestamp'] = pd.to_datetime(t)

Now I want to create a boolean column df['Action'] that is True on every 5th second and False otherwise. As the outcome, I expect something like this:

              Timestamp  Action
0   2013-08-14 00:00:00   False
1   2013-08-14 00:00:01   False
2   2013-08-14 00:00:02   False
3   2013-08-14 00:00:03   False
4   2013-08-14 00:00:04   False
5   2013-08-14 00:00:05    True
6   2013-08-14 00:00:06   False
7   2013-08-14 00:00:07   False
8   2013-08-14 00:00:08   False
9   2013-08-14 00:00:09   False
10  2013-08-14 00:00:10    True
11  2013-08-14 00:00:11   False

Yes, I could play with the index, but that doesn't seem very elegant. I also want to be able to adjust the interval for different inputs.

I hope I managed to be succinct and precise. I would really appreciate your help on this one!

1
  • So it is always one second intervals between consecutive rows? Commented Nov 27, 2019 at 13:29

2 Answers

2

Use Series.dt.second and check whether the remainder of the division by 5 is zero. This is faster than the pd.date_range approach; see the time comparison below:

df['Action']=(df['Timestamp'].dt.second % 5).eq(0)
print(df.head(21))

Output

             Timestamp  Action
0  2013-08-14 00:00:00    True
1  2013-08-14 00:00:01   False
2  2013-08-14 00:00:02   False
3  2013-08-14 00:00:03   False
4  2013-08-14 00:00:04   False
5  2013-08-14 00:00:05    True
6  2013-08-14 00:00:06   False
7  2013-08-14 00:00:07   False
8  2013-08-14 00:00:08   False
9  2013-08-14 00:00:09   False
10 2013-08-14 00:00:10    True
11 2013-08-14 00:00:11   False
12 2013-08-14 00:00:12   False
13 2013-08-14 00:00:13   False
14 2013-08-14 00:00:14   False
15 2013-08-14 00:00:15    True
16 2013-08-14 00:00:16   False
17 2013-08-14 00:00:17   False
18 2013-08-14 00:00:18   False
19 2013-08-14 00:00:19   False
20 2013-08-14 00:00:20    True

If you want to set the first value to False:

df.at[0,'Action']=False
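
The modulo check above works because 5 divides 60 evenly; for an interval such as 7s or 90s, the seconds field alone is not enough. A minimal sketch of one way to make the interval adjustable is to compare each timestamp with itself floored to the interval. The names mark_interval and skip_first are hypothetical, and the grid produced by dt.floor is aligned to the Unix epoch, not to the first row, so this is only an illustration under those assumptions:

```python
import pandas as pd


def mark_interval(timestamps: pd.Series, freq: str = "5s",
                  skip_first: bool = True) -> pd.Series:
    """Return True where a timestamp lies exactly on a multiple of `freq`.

    Hypothetical helper: the grid is aligned to the Unix epoch,
    not to the first timestamp in the Series.
    """
    # A timestamp is on the grid if flooring it to `freq` leaves it unchanged.
    on_grid = timestamps.eq(timestamps.dt.floor(freq))
    if skip_first and len(on_grid) > 0:
        # Mirror the question's expected output, where row 0 is False.
        on_grid.iloc[0] = False
    return on_grid


# Small stand-in for the question's dataframe: 21 rows, one per second.
df = pd.DataFrame(
    {"Timestamp": pd.date_range("2013-08-14", periods=21, freq="s")}
)
df["Action"] = mark_interval(df["Timestamp"], freq="5s")
print(df[df["Action"]])
```

Changing freq to another pandas offset string (for example "10s" or "1min") adjusts the interval without touching the index.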

Time comparison

%%timeit
df['Action']=(df['Timestamp'].dt.second%5).eq(0)
10.8 ms ± 99.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
dt_range = pd.date_range(df['Timestamp'].iloc[0], 
                         df['Timestamp'].iloc[-1], 
                         freq='5s')
df['Action'] = df['Timestamp'].isin(dt_range)
23.9 ms ± 7.41 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

2

You can use pd.date_range to create a list of all values you want to map to True:

dt_range = pd.date_range(df['Timestamp'].iloc[0], 
                         df['Timestamp'].iloc[-1], 
                         freq='5s')
df['Action'] = df['Timestamp'].isin(dt_range)
print(df.head(12))
             Timestamp  Action
0  2013-08-14 00:00:00    True
1  2013-08-14 00:00:01   False
2  2013-08-14 00:00:02   False
3  2013-08-14 00:00:03   False
4  2013-08-14 00:00:04   False
5  2013-08-14 00:00:05    True
6  2013-08-14 00:00:06   False
7  2013-08-14 00:00:07   False
8  2013-08-14 00:00:08   False
9  2013-08-14 00:00:09   False
10 2013-08-14 00:00:10    True
11 2013-08-14 00:00:11   False
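
To make the interval adjustable and leave the first row False, as in the question's expected output, the same idea can be wrapped in a small function. This is only a sketch: mark_with_date_range is a hypothetical name, and it assumes the column is sorted ascending and that the timestamps fall exactly on the generated grid.

```python
import pandas as pd


def mark_with_date_range(df: pd.DataFrame, column: str = "Timestamp",
                         freq: str = "5s") -> pd.Series:
    """Flag rows whose timestamp is a whole number of `freq` steps after the first row."""
    ts = df[column]
    # Grid anchored at the first timestamp and ending at the last one.
    grid = pd.date_range(ts.iloc[0], ts.iloc[-1], freq=freq)
    # Drop the first grid point so the starting row itself is not flagged.
    return ts.isin(grid[1:])


# Small stand-in for the question's dataframe: 12 rows, one per second.
demo = pd.DataFrame(
    {"Timestamp": pd.date_range("2013-08-14", periods=12, freq="s")}
)
demo["Action"] = mark_with_date_range(demo, freq="5s")
print(demo)
```

Unlike a check on the seconds field, the grid here starts at the first row, so the interval does not need to divide 60.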
