Python Pandas replacing part of a string

Question

I'm trying to filter data stored in a .csv file that contains time and angle values and save the filtered data to an output .csv file. The filtering works, but the time is recorded in hh:mm:ss:ms format (for example 12:55:34:500) and I want to change it to hhmmss (125534), in other words remove the colons and the millisecond part. I tried using the .replace function, but I keep getting KeyError: 'time'.

Input data:

time,angle
12:45:55,56
12:45:56,89
12:45:57,112
12:45:58,189
12:45:59,122
12:46:00,123

Code:

import pandas as pd

#define min and max angle values
alpha_min = 110
alpha_max = 125

#read input .csv file
data = pd.read_csv('test_csv3.csv', index_col=0)

#filter by angle size
data = data[(data['angle'] < alpha_max) & (data['angle'] > alpha_min)]

#replace ":" with "" in time values
data['time'] = data['time'].replace(':','')

#display results
print data

#write results
data.to_csv('test_csv3_output.csv')

NYC Coder · Accepted Answer · 2020-05-21 13:22:42Z


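That's because time is the index, not a column: index_col=0 turns the first column of the file into the index, so data['time'] raises a KeyError. Remove index_col=0 when reading the file: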

data = pd.read_csv('test_csv3.csv')

Then replace the line that calls .replace with:

data['time'] = pd.to_datetime(data['time']).dt.strftime('%H%M%S')

Output:

     time  angle
2  124557    112
4  124559    122
5  124600    123
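
A minimal sketch of the cause and the fix, assuming Python 3 and a reasonably recent pandas. The sample rows are the ones from the question, read from an in-memory string instead of test_csv3.csv. The explicit format argument and the column names time_dt and time_str are illustrative additions, not part of the answer. The second option is a string-only alternative that may suit the hh:mm:ss:ms values mentioned in the question, since it never parses the time at all.

```python
import io
import pandas as pd

# Sample data copied from the question, held in memory instead of a file.
CSV_TEXT = """time,angle
12:45:55,56
12:45:56,89
12:45:57,112
12:45:58,189
12:45:59,122
12:46:00,123
"""

# With index_col=0 the first column becomes the index, so 'time' is no
# longer a column and data['time'] would raise KeyError.
indexed = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)
print(list(indexed.columns))  # 'time' is missing from this list

# Without index_col, 'time' stays an ordinary column.
data = pd.read_csv(io.StringIO(CSV_TEXT))
data = data[(data['angle'] > 110) & (data['angle'] < 125)].copy()

# Option 1 (the approach from the answer): parse, then format as hhmmss.
# The explicit format string is an assumption about the input layout.
data['time_dt'] = pd.to_datetime(data['time'], format='%H:%M:%S').dt.strftime('%H%M%S')

# Option 2 (string-only): drop the colons and keep the first six characters,
# which also discards a trailing millisecond part such as ':500'.
data['time_str'] = data['time'].str.replace(':', '', regex=False).str[:6]

print(data)
```

Note that Series.replace(':', '') only replaces values that are exactly ':'; substring replacement goes through the .str accessor, as in option 2.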

edited May 21, 2020 at 13:22

answered May 21, 2020 at 13:15


3 Comments

user2882635 Over a year ago

The index_col=0 was used to remove the extra first column that appears in the results (it also appears in your answer). I tried your corrected code and still get the same error. Perhaps I should add that I'm using Python 2.7 because of some compatibility issues with OpenCV.

user2882635 Over a year ago

Correction: your code works (almost; now I need to find a way to remove the added column in the output). I checked print(data.keys()) as suggested in Daniel B's answer and found a mistake in the input file that I had not noticed before. I was checking the .csv files in Notepad++, which was not showing me the latest version of the file, which is very strange.

user2882635 Over a year ago

Found the answer: data.to_csv('test_csv3_output.csv', index=False), from stackoverflow.com/questions/26786960/…

Daniel B · Accepted Answer · 2020-05-21 13:19:41Z


What do print(data.keys()) and print(data.head()) show? It looks like there may be a stray character before or after the time column name. This happens from time to time, depending on how the CSV was created versus how it was read (see this question).

If this is not part of a bigger project and you just want the data, a quick workaround is to look up the column name by position: timeKeyString = list(data.columns.values)[0] (assuming time is the first column).
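
A small sketch of the stray-character check, assuming Python 3. The header with extra spaces is a made-up example, not the asker's actual file. Printing the labels as a list shows their exact contents, and stripping whitespace from them is one possible cleanup.

```python
import io
import pandas as pd

# Hypothetical input: the header has stray spaces around the first name.
raw = " time ,angle\n12:45:57,112\n"
data = pd.read_csv(io.StringIO(raw))

# Inspect the exact labels; the list form makes stray characters visible.
print(data.columns.tolist())

# Strip surrounding whitespace from every column label.
data.columns = data.columns.str.strip()

# The column can now be selected by its expected name.
print(data['time'])
```

If the stray character turns out to be a byte-order mark at the start of the file, passing encoding='utf-8-sig' to read_csv is a common remedy.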

answered May 21, 2020 at 13:19


2 Comments

user2882635 Over a year ago

I checked print (data.keys()) or print(data.head()) and the tips from the answer you linked and I don't have any stray characters. The code from NYC Coder works for me, but now I need to find a way to remove the added 0 column in the output file.

Daniel B Over a year ago

DataFrames can't exist without an index. You can, however, pass an argument to the export function to leave the index out of the output file: print(data.to_csv(sep='\t', index=False))
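
To illustrate the effect of index=False, here is a short sketch assuming Python 3. The small DataFrame is built by hand to resemble the filtered result, and to_csv is called without a path so that it returns the CSV text instead of writing a file.

```python
import pandas as pd

# Hand-built frame resembling the filtered result (index labels 2 and 4).
data = pd.DataFrame(
    {'time': ['124557', '124559'], 'angle': [112, 122]},
    index=[2, 4],
)

# Default: the index is written as an unnamed first column.
with_index = data.to_csv()

# index=False: only the real columns are written.
without_index = data.to_csv(index=False)

print(with_index)
print(without_index)
```

Passing a file name as the first argument, as in the question, writes the same text to disk.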
