I have a text file containing a column header line followed by data rows. I am trying to load this file into a pandas DataFrame.
File:
#Columns: TargetDoc|GRank|LRank|Priority|Loc ID
aaaaa|1|1|Slow|8gkahinka.01
aaaaa|1|0|Slow|7nlafnjbaflnbja.01
I wrote the code below. First I split each line into a list, then I try to convert that list into a DataFrame:
import os
import pandas as pd

with open("DocID101_201604070523.txt") as raw_file:
    full_file_text = raw_file.readlines()
raw_file.close()

data_list = list()
for l in full_file_text:
    if l.startswith('#'):
        labels = l.strip().replace('#Columns: ','').split('|')
    else:
        data_list += l.strip().split('|')

df = pd.DataFrame.from_records(data_list,columns=labels)
But I get an error when creating df:
AssertionError: 5 columns passed, passed data had 10 columns.
What is wrong with my code, and is there a better way to convert this file to a DataFrame?
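The error most likely comes from `data_list += ...`. Using `+=` on a list extends it with the individual fields, so the two data rows collapse into one flat list of 10 strings instead of two rows of 5 fields each. Below is a minimal sketch of the fix, using `append` so that each row stays its own list. The sample text is inlined through `io.StringIO` purely as an assumption to keep the example self-contained; with the real file you would iterate over the opened file instead.

```python
import io

import pandas as pd

# Inline copy of the sample from the question, standing in for the real file.
sample = io.StringIO(
    "#Columns: TargetDoc|GRank|LRank|Priority|Loc ID\n"
    "aaaaa|1|1|Slow|8gkahinka.01\n"
    "aaaaa|1|0|Slow|7nlafnjbaflnbja.01\n"
)

labels = []
data_list = []
for line in sample:
    line = line.strip()
    if line.startswith('#'):
        # Header line: drop the '#Columns: ' prefix and split into column names.
        labels = line.replace('#Columns: ', '').split('|')
    elif line:
        # append keeps each row as one list; += would flatten the fields.
        data_list.append(line.split('|'))

df = pd.DataFrame.from_records(data_list, columns=labels)
print(df)
```

Note that every value built this way is a string, so numeric columns such as GRank would still need converting if you want to do arithmetic on them.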
Why not use pd.read_csv('file.txt', sep='|')? You would then only need to deal with the '#Columns: ' prefix in the header.
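Since the file is pipe-delimited, `pd.read_csv` with `sep='|'` can parse it directly. The only wrinkle is that the first column name comes through as `#Columns: TargetDoc`, so it has to be renamed afterwards. A rough sketch of that approach, again with the sample inlined via `io.StringIO` as an assumption in place of the real file path:

```python
import io

import pandas as pd

# Inline copy of the sample from the question, standing in for the real file.
sample = io.StringIO(
    "#Columns: TargetDoc|GRank|LRank|Priority|Loc ID\n"
    "aaaaa|1|1|Slow|8gkahinka.01\n"
    "aaaaa|1|0|Slow|7nlafnjbaflnbja.01\n"
)

# The first line is used as the header, prefix included.
df = pd.read_csv(sample, sep='|')

# Strip the '#Columns: ' prefix from the first column name only.
first = df.columns[0]
df = df.rename(columns={first: first.replace('#Columns: ', '')})
print(df)
```

Compared with splitting the lines by hand, this lets pandas infer the column types, so GRank and LRank come back as integers rather than strings.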