I want to read a text file that contains test results in a single column (one test case per line) and convert it to a CSV file with multiple columns, where each column is named after the person who took the test and holds that person's results.
The column headers in the CSV file will be: "Matt Test, Mark Test, John Test, Mike Test"
Under each person's column, the results should be listed from slowest to fastest time. For example, "Matt Test" will have 3 rows of trl_matt_test and then 6 rows of get_trl_time, "Mark Test" will have 2 rows of trl_mark_test and 3 rows of get_trl_time, and so on. Each run produces a different number of results, so I can't hard-code the number of rows.
testdata.txt (this is the text file data that I am reading):
trl_matt_test: 15s
trl_matt_test: 10s
trl_matt_test: 12s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_mark_test: 13s
trl_mark_test: 20s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_john_test: 20s
trl_john_test: 25s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_mike_test: 2s
get_trl_time: 1s
get_trl_time: 1s
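Each line above has the form "name: Ns", so one way to pull out the test name and the time is a small regular expression. This is only a sketch: `parse_line` is a hypothetical helper, and it assumes every time is a whole number of seconds followed by `s`.

```python
import re

# Assumed line format: "<test name>: <whole seconds>s", e.g. "trl_matt_test: 15s"
LINE_RE = re.compile(r"^\s*(\w+)\s*:\s*(\d+)s\s*$")


def parse_line(line):
    """Return (test_name, seconds) for one line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


print(parse_line("trl_matt_test: 15s"))  # ('trl_matt_test', 15)
print(parse_line("get_trl_time: 1s"))    # ('get_trl_time', 1)
```

Converting the time to an integer matters for the ordering: sorting the raw strings would place "2s" after "13s".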
# I want to use pandas and a DataFrame if possible
import pandas as pd
# These are the headers I want to use for the columns in the CSV
header_list = ['Matt Test', 'Mark Test', 'John Test', 'Mike Test']
# I want to use a substring of each test name as a marker for where to split
# (lowercase, to match the names as they appear in the file)
delimiter_list = ['mark', 'john', 'mike']
# I want to put the row number where the delimiter is to know how many rows
# of data each person has
delimiter_row_nums = []
# the idea is that I would know Matt's tests are in rows 0-15, Mark's tests are in
# rows 16-20, etc. (these numbers are just an example), so I can then create a list
# of Matt's data [0:15], a list of Mark's data [16:20], etc.
# read the file with pandas; the file has no header row, so read every line
# into a single column named "raw" (header expects row numbers, not names)
data_file = pd.read_csv("testdata.txt", header=None, names=["raw"])
# for each element in the delimiter list
for delim in delimiter_list:
    # for each row or line in the file, together with its row number
    for row_num, row in enumerate(data_file["raw"]):
        # if the delimiter is a substring of the row (ignoring case)
        if delim.lower() in row.lower():
            # remember the first row where this person's tests begin
            delimiter_row_nums.append(row_num)
            break
# still to do: slice the rows at delimiter_row_nums, sort each slice, and
# place each one under its respective header
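The goal described above (split the rows per person, sort each group from slowest to fastest, write one column per person) could be sketched as follows. This is an illustration under assumptions, not a definitive solution: it assumes a person's block starts at a `trl_<name>_test` line and that the `get_trl_time` lines that follow belong to that same person. It uses the standard `csv` module rather than pandas, and `split_by_person`, `seconds` and `write_columns` are hypothetical helper names.

```python
import csv
import io
import re
from itertools import zip_longest

# Assumption: "trl_<name>_test" starts a person's block; the "get_trl_time"
# lines after it belong to that person until the next "trl_<name>_test".
OWNER_RE = re.compile(r"^trl_(\w+)_test\b")
SECONDS_RE = re.compile(r":\s*(\d+)s\s*$")


def seconds(row):
    """Time in whole seconds for one row; rows without a time count as 0."""
    match = SECONDS_RE.search(row)
    return int(match.group(1)) if match else 0


def split_by_person(lines):
    """Group lines into {"Matt Test": [...], ...}, slowest result first."""
    columns = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        owner = OWNER_RE.match(line)
        if owner:
            current = owner.group(1).capitalize() + " Test"
            columns.setdefault(current, [])
        if current is None:
            continue  # a line before any person's block has no owner
        columns[current].append(line)
    for rows in columns.values():
        # the sort is stable, so rows with equal times keep their file order
        rows.sort(key=seconds, reverse=True)
    return columns


def write_columns(columns, out):
    """Write one CSV column per person, padding short columns with blanks."""
    writer = csv.writer(out)
    writer.writerow(list(columns))
    writer.writerows(zip_longest(*columns.values(), fillvalue=""))


sample = ["trl_matt_test: 10s", "trl_matt_test: 15s", "get_trl_time: 1s",
          "trl_mark_test: 13s", "get_trl_time: 1s"]
buffer = io.StringIO()
write_columns(split_by_person(sample), buffer)
print(buffer.getvalue())
```

For the real file, the lines could come from `open("testdata.txt")` and the output could go to a file opened with `newline=""`. Because the headers are derived from the test names, no row counts are hard-coded. If pandas is preferred, the same dictionary can be turned into a DataFrame with `pd.DataFrame({name: pd.Series(rows) for name, rows in columns.items()})`, which pads the shorter columns with NaN, and then saved with `to_csv` using `index=False`.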