I want to read a text file that contains test results in a single column (one test result per line) and convert it to a CSV file with multiple columns, where each column is headed by the name of the person who took the test and holds that person's results.

The column headers in the CSV file will be: "Matt Test, Mark Test, John Test, Mike Test"

Under each person's column, the results should be listed from slowest to fastest time. For example, "Matt Test" will have 3 rows of trl_matt_test and 6 rows of get_trl_time, "Mark Test" will have 2 rows of trl_mark_test and 3 rows of get_trl_time, and so on. Each run produces a different number of results, so I can't hard-code the number of rows.

testdata.txt (this is the text file data that I am reading):

trl_matt_test: 15s
trl_matt_test: 10s
trl_matt_test: 12s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_mark_test: 13s
trl_mark_test: 20s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_john_test: 20s
trl_john_test: 25s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
get_trl_time: 1s
trl_mike_test: 2s
get_trl_time: 1s
get_trl_time: 1s

# I want to use pandas and data frame if possible
import pandas as pd

# These are the headers I want to use for the columns in the CSV
header_list = ['Matt Test', 'Mark Test', 'John Test', 'Mike Test']

# I want to use a lowercase substring of the test name as a delimiter marking where to split
delimiter_list = ['mark', 'john', 'mike']

# I want to store the row number where each delimiter first appears, to know
# how many rows of data each person has
delimiter_row_nums = []

# the idea is that if I know Matt's tests are in rows 0-15 and Mark's tests are
# in rows 16-20, etc. (these numbers are just an example), I can create a list
# of Matt's data [0:15], then a list of Mark's data [16:20], etc.

# read the file with pandas as a single-column DataFrame; the file has no header row
data_file = pd.read_csv("testdata.txt", header=None, names=["Results"])

# for each element in the delimiter list
for delim in delimiter_list:
    # for each row number and line in the file
    for row_num, row in enumerate(data_file["Results"]):
        # if the delimiter is a substring of the line, record the row where
        # that person's data starts and move on to the next delimiter
        if delim.lower() in row:
            delimiter_row_nums.append(row_num)
            break

# take the new lists and sort them, then place them under their respective headers
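
The row-range idea described in the comments above can also be sketched without tracking row numbers at all: walk the lines once and switch the "current person" whenever a trl_<name>_test line appears. This is only a minimal sketch using the standard library. The function name group_by_person and the assumption that every person's block starts with a trl_<name>_test line are mine, inferred from the sample data.

```python
import re

# Matches lines such as "trl_matt_test: 15s" and captures the person's name.
# The "trl_<name>_test" naming pattern is an assumption taken from the sample data.
NAME_LINE = re.compile(r"^trl_(\w+?)_test:")


def group_by_person(lines):
    """Group result lines under the most recent trl_<name>_test owner."""
    groups = {}     # e.g. {"Matt Test": ["trl_matt_test: 15s", ...]}
    current = None  # header of the person whose block we are inside
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = NAME_LINE.match(line)
        if match:
            current = f"{match.group(1).capitalize()} Test"
        if current is None:
            continue  # a result line before any named test has no owner yet
        groups.setdefault(current, []).append(line)
    return groups


# Shortened sample in the same format as testdata.txt.
sample = [
    "trl_matt_test: 15s",
    "trl_matt_test: 10s",
    "get_trl_time: 1s",
    "trl_mark_test: 13s",
    "get_trl_time: 1s",
    "get_trl_time: 1s",
]
groups = group_by_person(sample)
for header, rows in groups.items():
    print(header, len(rows))
```

Because the groups are built from whatever lines show up, the number of rows per person does not need to be known in advance.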

1 Answer

It's not terribly clear what you are looking for, but this may get you started. I created a txt file with the data you provided.

import pandas as pd

# the file has no header row, so read every line as data
df = pd.read_csv('testdata.txt', header=None, names=['Results'])

# map the tester to the data
dd = df.Results.str.split('_', n=1).str[1].str.split(':').str[0]
cmap = {'matt_test': 'Matt Test', 'mark_test': 'Mark Test', 'john_test': 'John Test', 'mike_test': 'Mike Test'}
df['Tester'] = dd.map(cmap).ffill() # not sure here if you want forward or back fill

# re-orient the data
df_pivot = df.pivot(columns=['Tester'])

                       Results
Tester           John Test           Mark Test           Matt Test          Mike Test
0                      NaN                 NaN  trl_matt_test: 15s                NaN
1                      NaN                 NaN  trl_matt_test: 10s                NaN
2                      NaN                 NaN  trl_matt_test: 12s                NaN
3                      NaN                 NaN    get_trl_time: 1s                NaN
4                      NaN                 NaN    get_trl_time: 1s                NaN
5                      NaN                 NaN    get_trl_time: 1s                NaN
6                      NaN                 NaN    get_trl_time: 1s                NaN
7                      NaN                 NaN    get_trl_time: 1s                NaN
8                      NaN                 NaN    get_trl_time: 1s                NaN
9                      NaN  trl_mark_test: 13s                 NaN                NaN
10                     NaN  trl_mark_test: 20s                 NaN                NaN
11                     NaN    get_trl_time: 1s                 NaN                NaN
12                     NaN    get_trl_time: 1s                 NaN                NaN
13                     NaN    get_trl_time: 1s                 NaN                NaN
14      trl_john_test: 20s                 NaN                 NaN                NaN
15      trl_john_test: 25s                 NaN                 NaN                NaN
16        get_trl_time: 1s                 NaN                 NaN                NaN
17        get_trl_time: 1s                 NaN                 NaN                NaN
18        get_trl_time: 1s                 NaN                 NaN                NaN
19        get_trl_time: 1s                 NaN                 NaN                NaN
20        get_trl_time: 1s                 NaN                 NaN                NaN
21        get_trl_time: 1s                 NaN                 NaN                NaN
22        get_trl_time: 1s                 NaN                 NaN                NaN
23        get_trl_time: 1s                 NaN                 NaN                NaN
24        get_trl_time: 1s                 NaN                 NaN                NaN
25        get_trl_time: 1s                 NaN                 NaN                NaN
26                     NaN                 NaN                 NaN  trl_mike_test: 2s
27                     NaN                 NaN                 NaN   get_trl_time: 1s
28                     NaN                 NaN                 NaN   get_trl_time: 1s

# do a count
df_pivot.count()

         Tester
Results  John Test    12
         Mark Test     5
         Matt Test     9
         Mike Test     3
dtype: int64
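
If the goal is the CSV described in the question (one compact column per person, slowest time first), something along these lines might work. It is a sketch under assumptions, not a definitive implementation: it assumes every line looks like "name: Ns" with whole seconds, that each person's block starts with a trl_<name>_test line, and it derives the headers from the test names instead of using a fixed mapping. The sample is a shortened inline copy of the data so the block runs on its own.

```python
import io

import pandas as pd

# Shortened copy of the question's data, inlined so the sketch runs on its own.
SAMPLE = """\
trl_matt_test: 15s
trl_matt_test: 10s
trl_matt_test: 12s
get_trl_time: 1s
get_trl_time: 1s
trl_mark_test: 13s
trl_mark_test: 20s
get_trl_time: 1s
trl_mike_test: 2s
get_trl_time: 1s
"""

# The data has no header row, so every line is read as data.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, names=["Results"])

# Split "name: Ns" into the test name and the time in whole seconds.
parts = df["Results"].str.extract(r"^(?P<test>\w+):\s*(?P<seconds>\d+)s$")
df["Seconds"] = parts["seconds"].astype(int)

# Only trl_<name>_test lines name a person; forward fill carries that name
# down over the get_trl_time lines that follow.
owner = parts["test"].str.extract(r"^trl_(\w+)_test$")[0].ffill()
df["Tester"] = owner.str.capitalize() + " Test"

# One compact column per tester, slowest first, in order of first appearance.
columns = {
    tester: group.sort_values("Seconds", ascending=False, kind="stable")["Results"]
    .reset_index(drop=True)
    for tester, group in df.groupby("Tester", sort=False)
}
wide = pd.DataFrame(columns)

# Shorter columns are padded with NaN, which to_csv writes as empty cells.
csv_text = wide.to_csv(index=False)
print(csv_text)
```

To write a file instead of a string, pass a path, for example wide.to_csv("results.csv", index=False), where results.csv is just a placeholder name. Reading the real data would mean replacing io.StringIO(SAMPLE) with 'testdata.txt'.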
1 Comment

This helps a lot, I'm going to look at this more in detail and will get back to you. Thanks for the help!
