I have a file with many (over 1000) columns and rows, and the row and column names do not follow any pattern. An example is below:
file1.txt
IDs AABC ABC6 YHG.8 D78Ha
Ellie 12 48.70 33
Kate 98 34 21 76.36
Joe 22 53 49
Van 77 40 12.1
Xavier 88.85
First, I have to fill the blanks with NA, so that it looks like this:
file1.txt
IDs AABC ABC6 YHG.8 D78Ha
Ellie 12 NA 48.70 33
Kate 98 34 21 76.36
Joe 22 53 49 NA
Van 77 NA 40 12.1
Xavier NA NA NA 88.85
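A minimal sketch of this fill step, assuming the real file is tab-separated (as `sep='\t'` in my attempt implies) so that a blank is an empty field between two tabs. The names `raw`, `table` and `filled` are only illustrative, and the inline string stands in for file1.txt:

```python
import io
import pandas as pd

# Hypothetical stand-in for file1.txt: tab-separated, blanks are empty fields.
raw = (
    "IDs\tAABC\tABC6\tYHG.8\tD78Ha\n"
    "Ellie\t12\t\t48.70\t33\n"
    "Xavier\t\t\t\t88.85\n"
)

# dtype=str keeps "48.70" exactly as written; empty fields still become NaN.
table = pd.read_csv(io.StringIO(raw), sep="\t", dtype=str)

# na_rep sets the text written for missing values.
filled = table.to_csv(sep=" ", index=False, na_rep="NA")
print(filled)
```

If the file is not actually tab-separated, the blanks cannot be located this way and the separator would need to be adjusted.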
Then I want every combination of an ID with each of the other columns (AABC, ABC6, YHG.8 and D78Ha), such as:
Ellie, AABC --> 12
Ellie, ABC6 --> NA
Ellie, YHG.8 --> 48.70 (without rounding)
Ellie, D78Ha --> 33
Kate, AABC --> 98
Kate, ABC6 --> 34
...
So the desired output should have 20 lines (4 columns x 5 IDs), as follows:
output.txt
Ellie AABC 12
Ellie ABC6 NA
Ellie YHG.8 48.70
Ellie D78Ha 33
Kate AABC 98
Kate ABC6 34
..
To do this, I filled the blanks with NA by hand, read the file with pandas, and set IDs as the index,
so that I can look up a value by its ID and column name.
But I could not work out how to iterate over it. My attempt was:
import pandas as pd
tablefile = pd.read_csv('file1.txt',sep='\t')
print(tablefile)
df2=tablefile.set_index("IDs")
print("Ellie AABC " , df2.loc["Ellie", "AABC" ])
print("Kate AABC " , df2.loc["Kate", "AABC" ])
print("Xavier AABC " , df2.loc["Xavier", "AABC" ])
It prints:
('Ellie AABC ', 12.0)
('Kate AABC ', 98.0)
('Xavier AABC ', nan)
How can I fill the blanks with NA and iterate over this table without writing out each name one by one? Maybe by incrementing i in [i, i]?
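One possible direction, sketched here under the assumption that the file is tab-separated with empty fields for the blanks: reshape the table from wide to long with `DataFrame.melt` instead of looping over names. The names `sample`, `table`, `long_form` and `text` are illustrative, and the inline string stands in for file1.txt:

```python
import io
import pandas as pd

# Hypothetical stand-in for file1.txt: tab-separated, blanks are empty fields.
sample = (
    "IDs\tAABC\tABC6\tYHG.8\tD78Ha\n"
    "Ellie\t12\t\t48.70\t33\n"
    "Kate\t98\t34\t21\t76.36\n"
    "Joe\t22\t53\t49\t\n"
    "Van\t77\t\t40\t12.1\n"
    "Xavier\t\t\t\t88.85\n"
)

# dtype=str avoids float conversion, so "48.70" is not turned into 48.7.
table = pd.read_csv(io.StringIO(sample), sep="\t", dtype=str)

# melt gives one row per (ID, column) pair; ignore_index=False keeps the
# original row number so a stable sort can restore ID-by-ID order.
long_form = (
    table.melt(id_vars="IDs", var_name="column", value_name="value",
               ignore_index=False)
         .sort_index(kind="mergesort")
         .reset_index(drop=True)
)

# Missing values are written as NA; no header or index in the output.
text = long_form.to_csv(sep=" ", header=False, index=False, na_rep="NA")
print(text)
```

Passing a file name such as `'output.txt'` as the first argument of `to_csv` would write the result to disk instead of returning it as a string.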