
I have two data frames with a similar shape to:

import pandas as pd

df1 = pd.DataFrame([[3.2, 5.8, 46], [3.5, 4.4, 50], [5.4, 6.7, 40]], index=['sample1', 'sample2', 'sample3'], columns=['L1', 'L2', 'L3'])


          L1   L2  L3
sample1  3.2  5.8  46
sample2  3.5  4.4  50
sample3  5.4  6.7  40


df2 = pd.DataFrame([[0.02,0.03,0.04,0.05,0.06],[0.2, 0.3, 0.4, 0.5, 0.7],[2, 3, 4, 5, 7]])


      0     1     2     3     4
0  0.02  0.03  0.04  0.05  0.06
1  0.20  0.30  0.40  0.50  0.70
2  2.00  3.00  4.00  5.00  7.00


I would like to multiply the first row of df2 by the L1 value for sample1 (3.2) in df1, the second row of df2 by the L2 value for sample1 (5.8), and the third row of df2 by the L3 value for sample1 (46). I then need to repeat this for sample2 (row 1 by the L1 value for sample2, row 2 by the L2 value, and row 3 by the L3 value), and so on for every sample (my actual dataset has hundreds of samples). The output should be a new DataFrame, either one per sample or a single one covering all samples. How should I set up the code for this?

2 Answers


Please check the following code

import pandas as pd

# Loop over samples and columns, collecting the scaled rows in a list.
# DataFrame.append was removed in pandas 2.0, so build the frame once at the end.
rows = []
for sample in df1.index:
    for ind, column in enumerate(df1.columns):
        # Row `ind` of df2 scaled by this sample's value in `column`
        rows.append(df2.iloc[ind] * df1.loc[sample, column])
new_df = pd.DataFrame(rows).reset_index(drop=True)


1 Comment

If this is the answer you are looking for, I can provide more details if needed.

Something like this,

sample_lists = {}
for sample, df1_row in df1.iterrows():
    print(f'\nPROCESSING SAMPLE {sample}')
    sample_list = []
    # enumerate gives the position directly; list.index(value) would return
    # the first match and pick the wrong df2 row when a value repeats
    for index_number, value in enumerate(df1_row.tolist()):
        df2_row = df2.iloc[index_number, :].tolist()
        print(f'Multiplying {df2_row} with {value}')
        sample_list.append([v * value for v in df2_row])
    sample_lists[sample] = sample_list
print(f'\nFINAL OUTPUT: {sample_lists}')

Feel free to remove the print statements. You can then use this dict to create a dataframe.
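As a rough sketch of that last step, the dict of nested lists can be turned into one DataFrame per sample, or into a single combined frame. The names `per_sample` and `combined` and the index level names `sample` and `factor` are my own choices for illustration, not part of the answer above; the dict is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

df1 = pd.DataFrame([[3.2, 5.8, 46], [3.5, 4.4, 50], [5.4, 6.7, 40]],
                   index=['sample1', 'sample2', 'sample3'],
                   columns=['L1', 'L2', 'L3'])
df2 = pd.DataFrame([[0.02, 0.03, 0.04, 0.05, 0.06],
                    [0.2, 0.3, 0.4, 0.5, 0.7],
                    [2, 3, 4, 5, 7]])

# Rebuild the dict of nested lists in the same shape the loop above produces
sample_lists = {
    sample: [[v * value for v in df2.iloc[i].tolist()]
             for i, value in enumerate(row.tolist())]
    for sample, row in df1.iterrows()
}

# One DataFrame per sample; rows are labelled by the df1 column that scaled them
per_sample = {
    sample: pd.DataFrame(rows, index=df1.columns, columns=df2.columns)
    for sample, rows in sample_lists.items()
}

# Or a single frame with a (sample, factor) MultiIndex on the rows
# (the level names are illustrative assumptions)
combined = pd.concat(per_sample, names=['sample', 'factor'])
print(combined)
```

With the combined frame, `combined.loc['sample2']` should give back the three scaled rows for that sample.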

Explanation:

  • Loop over the rows of df1
  • Take the current row of df1 and convert it to a list
  • For each value in that list, get its position in the list. The position identifies the matching row in df2, which is used in the next step.
  • Get the row of df2 at that position
  • Multiply that row by the value and append the result to a list
  • Store the list in a dict keyed by the index label of the df1 row (sample1, sample2, etc.)

Pretty certain you can use lambda and apply to simplify the code above.
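I have not worked out the lambda/apply version, but here is one possible loop-free sketch that uses NumPy broadcasting instead, which is a different technique from the one suggested above. The names `scaled` and `result` and the index level names `sample` and `factor` are assumptions chosen for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[3.2, 5.8, 46], [3.5, 4.4, 50], [5.4, 6.7, 40]],
                   index=['sample1', 'sample2', 'sample3'],
                   columns=['L1', 'L2', 'L3'])
df2 = pd.DataFrame([[0.02, 0.03, 0.04, 0.05, 0.06],
                    [0.2, 0.3, 0.4, 0.5, 0.7],
                    [2, 3, 4, 5, 7]])

# (n_samples, n_factors, 1) * (1, n_factors, n_cols)
# broadcasts to (n_samples, n_factors, n_cols)
scaled = df1.to_numpy()[:, :, None] * df2.to_numpy()[None, :, :]

# Flatten the first two axes into a (sample, factor) MultiIndex;
# from_product orders samples outermost, matching the C-order reshape
index = pd.MultiIndex.from_product([df1.index, df1.columns],
                                   names=['sample', 'factor'])
result = pd.DataFrame(scaled.reshape(-1, df2.shape[1]),
                      index=index, columns=df2.columns)
print(result)
```

This assumes df1 has exactly as many columns as df2 has rows, as in the question. With hundreds of samples it avoids the Python-level loops, at the cost of holding the whole 3-D array in memory at once.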

2 Comments

where does samples_lists come from?
Oops sorry. Thanks for catching that.
