2

I have a dataframe with different sections (only 2 sections and speeds here, but a circuit can be up to 8 sections and 6 measured speeds) like so:

section speed Data1 Data2
A 10 1.5 2.5
A 20 1.0 2.0
B 10 2.5 3.5
B 20 2.0 3.0
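For reproducibility, the sample frame above can be built as follows. This is a minimal sketch; the variable name `df` is an assumption chosen to match what the answers below use.

```python
import pandas as pd

# Sample data from the question: two sections, two measured speeds each.
df = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "speed": [10, 20, 10, 20],
    "Data1": [1.5, 1.0, 2.5, 2.0],
    "Data2": [2.5, 2.0, 3.5, 3.0],
})
print(df)
```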

I would like to sum my data columns over all possible circuits, i.e. every combination of one speed per section:

A B Data1 Data2
10 10 4.0 6.0
10 20 3.5 5.5
20 10 3.5 5.5
20 20 3.0 5.0

How would I do this? I can make the combinations, but I'm not sure how to sum the data columns over them.

4
  • Do you have only A and B? If more would you want all combinations? Commented Jul 19, 2022 at 15:01
  • Yes, that's where it would get difficult. You can have sections A-H and up to 6 different speeds for each. 2**2 is simple, 6**8 is not trivial :P Commented Jul 19, 2022 at 15:03
  • OK, I think I see Commented Jul 19, 2022 at 15:07
  • Are you looking for an answer on how to convert table A into table B and then calculate the sums, or just on how to calculate the sums from table B? Also, please confirm whether the answer should consider all 8 possible sections when calculating Data1 and Data2. Commented Jul 19, 2022 at 15:22

3 Answers

3

What about using itertools.product, then summing per group:

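from itertools import product

import pandas as pd

# Move (section, speed) into the columns; Data1/Data2 become the rows.
df2 = df.set_index(['section', 'speed']).T

# Group the columns by section, pick one (section, speed) column per group
# for every combination, and sum the selected columns row-wise.
out = (pd.concat({k: df2[list(k)].sum(axis=1)
                  for k in product(*(d for _,d in df2.groupby(axis=1, level=0)))})
         .unstack(level=-1)  # turn Data1/Data2 back into columns
      )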

output:

                 Data1  Data2
(A, 10) (B, 10)    4.0    6.0
        (B, 20)    3.5    5.5
(A, 20) (B, 10)    3.5    5.5
        (B, 20)    3.0    5.0

For the exact provided format:

df2 = df.set_index(['section', 'speed']).T

# The section labels (A, B, ...) become the names of the index levels.
sections = df2.columns.get_level_values('section').unique()

# Key each combination by its speeds only, e.g. (10, 20).
out = (pd.concat({tuple(x[1] for x in k):
                  df2[list(k)].sum(axis=1)
                  for k in product(*(d for _,d in df2.groupby(axis=1, level=0)))
                 })
         .unstack(level=-1)
         .rename_axis(sections).reset_index()
      )

output:

    A   B  Data1  Data2
0  10  10    4.0    6.0
1  10  20    3.5    5.5
2  20  10    3.5    5.5
3  20  20    3.0    5.0
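`DataFrame.groupby(axis=1)` has been deprecated since pandas 2.1, so the snippets above may warn or fail on newer releases. Here is a sketch of the same idea that builds the per-section column groups with a plain list comprehension instead. The sample `df` and the name `col_groups` are assumptions for illustration, not part of the original answer.

```python
from itertools import product

import pandas as pd

# Sample data from the question (assumed to be named df).
df = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "speed": [10, 20, 10, 20],
    "Data1": [1.5, 1.0, 2.5, 2.0],
    "Data2": [2.5, 2.0, 3.5, 3.0],
})

# (section, speed) pairs become the columns; Data1/Data2 become the rows.
df2 = df.set_index(["section", "speed"]).T
sections = df2.columns.get_level_values("section").unique()

# One list of (section, speed) column labels per section, without groupby(axis=1).
col_groups = [[c for c in df2.columns if c[0] == s] for s in sections]

out = (
    pd.concat({
        tuple(c[1] for c in combo): df2[list(combo)].sum(axis=1)
        for combo in product(*col_groups)
    })
    .unstack(level=-1)          # Data1/Data2 back to columns
    .rename_axis(list(sections))  # name the index levels after the sections
    .reset_index()
)
print(out)
```

Unlike `groupby`, which sorts the group keys, this keeps the sections in order of first appearance; for the sample data the two orders coincide.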

2 Comments

Can't wrap my head around either answer, and both work, but the first multi-index output is slightly more useful for my application. Thanks!
Hi @mozway, can you help here stackoverflow.com/questions/73039690/…
3

One approach:

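from itertools import product

import pandas as pd

# One list of rows (as Series) per section.
groups = [[row for _, row in v.iterrows()] for _, v in df.groupby("section")]
rows = []
for p in product(*groups):  # p holds one row from each section
    row = {}
    for e in p:
        d = e.to_dict()
        # Store the chosen speed under the section's name, e.g. row["A"] = 10.
        row[d.pop("section")] = d.pop("speed")
        # Accumulate the remaining data columns.
        for k, v in d.items():
            row[k] = row.get(k, 0) + v
    rows.append(row)

res = pd.DataFrame(rows)
print(res)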

Output

    A  Data1  Data2   B
0  10    4.0    6.0  10
1  10    3.5    5.5  20
2  20    3.5    5.5  10
3  20    3.0    5.0  20

Or, more Pythonic, with the row-building logic moved into a helper function:

def build_row(prod):
    """Merge one row per section into a single summed record."""
    row = {}
    for e in prod:
        d = e.to_dict()
        row[d.pop("section")] = d.pop("speed")
        for k, v in d.items():
            row[k] = row.get(k, 0) + v
    return row


groups = [[row for _, row in v.iterrows()] for _, v in df.groupby("section")]
res = pd.DataFrame([build_row(p) for p in product(*groups)])
print(res)

Note that if you want exact output, just reorder the columns.
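A minimal sketch of that reordering, assuming `res` looks like the output printed above and that the section labels are `A` and `B` (in general they could be taken from `df["section"].unique()`):

```python
import pandas as pd

# Result frame as printed above (column order A, Data1, Data2, B).
res = pd.DataFrame({
    "A": [10, 10, 20, 20],
    "Data1": [4.0, 3.5, 3.5, 3.0],
    "Data2": [6.0, 5.5, 5.5, 5.0],
    "B": [10, 20, 10, 20],
})

section_cols = ["A", "B"]  # assumed section labels
data_cols = [c for c in res.columns if c not in section_cols]

# Section columns first, then the summed data columns.
res = res[section_cols + data_cols]
print(res)
```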


0

Use the pandasql package:

df1.sql("""
    select a,b,tb1.data1+tb2.data1 as Data1,tb1.data2+tb2.data2 as Data2
        from
        (select speed as A,data1,data2 from self where section = 'A') tb1
        join
        (select speed as B,data1,data2 from self where section = 'B') tb2
""")

out:

    A   B  Data1  Data2
0  10  10    4.0    6.0
1  10  20    3.5    5.5
2  20  10    3.5    5.5
3  20  20    3.0    5.0
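The join in the query has no `ON` condition, so it pairs every row of one subquery with every row of the other, which is a cross join. The same idea can be sketched in plain pandas with `merge(how="cross")` (available since pandas 1.2) and extended to any number of sections with `functools.reduce`. The sample `df` and the names `value_cols`, `parts`, and `wide` are assumptions for illustration, and section labels are assumed not to clash with the data column names.

```python
from functools import reduce

import pandas as pd

# Sample data from the question (assumed to be named df).
df = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "speed": [10, 20, 10, 20],
    "Data1": [1.5, 1.0, 2.5, 2.0],
    "Data2": [2.5, 2.0, 3.5, 3.0],
})

value_cols = ["Data1", "Data2"]
sections = list(df["section"].unique())

# One frame per section: "speed" is renamed to the section label and the
# data columns get a per-section suffix so all column names stay unique.
parts = [
    df[df["section"] == s]
    .drop(columns="section")
    .rename(columns={"speed": s, **{c: f"{c}_{s}" for c in value_cols}})
    for s in sections
]

# Cross join all sections: one row per combination of speeds.
wide = reduce(lambda left, right: left.merge(right, how="cross"), parts)

# Sum each data column across the sections.
out = wide[sections].copy()
for c in value_cols:
    out[c] = wide[[f"{c}_{s}" for s in sections]].sum(axis=1)
print(out)
```

The intermediate frame grows quickly: with 8 sections and 6 speeds each it has 6**8 = 1,679,616 rows, so memory use is worth checking before running this on a full circuit.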
