I have a pandas DataFrame in which column B contains NumPy arrays of fixed size.
|------|---------------|-------|
| A | B | C |
|------|---------------|-------|
| 0 | [2,3,5,6] | X |
|------|---------------|-------|
| 1 | [1,2,3,4] | X |
|------|---------------|-------|
| 2 | [2,3,6,5] | Y |
|------|---------------|-------|
| 3 | [2,3,2,3] | Y |
|------|---------------|-------|
| 4 | [2,3,4,4] | Y |
|------|---------------|-------|
| 5 | [2,3,5,6] | Z |
|------|---------------|-------|
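For reproducibility, the input above can be built roughly as follows. This is a minimal sketch that assumes each entry of B is an `np.ndarray` of length 4 and that the frame is called `df` (the name is only for illustration).

```python
import numpy as np
import pandas as pd

# Each cell of B holds one fixed-size NumPy array, so the column has object dtype.
b_values = [
    [2, 3, 5, 6],
    [1, 2, 3, 4],
    [2, 3, 6, 5],
    [2, 3, 2, 3],
    [2, 3, 4, 4],
    [2, 3, 5, 6],
]
df = pd.DataFrame({
    "A": [0, 1, 2, 3, 4, 5],
    "B": [np.array(v) for v in b_values],
    "C": ["X", "X", "Y", "Y", "Y", "Z"],
})

print(df)
print(df.dtypes)
```

Because B is stored with object dtype, pandas does not treat it as a numeric column.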
I want to group the rows by column 'C' and compute the element-wise mean of the arrays in 'B', keeping the result as one list per group, as in the table below. I want to do this efficiently.
|----------------|-------|
| B | C |
|----------------|-------|
| [1.5,2.5,4,5] | X |
|----------------|-------|
| [2,3,4,4] | Y |
|----------------|-------|
| [2,3,5,6] | Z |
|----------------|-------|
I have considered splitting the NumPy arrays into individual columns, but I would prefer to keep that as a last resort.
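To make the fallback concrete, splitting into columns might look like the sketch below. The sample frame `df` and the names `wide` and `means` are assumptions introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data matching the table above (B holds fixed-size NumPy arrays).
df = pd.DataFrame({
    "A": [0, 1, 2, 3, 4, 5],
    "B": [np.array(v) for v in ([2, 3, 5, 6], [1, 2, 3, 4], [2, 3, 6, 5],
                                [2, 3, 2, 3], [2, 3, 4, 4], [2, 3, 5, 6])],
    "C": ["X", "X", "Y", "Y", "Y", "Z"],
})

# Expand B into one numeric column per array position, keeping the row index.
wide = pd.DataFrame(df["B"].tolist(), index=df.index)

# Group the numeric columns by C and take the ordinary column-wise mean.
means = wide.groupby(df["C"]).mean()

# Pack each row of means back into a single array-valued column.
result = pd.DataFrame({"C": means.index.tolist(), "B": list(means.to_numpy())})
print(result)
```

This keeps the aggregation itself on plain numeric columns, at the cost of the extra expand and repack steps.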
How can I write a custom aggregate function for this? Right now column B is treated as non-numeric, and the aggregation fails with:
DataError: No numeric types to aggregate
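A minimal sketch of what such a custom aggregate might look like, assuming every array within a group has the same length. The helper name `mean_of_arrays` is hypothetical, and `groupby(...).apply` is used here as one possible route rather than a confirmed best practice.

```python
import numpy as np
import pandas as pd

# Sample data matching the table above (B holds fixed-size NumPy arrays).
df = pd.DataFrame({
    "A": [0, 1, 2, 3, 4, 5],
    "B": [np.array(v) for v in ([2, 3, 5, 6], [1, 2, 3, 4], [2, 3, 6, 5],
                                [2, 3, 2, 3], [2, 3, 4, 4], [2, 3, 5, 6])],
    "C": ["X", "X", "Y", "Y", "Y", "Z"],
})

def mean_of_arrays(arrays: pd.Series) -> np.ndarray:
    """Stack the group's arrays into a 2-D array and average over rows."""
    return np.stack(arrays.tolist()).mean(axis=0)

# Apply the helper to column B within each group of C.
result = df.groupby("C")["B"].apply(mean_of_arrays).reset_index()
print(result)
```

Stacking first means the averaging is done by NumPy on a single 2-D array per group rather than element by element in Python.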