Suppose I have the following DataFrames:
Containers:
Key ContainerCode Quantity
1 P-A1-2097-05-B01 0
2 P-A1-1073-13-B04 0
3 P-A1-2024-09-H05 0
5 P-A1-2018-08-C05 0
6 P-A1-2089-03-C08 0
7 P-A1-3033-16-H07 0
8 P-A1-3035-18-C02 0
9 P-A1-4008-09-G01 0
Inventory:
Key SKU ContainerCode Quantity
1 22-3-1 P-A1-4008-09-G01 1
2 2132-12 P-A1-3033-16-H07 55
3 222-12 P-A1-4008-09-G01 3
4 4561-3 P-A1-3083-12-H01 126
How do I update the Quantity values in Containers to reflect the number of units in each container based on the information in Inventory? Note that multiple SKUs can reside in a single ContainerCode, so we need to add to the quantity, rather than just replace it, and there may be multiple entries in Containers for a particular ContainerCode.
What are the possible ways to accomplish this, and what are their relative pros and cons?
EDIT
The following code seems to serve as a good test case:
import itertools
import pandas as pd
import numpy as np
inventory = pd.DataFrame({'Container Code':['A1','A2','A2','A4'],
'Quantity':[10,87,2,44],
'SKU':['123-456','234-567','345-678','456-567']})
containers = pd.DataFrame({'Container Code':['A1','A2','A3','A4'],
'Quantity':[2,0,8,4],
'Path Order':[1,2,3,4]})
summedInventory = inventory.groupby('Container Code')['Quantity'].sum()
print('Containers Data Frame')
print(containers)
print('\nInventory Data Frame')
print(inventory)
print('\nSummed Inventory List')
print(summedInventory)
print('\n')
newContainers = containers.drop('Quantity', axis=1). \
join(inventory.groupby('Container Code').sum(), on='Container Code')
print(newContainers)
This seems to produce the desired output.
I also tried using a regular merge:
pd.merge(containers.drop('Quantity', axis=1), \
summedInventory,how='inner',left_on='Container Code', right_index=True)
But that produces an 'IndexError: list index out of range'
Any ideas?
pd.merge(containers.drop('Quantity', axis=1), summedInventory,how='inner', left_on='Container Code', right_on='Container Code')producesKeyError: 'Container Code'The desire to use thepd.merge()comes from the possibility (actually the fact, in my real world application of this) that the inventory DF can have different column names than the container DF.pd.merge()is coming from the linesummedInventory = inventory.groupby('Container Code')['Quantity'].sum()If we delete the['Quantity']portion, it goes through just fine.