4

I am looking for something quite specific that I can't quite find the answer to.

I have two dataframes. One that contains an ID, latitude and longitude. The other has just an ID.

I want to store in a list the latitude and longitude as long as the ID from Dataframe A exists in Dataframe B using list comprehension. I can get the first part working fine, but matching the IDs appears to be causing a problem. This is what I have so far:

heat_data = [[row['latitude'],row['longitude']] for index, row in extract.iterrows() if row['NBN Location Id'] in closed['SP Order Location ID']]

To me, that says store 'latitude' and 'longitude' from extract as long as the ID exists in the other dataframe (closed). However this causes no data to be retrieved. Can anyone guide me as to where I'm going wrong? If I exclude the last 'if' statement, it works fine. So how else am I supposed to be doing this if statement?

Thanks!

2 Answers 2

3

I think list comprehension is not necessary, better and faster is use vectorized solution by filter by boolean indexing with isin and then convert to lists:

mask = extract['NBN Location Id'].isin(closed['SP Order Location ID'])
cheat_data = extract.loc[mask, ['latitude', 'longitude']].values.tolist()

Sample:

closed = pd.DataFrame({'SP Order Location ID':list('ace')})
print (closed)
  SP Order Location ID
0                    a
1                    c
2                    e

extract = pd.DataFrame({'NBN Location Id':list('abcde'),
                       'latitude':['lat1','lat2','lat3','lat4','lat4'],
                       'longitude':['long1','long2','long3','long4','long4']})

print (extract)
  NBN Location Id latitude longitude
0               a     lat1     long1
1               b     lat2     long2
2               c     lat3     long3
3               d     lat4     long4
4               e     lat4     long4

mask = extract['NBN Location Id'].isin(closed['SP Order Location ID'])
cheat_data = extract.loc[mask, ['latitude', 'longitude']].values.tolist()
print (cheat_data)
[['lat1', 'long1'], ['lat3', 'long3'], ['lat4', 'long4']]

Your solution failed, because in in pandas check index values, not values of Series, so need convert to list:

cheat_data = [[row['latitude'],row['longitude']] for index, row in extract.iterrows() 
              if row['NBN Location Id'] in closed['SP Order Location ID'].tolist()]
print (cheat_data)
[['lat1', 'long1'], ['lat3', 'long3'], ['lat4', 'long4']]

#changed index values
closed = pd.DataFrame({'SP Order Location ID':list('ace')}, index=list('dbw'))
print (closed)
  SP Order Location ID
d                    a
b                    c
w                    e

cheat_data = [[row['latitude'],row['longitude']] for index, row in extract.iterrows() 
              if row['NBN Location Id'] in closed['SP Order Location ID']]
print (cheat_data)

[['lat2', 'long2'], ['lat4', 'long4']]
Sign up to request clarification or add additional context in comments.

Comments

1

Using @jezrael's data

ids = {*closed['SP Order Location ID']}
cols = ['latitude', 'longitude', 'NBN Location Id']
[p for *p, i in zip(*map(extract.get, cols)) if i in ids]

[['lat1', 'long1'], ['lat3', 'long3'], ['lat4', 'long4']]

closed = pd.DataFrame({'SP Order Location ID':list('ace')})

extract = pd.DataFrame({'NBN Location Id':list('abcde'),
                       'latitude':['lat1','lat2','lat3','lat4','lat4'],
                       'longitude':['long1','long2','long3','long4','long4']})

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.