0

I am trying to pull ticket prices and related information for a few baseball games, but I get errors every time I try to grab the data. Any idea what would be causing these for price, loc, and detail? I've also tried XPath selectors, with no success.

games = ['https://seatgeek.com/dodgers-at-cubs-tickets/5-3-2021-chicago-illinois-wrigley-field/mlb/5316872', \
        'https://seatgeek.com/dodgers-at-cubs-tickets/5-5-2021-chicago-illinois-wrigley-field/mlb/5316885']

#gather ticket data
urls = []
location = []
prices = []
details = []

for g in games:
    try:
        driver.get(g)
        price = [i.text for i in WebDriverWait(driver, 100).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.Button__ButtonContents')))]
        print(price)
        loc = [i.text for i in WebDriverWait(driver, 100).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.ListingTicket__Section')))]
        print(loc)
        detail = [i.text for i in WebDriverWait(driver, 100).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.ListingTicket__Availability')))]
        print(detail)
        url = [str(g)] * len(price)
        urls.extend(url)
        prices.extend(price)
        location.extend(loc)
        details.extend(detail)
        print(str(g) + ": " + len(price) + " ")
    except:
        print('Failed: ' + str(g))
        pass

Updated code:

import requests
import pandas as pd

driver.get('https://seatgeek.com/chicago-cubs-tickets')
gameIds = [i.get_attribute('href') for i in WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.EventItem__ItemLink-sc-14845pu-6')))]
gameIds = [x[-7:] for x in gameIds]

url = 'https://seatgeek.com/rescraper/v2/listings'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'}
writer = pd.ExcelWriter(final, engine='xlsxwriter')

tables = []
for gameId in gameIds:
    payload = {
    '_include_seats': '1',
    'client_id': 'MTY2MnwxMzgzMzIwMTU4',
    'id': '%s' %gameId,
    'sixpack_client_id': '93d1ab10-07dc-4482-bb89-b87c2b144e33'}
    
    jsonData = requests.get(url, headers=headers, params=payload).json()
    df = pd.json_normalize(jsonData['listings'])
    df.to_excel(writer, sheet_name=gameId)
    tables.append(df)
    print(gameId)

table = pd.concat(tables)

writer = pd.ExcelWriter(final, engine='xlsxwriter')
table.to_excel(writer, sheet_name='Tickets')
writer.save()
print('Done')

New Error:

HTTPSConnectionPool(host='seatgeek.com', port=443): Max retries exceeded with url: /rescraper/v2/listings?
_include_seats=1&client_id=MTY2MnwxMzgzMzIwMTU4&id=5316872&sixpack_client_id=93d1ab10-07dc-4482-bb89-b87c2b144e33 
(Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))

2 Answers

1

Just fetch the data from the API directly; all you need is the event ID. You may need to work out what the abbreviated column names mean, but that looks fairly easy. You might want to add the date of the game too; otherwise all the data is there:

import requests
import pandas as pd

# Event IDs: the trailing number in each SeatGeek event URL
gameIds = [5316872, 5316885]

url = 'https://seatgeek.com/rescraper/v2/listings'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'}

tables = []
for gameId in gameIds:
    # Query parameters for the listings endpoint
    payload = {
        '_include_seats': '1',
        'client_id': 'MTY2MnwxMzgzMzIwMTU4',
        'id': str(gameId),
        'sixpack_client_id': '93d1ab10-07dc-4482-bb89-b87c2b144e33'}

    jsonData = requests.get(url, headers=headers, params=payload).json()
    # Flatten the 'listings' records into one DataFrame per game
    df = pd.json_normalize(jsonData['listings'])
    tables.append(df)

Output:

Here's the first table (showing just the first 10 rows). There are 265 rows in the first table and 455 in the other.

print(tables[0].head(10).to_string())
           dm    ep  et       f                                  gk      gr           id         ihd  dl  h  lv  vp                              mk         m  pu       p      pf  q  rp   r      rf  rr          ss     sdq  sgp     sgf    sif                        s                       sf   sr  sh    sco   sp spt      st  wc  sro  dq.b  dq.dq dq.ddq   dq.ev                                                                                                       d                                    fi   sg   sd
0  electronic  True   1    2.00          budweiser bleachers 515_19   85202   y5EMUx5j6Y               0  0   0   0  s:budweiser-bleachers-515 r:19  exchange   0   82.57   84.57  2   4  19  Row 19  19        None      []   64   20.57  False  budweiser bleachers 515  Budweiser Bleachers 515  515   0  False  [2]         pdf   0    0     1  74.62    7.8  146.55                                                                                                     NaN                                   NaN  NaN  NaN
1  electronic  True   1  209.00                       121_5_111:112  895002  kYetLw0ZN64  2021-05-02   0  0   0   0                       s:121 r:5  exchange   0  686.00  895.00  2   4   5   Row 5   5  [111, 112]  [5, 5]  686  209.00  False                      121              Section 121  121   0  False  [2]      mobile   0    0     5  15.73    2.1  433.27  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  9645a1de-66df-49b5-b637-5fa5c4736c41  NaN  NaN
2  electronic  True   1  156.45  budweiser bleachers 502_11_111:112  663002  lxVsqxleK85  2021-05-02   0  0   0   0  s:budweiser-bleachers-502 r:11  exchange   0  506.00  662.45  2   4  11  Row 11  11  [111, 112]  [6, 6]  506  156.45  False  budweiser bleachers 502  Budweiser Bleachers 502  502   0  False  [2]      mobile   0    0     6   2.84    0.5  117.89  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS                                   NaN  NaN  NaN
3  electronic  True   1  148.75                      129_13_111:112  631002  kYetLw0ZN2A  2021-05-02   0  0   0   0                      s:129 r:13  exchange   0  482.00  630.75  2   4  13  Row 13  13  [111, 112]  [6, 6]  482  148.75  False                      129              Section 129  129   0  False  [2]      mobile   0    0     6   4.63    0.7  166.99  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  f2d511b1-7b7f-4d84-b628-966fee6e8109  NaN  NaN
4  electronic  True   1  164.16                      218_10_111:112  695002  w3JsqE3VkKz  2021-05-02   0  0   0   0                      s:218 r:10  exchange   0  530.00  694.16  2   4  10  Row 10  10  [111, 112]  [6, 6]  530  164.16  False                      218              Section 218  218   0  False  [2]      mobile   0    0     6   3.56    0.6  166.48  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  a4904c72-fcc2-4342-b214-3283268cbbab  NaN  NaN
5  electronic  True   1  156.45                      218_15_111:112  663002  NrqUJbEl0YM  2021-05-02   0  0   0   0                      s:218 r:15  exchange   0  506.00  662.45  2   4  15  Row 15  15  [111, 112]  [6, 6]  506  156.45  False                      218              Section 218  218   0  False  [2]      mobile   0    0     6   3.70    0.6  155.66  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  a4904c72-fcc2-4342-b214-3283268cbbab  NaN  NaN
6  electronic  True   1  147.17                       229_9_111:112  621002  qVjH7eqn6jB  2021-05-02   0  0   0   0                       s:229 r:9  exchange   0  473.00  620.17  2   4   9   Row 9   9  [111, 112]  [6, 6]  473  147.17  False                      229              Section 229  229   0  False  [2]      mobile   0    0     6   2.73    0.4   77.54  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  4481eab0-396d-4696-bf67-950e33b45c5d  NaN  NaN
7  electronic  True   1  139.45                      229_13_111:112  589002  rVOH8wD9EP2  2021-05-02   0  0   0   0                      s:229 r:13  exchange   0  449.00  588.45  2   4  13  Row 13  13  [111, 112]  [6, 6]  449  139.45  False                      229              Section 229  229   0  False  [2]      mobile   0    0     6   3.01    0.5   74.55  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  4481eab0-396d-4696-bf67-950e33b45c5d  NaN  NaN
8  electronic  True   1  132.75                      229_17_111:112  557002  jDvsErZMO59  2021-05-02   0  0   0   0                      s:229 r:17  exchange   0  424.00  556.75  2   4  17  Row 17  17  [111, 112]  [6, 6]  424  132.75  False                      229              Section 229  229   0  False  [2]      mobile   0    0     6   3.33    0.5   71.82  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  4481eab0-396d-4696-bf67-950e33b45c5d  NaN  NaN
9  electronic  True   1  148.75                      218_20_111:112  631002  3q7fvGgbAwB  2021-05-02   0  0   0   0                      s:218 r:20  exchange   0  482.00  630.75  2   4  20  Row 20  20  [111, 112]  [6, 6]  482  148.75  False                      218              Section 218  218   0  False  [2]      mobile   0    0     6   3.90    0.6  145.81  TMX XFER MOBILE ENTRY. Scan your tickets from your mobile phone for this event. MOBILE ENTRY NO SPLITS  a4904c72-fcc2-4342-b214-3283268cbbab  NaN  NaN
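
If you only have the event URLs from the question rather than the IDs, the ID appears to be the last path segment of each URL, so it can be derived instead of hard-coded. This is a minimal sketch under that assumption; `event_id_from_url` is a hypothetical helper name, not part of any library, and it raises an error if the URL does not end in a number.

```python
from urllib.parse import urlparse


def event_id_from_url(url):
    """Return the trailing numeric path segment of an event URL as an int."""
    # Assumption: the event ID is the last path segment, as in the URLs below.
    last_segment = urlparse(url).path.rstrip('/').split('/')[-1]
    if not last_segment.isdigit():
        raise ValueError(f'No numeric event ID at the end of {url!r}')
    return int(last_segment)


games = [
    'https://seatgeek.com/dodgers-at-cubs-tickets/5-3-2021-chicago-illinois-wrigley-field/mlb/5316872',
    'https://seatgeek.com/dodgers-at-cubs-tickets/5-5-2021-chicago-illinois-wrigley-field/mlb/5316885',
]

gameIds = [event_id_from_url(g) for g in games]
print(gameIds)  # [5316872, 5316885]
```

Splitting on the path is a little safer than slicing a fixed number of characters off the end of the URL, since it does not depend on the ID always having seven digits.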

9 Comments

^ This is a much better/ more efficient solution for this use-case than mine. I didn't realize there was a scrolling element and hundreds of records to deal with.
@chitown88 what are the odds that you know of something similar for StubHub?
@RCarmody, StubHub has the same sort of layout as SeatGeek. If you get the event ID, you get basically all the same info.
Do you have any references where I could find the actual code? I can also ask another question if you have it.
I didn’t write up the code, I just looked at the site. I won’t be able to do it today, but I can do it for you tomorrow morning. Are you looking for Cubs home games? Any particular dates, or just all dates?
1

You can use these for those elements:

price = [i.text for i in WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@data-test='event-listing']//a/span")))]
price = [x.replace('\n', '') for x in price] #added to get rid of newline character in each list element
print(price)
loc = [i.text for i in WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@data-test='event-listing']//div[@data-test='section']")))]
print(loc)
detail = [i.text for i in WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@data-test='event-listing']//span[@data-test='quantity']")))]
print(detail)


['$26/ea', '$112/ea', '$27/ea', '$122/ea', '$101/ea', '$88/ea', '$35/ea', '$38/ea']
['424 Right · Row 6', 'Section 113 · Row 1', '420 Right · Row 9', 'Section 114 · Row 1', 'Section 109 · Row 3', 'Section 110 · Row 13', '421 Right · Row 7', '421 Right · Row 6']
['2 tickets', '4 tickets', '2 tickets', '4 tickets', '4 tickets', '4 tickets', '2 tickets', '2 tickets']
...

I added a second list comprehension for price to strip the newline character that shows up in each string.
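
Since the three lists line up by index, they can be combined into one record per listing. The following is only a sketch using the first few values from the output above; it assumes prices are whole dollars and that section and row are separated by ' · ' as shown. `first_int` is a hypothetical helper, not something from Selenium.

```python
import re

# Sample values copied from the output above
price = ['$26/ea', '$112/ea', '$27/ea']
loc = ['424 Right · Row 6', 'Section 113 · Row 1', '420 Right · Row 9']
detail = ['2 tickets', '4 tickets', '2 tickets']


def first_int(text):
    """Return the first run of digits in text as an int, or None if absent."""
    match = re.search(r'\d+', text.replace(',', ''))
    return int(match.group()) if match else None


rows = []
# zip stops at the shortest list, so mismatched lengths are silently truncated
for p, l, d in zip(price, loc, detail):
    section, _, row = l.partition(' · ')
    rows.append({
        'section': section,
        'row': row,
        'price_each': first_int(p),
        'quantity': first_int(d),
    })

print(rows[0])
```

If the three lists can differ in length on the real page, it is worth comparing `len(price)`, `len(loc)`, and `len(detail)` before zipping them.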

One more fix you need:

Change this:

print(str(g) + ": " + len(price) + " ")

To this:

print(str(g) + ": " + str(len(price)) + " ")
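
The reason for this change: `len(price)` is an int, and Python does not convert it to a string automatically when it is concatenated with `+`, so the original line raises a TypeError. In the question's loop that line sits inside the `try` block, so the bare `except:` would catch the error and report the URL as failed. An f-string is an equivalent alternative; here is a small sketch with placeholder values for `price`:

```python
g = 'https://seatgeek.com/dodgers-at-cubs-tickets/5-3-2021-chicago-illinois-wrigley-field/mlb/5316872'
price = ['$26/ea', '$112/ea']  # placeholder values for illustration

# The original concatenation fails because len() returns an int
try:
    message = str(g) + ": " + len(price) + " "
except TypeError as exc:
    print('TypeError:', exc)

# Equivalent to str(g) + ": " + str(len(price)) + " "
message = f"{g}: {len(price)} "
print(message)
```

Catching a specific exception (or at least printing the caught exception) instead of using a bare `except:` makes this kind of mistake much easier to spot.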
