ESPNCRICINFO API call

I had been scraping ESPN data using their publicly available API at the https://hs-consumer-api.espncricinfo.com/ endpoints. Below is an example of one of the endpoints:

v1/pages/match/scorecard?lang=en&seriesId=&matchId=

These APIs were called from espncricinfo pages through AJAX/XHR requests.
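For reference, here is a minimal sketch of how the full endpoint URL can be assembled from the series and match IDs. The helper name `scorecard_url` is hypothetical, and the IDs are the ones from the scorecard page used later in this question.

```python
from urllib.parse import urlencode

BASE_URL = "https://hs-consumer-api.espncricinfo.com"


def scorecard_url(series_id: int, match_id: int, lang: str = "en") -> str:
    """Build the scorecard endpoint URL for one match (hypothetical helper)."""
    # urlencode keeps the dict's insertion order, so the query string
    # matches the parameter order seen in the browser.
    query = urlencode({"lang": lang, "seriesId": series_id, "matchId": match_id})
    return f"{BASE_URL}/v1/pages/match/scorecard?{query}"


print(scorecard_url(1444526, 1444537))
```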

Recently they changed their pages so that JavaScript appears to add an additional header that acts as an authentication token:

x-hsci-auth-token:exp=1727527924~hmac=9080f72a36f2b97ec94069dca3382981b2312fb10ec800ab75959a6777f344f2
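The token looks like `~`-separated `key=value` pairs. As a rough sketch, it can be split apart to check whether a captured token is still usable. This assumes `exp` is a Unix timestamp in seconds, which I have not confirmed; the function names are my own.

```python
import time
from typing import Optional


def parse_auth_token(token: str) -> dict:
    """Split 'exp=...~hmac=...' into a dict of its fields."""
    return dict(part.split("=", 1) for part in token.split("~"))


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """Assumption: 'exp' is a Unix timestamp in seconds."""
    fields = parse_auth_token(token)
    current = time.time() if now is None else now
    return current >= int(fields["exp"])


token = "exp=1727527924~hmac=9080f72a36f2b97ec94069dca3382981b2312fb10ec800ab75959a6777f344f2"
print(parse_auth_token(token)["exp"])
print(is_expired(token))
```

If that assumption holds, a captured token is only valid for a short window, so it would have to be re-captured regularly rather than stored.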

This has made the API inaccessible directly. I used selenium-wire with ChromeDriver to analyze the requests and concluded that if I can extract the headers and invoke the API with them, the call works.
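To illustrate what I mean by invoking the API with the extracted headers, here is a small standard-library sketch that attaches a captured token to a request object without sending it. Whether the token header alone is enough, or whether other captured headers are also required, is an assumption I have not verified; `build_api_request` is a hypothetical helper.

```python
from urllib.request import Request


def build_api_request(url: str, auth_token: str) -> Request:
    """Attach the captured token to a GET request (nothing is sent here)."""
    return Request(url, headers={"x-hsci-auth-token": auth_token})


req = build_api_request(
    "https://hs-consumer-api.espncricinfo.com/v1/pages/match/scorecard"
    "?lang=en&seriesId=1444526&matchId=1444537",
    "exp=1727527924~hmac=9080f72a36f2b97ec94069dca3382981b2312fb10ec800ab75959a6777f344f2",
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```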

The issue I am facing is that when I load the required URL through selenium-wire, these API calls don't happen. I can see all the other calls to ad sites and tracking APIs, but the consumer API calls are missing.

Here is the code that I am using

    from seleniumwire import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    import requests
    import time

    # Set the path to your manually installed ChromeDriver
    chromedriver_path = 'chromedriver.exe'

    # Set up Chrome WebDriver with Selenium Wire
    chrome_options = Options()
    chrome_options.add_argument("--headless")  # Optional: to run browser in headless mode
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument('--ignore-certificate-errors')  # Ignore certificate errors
    chrome_options.add_argument('--disable-proxy-certificate-handler')

    # Selenium Wire options to disable SSL verification
    seleniumwire_options = {
        'verify_ssl': False  # Disable SSL verification
    }

    # Manually specify the ChromeDriver path
    driver = webdriver.Chrome(service=Service(chromedriver_path), options=chrome_options, seleniumwire_options=seleniumwire_options)

    # Navigate to the target URL
    driver.get('https://www.espncricinfo.com/series/germany-women-s-t20i-tri-series-2024-1444526/germany-women-vs-italy-women-final-1444537/full-scorecard')

    # Give the page time to issue its background requests
    time.sleep(5)

    req_headers = {}
    req_url = ''

    # Capture and filter the XHR requests after the div is loaded
    for request in driver.requests:
        # request.response is None for requests that never received a response
        if ('scorecard' in request.url
                and 'xhr' in request.headers.get('X-Requested-With', '').lower()
                and request.response is not None
                and request.response.status_code == 200):
            req_url = request.url
            # print(request.response.headers)
            req_headers = request.headers
            print("Found headers for the needed URL")
            break

    # Close the WebDriver session
    driver.quit()

    # Check if headers were successfully captured
    if req_headers:
        print("Request headers captured successfully.")
    else:
        print("Failed to capture request headers.")
        exit(1)

    try:
        print(req_headers, req_url)
        response = requests.get(req_url, headers=req_headers)
        print(f"API Response Status: {response.status_code}")
        if response.status_code == 200:
            print("API call was successful!")
            print(response)
            print(response.json())  # Print the JSON response from the API
        else:
            print(f"API call failed with status code: {response.status_code}")
            print(response.text)
    except Exception as e:
        print(f"Error during API call: {str(e)}")

If I open the same URL in a normal browser, I can see the call being made to this URL:

https://hs-consumer-api.espncricinfo.com/v1/pages/match/scorecard?lang=en&seriesId=1444526&matchId=1444537

but not in the selenium-wire request.
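One thing worth ruling out is the filter itself. The `X-Requested-With` header is optional, and its conventional value is `XMLHttpRequest`, which does not contain the substring `xhr`, so that check can hide requests that were in fact captured. Below is a sketch of a looser filter that matches on the API host only. The `SimpleNamespace` objects are stand-ins for selenium-wire's captured requests, and `find_api_requests` is a hypothetical helper.

```python
from types import SimpleNamespace
from urllib.parse import urlparse

API_HOST = "hs-consumer-api.espncricinfo.com"


def find_api_requests(captured, host=API_HOST):
    """Return every captured request sent to the API host, with or without a response."""
    return [r for r in captured if urlparse(r.url).hostname == host]


# Stand-ins for entries of driver.requests; only the .url attribute is used.
captured = [
    SimpleNamespace(url="https://ads.example.com/track?id=1"),
    SimpleNamespace(
        url="https://hs-consumer-api.espncricinfo.com/v1/pages/match/scorecard"
            "?lang=en&seriesId=1444526&matchId=1444537"
    ),
]

for request in find_api_requests(captured):
    print(request.url)
```

With a real driver this would be called as `find_api_requests(driver.requests)` before `driver.quit()`. If it still returns an empty list, the calls are genuinely not being made in the automated session, and the filter is not the cause.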

  • What are the options that are available for me to solve this?
  • Why can I not see these requests happening in selenium-wire requests?
