I'm trying to scrape a website in Python with Selenium. The site is a sports results page, and my final goal is to export the full list of results to a CSV file (or XML in the future). The page markup looks something like this:
<div class="sportName soccer">
<div class="event__header">
<div class="event_title">
<div class="event_titleBox">
<span class="event_title--type">"Country"</span>
<span class="event_title--name">"Competition"</span>
</div>
</div>
</div>
<div class="event_round">Day 1</div>
<div class="event_match">Match 1</div>
<div class="event_match">Match 2</div>
<div class="event_match">Match 3</div>
<div class="event_round">Day 2</div>
<div class="event_match">Match 1</div>
<div class="event_match">Match 2</div>
<div class="event_match">Match 3</div>
</div>
It shows up like this:
Country Competition
Day 1
Match 1
Match 2
Match 3
Day 2
Match 1
Match 2
Match 3
My problem is that I can't get all of this information into a single variable. I'm using:
results = driver.find_elements_by_xpath("//*[@class='sportName soccer']//*[@class='event__header']")
This returns all the info, but as one single line. How can I get "event_titleBox", "event_round" and "event_match" scraped in order, in the same variable?
I can scrape each class separately into its own variable, but then the results are jumbled and I don't know how to put them back in the correct order afterwards.
Is there a way to match more than one class in find_elements_by_xpath?
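To make the goal concrete, here is a minimal sketch of the ordered result I'm after. It does not use Selenium: it parses the sample markup above with the standard library, and the names SAMPLE, WANTED and ordered_rows are made up for illustration. The XPath in the trailing comment is an untested assumption about how the same selection might be written for Selenium; it has not been run against the real site.

```python
import xml.etree.ElementTree as ET

# Sample markup from above (placeholder quotes around Country/Competition dropped).
SAMPLE = """
<div class="sportName soccer">
  <div class="event__header">
    <div class="event_title">
      <div class="event_titleBox">
        <span class="event_title--type">Country</span>
        <span class="event_title--name">Competition</span>
      </div>
    </div>
  </div>
  <div class="event_round">Day 1</div>
  <div class="event_match">Match 1</div>
  <div class="event_match">Match 2</div>
  <div class="event_match">Match 3</div>
  <div class="event_round">Day 2</div>
  <div class="event_match">Match 1</div>
  <div class="event_match">Match 2</div>
  <div class="event_match">Match 3</div>
</div>
"""

# The three classes that should end up together, in page order.
WANTED = {"event_titleBox", "event_round", "event_match"}


def ordered_rows(markup):
    """Return (class, text) pairs for the wanted elements, in document order."""
    root = ET.fromstring(markup.strip())
    rows = []
    # iter() walks the tree in document order, so the page order is preserved.
    for el in root.iter():
        cls = el.get("class")
        if cls in WANTED:
            # Collapse whitespace between nested spans into single spaces.
            text = " ".join("".join(el.itertext()).split())
            rows.append((cls, text))
    return rows


rows = ordered_rows(SAMPLE)
for cls, text in rows:
    print(cls, text)

# Assumed Selenium equivalent (untested against the real page):
# driver.find_elements(By.XPATH,
#     "//*[@class='event_titleBox' or @class='event_round' or @class='event_match']")
```

Each row keeps its class name, so a later step could tell headers, rounds and matches apart when writing the CSV.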
Many thanks