Channel: Active questions tagged selenium - Stack Overflow

Web scraping more than one sibling element

I'm trying to scrape a website in Python with Selenium. The website is a sports results page, and my final goal is to export the full list of results to a CSV file (or XML in the future). The page's HTML looks something like this:

<div class="sportName soccer">
    <div class="event__header">
        <div class="event_title">
            <div class="event_titleBox">
            <span class="event_title--type">"Country"</span>
            <span class="event_title--name">"Competition"</span>
            </div>
        </div>
    </div>
    <div class="event_round">Day 1</div>
    <div class="event_match">Match 1</div>
    <div class="event_match">Match 2</div>
    <div class="event_match">Match 3</div>
    <div class="event_round">Day 2</div>
    <div class="event_match">Match 1</div>
    <div class="event_match">Match 2</div>
    <div class="event_match">Match 3</div>
</div>

It shows up like this:

Country Competition
Day 1
Match 1
Match 2
Match 3
Day 2
Match 1
Match 2
Match 3
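For the CSV goal, a flat listing like the one above can be folded into rows by carrying the current round forward. This is only a minimal sketch: the helper names `rows_from_listing` and `to_csv` and the column names are made up for illustration, and it assumes the first line is the competition header and that every round label starts with "Day ", as in the sample.

```python
import csv
import io


def rows_from_listing(lines):
    """Fold a flat listing into (competition, round, match) rows.

    Assumes lines[0] is the competition header and that round labels
    start with "Day " (true for the sample, maybe not for the real site).
    """
    competition = lines[0]
    current_round = None
    rows = []
    for line in lines[1:]:
        if line.startswith("Day "):
            current_round = line  # remember the round for the matches that follow
        else:
            rows.append((competition, current_round, line))
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["competition", "round", "match"])
    writer.writerows(rows)
    return buffer.getvalue()


listing = [
    "Country Competition",
    "Day 1", "Match 1", "Match 2", "Match 3",
    "Day 2", "Match 1", "Match 2", "Match 3",
]
print(to_csv(rows_from_listing(listing)))
```

Each match ends up on its own row together with its round, so the order of the page is preserved in the file.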

My problem is that when I try to get the info, I'm unable to get all of it into the same variable. I'm using:

results = driver.find_elements_by_xpath("//*[@class='sportName soccer']//*[@class='event__header']")

This gets all the info, but all in one single line. How can I get "event_titleBox", "event_round" and "event_match" scraped in order, in the same variable?

I can do it by scraping each class separately into its own variable, but then the results are jumbled and I don't know how to put them back in the correct order afterwards.

Is there a way to select more than one class with find_elements_by_xpath?
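One possible approach is an XPath predicate that joins several class tests with `or`, so a single query matches all three kinds of element. The sketch below assumes Selenium 4, where `find_elements(By.XPATH, ...)` replaces the older `find_elements_by_xpath`, and it assumes the `class` attributes are exactly as in the sample HTML. The helper names `build_xpath`, `collect_rows` and `scrape` are hypothetical. Elements matched by one XPath query should come back in document order, but that is worth verifying against the real page.

```python
# Class names taken from the sample HTML in the question.
WANTED = ("event_titleBox", "event_round", "event_match")


def build_xpath(container_class, wanted_classes):
    """Build one XPath matching descendants whose class equals any wanted class."""
    condition = " or ".join(f"@class='{name}'" for name in wanted_classes)
    return f"//*[@class='{container_class}']//*[{condition}]"


def collect_rows(elements):
    """Turn matched elements into (class, text) pairs, keeping their order."""
    return [(el.get_attribute("class"), el.text) for el in elements]


def scrape(driver):
    """Run the combined query on an already-loaded page (driver is a WebDriver)."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    xpath = build_xpath("sportName soccer", WANTED)
    return collect_rows(driver.find_elements(By.XPATH, xpath))


print(build_xpath("sportName soccer", WANTED))
```

Because each pair keeps the class name next to the text, the header, rounds and matches stay in one variable and can still be told apart afterwards.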

Many thanks

