Channel: Active questions tagged selenium - Stack Overflow
Viewing all articles
Browse latest Browse all 99395

Infinite Scroll in Instagram comments for Web Scraping in Python

I am trying to build a scraper that saves the comments under an Instagram post. I can log in to Instagram through my code, so I have access to all comments under a post, but I cannot seem to scroll down enough times to load all of them for scraping. I only get around 20 comments every time.

Can anyone please help me? I am using Selenium WebDriver.

Thank you in advance for your help! I would be grateful.

This is my function for saving the comments:

import time
from selenium.webdriver.firefox.options import Options
from selenium.webdriver import Firefox
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

    # Method of the scraper class; self.browser is the Selenium WebDriver instance.
    def get_comments(self, url):
        self.browser.get(url)
        time.sleep(3)

        # Click the "load more comments" button until it can no longer be found.
        while True:
            try:
                self.load_more_comments = self.browser.find_element(
                    By.CLASS_NAME, 'glyphsSpriteCircle_add__outline__24__grey_9')
                self.action = ActionChains(self.browser)
                # An action chain only runs once perform() is called.
                self.action.move_to_element(self.load_more_comments).perform()
                self.load_more_comments.click()
                time.sleep(4)
                self.body_elem = self.browser.find_element(By.CLASS_NAME, 'Mr508')
                for _ in range(100):
                    self.body_elem.send_keys(Keys.END)
                    time.sleep(3)
            except Exception:
                # Leave the loop when the button is missing; "pass" would loop forever.
                break

        time.sleep(5)
        comments = []
        self.comment = self.browser.find_elements(By.CLASS_NAME, 'gElp9')
        for c in self.comment:
            self.container = c.find_element(By.CLASS_NAME, 'C4VMK')
            self.name = self.container.find_element(By.CLASS_NAME, '_6lAjh').text
            self.content = self.container.find_element(By.TAG_NAME, 'span').text
            self.content = self.content.replace('\n', '').strip()
            # '//a/time' is resolved from the page root, so this reads the first <time> on the page.
            self.time_of_post = self.browser.find_element(By.XPATH, '//a/time').get_attribute("datetime")
            self.comment_details = {'profile name': self.name, 'comment': self.content, 'time': self.time_of_post}
            print(self.comment_details)
            comments.append(self.comment_details)
            time.sleep(5)

        # Return every scraped comment, not only the last one.
        return comments
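
One common way to approach the "only around 20 comments" symptom is to keep scrolling until the number of loaded comments stops growing, instead of scrolling a fixed number of times. The block below is only a minimal sketch of that idea and is not taken from the post. The helper name scroll_until_stable and its parameters (count_items, scroll_once, pause, patience, max_rounds) are hypothetical, and the two callables stand in for whatever Selenium calls count the comments and trigger a scroll. Whether Instagram actually loads more comments on scroll or only through the "load more" button is an assumption that would need checking against the live page.

```python
import time
from typing import Callable


def scroll_until_stable(
    count_items: Callable[[], int],
    scroll_once: Callable[[], None],
    pause: float = 2.0,
    patience: int = 3,
    max_rounds: int = 200,
) -> int:
    """Scroll until the item count stops growing and return the final count.

    count_items: returns how many items (comments) are currently loaded.
    scroll_once: performs one scroll step or one "load more" click.
    pause:       seconds to wait after each step so new items can load.
    patience:    consecutive steps without growth before giving up.
    max_rounds:  hard upper bound on steps, so the loop always terminates.
    """
    last_count = count_items()
    stale_rounds = 0
    for _ in range(max_rounds):
        if stale_rounds >= patience:
            break
        scroll_once()
        time.sleep(pause)
        current = count_items()
        if current > last_count:
            # New items appeared, so reset the "nothing happened" counter.
            last_count = current
            stale_rounds = 0
        else:
            stale_rounds += 1
    return last_count


# Hypothetical wiring inside get_comments, reusing the class names from the question:
#
# total = scroll_until_stable(
#     count_items=lambda: len(self.browser.find_elements(By.CLASS_NAME, 'gElp9')),
#     scroll_once=lambda: self.body_elem.send_keys(Keys.END),
# )
```

The patience counter is there because a single slow network response should not be mistaken for the end of the comment list, while max_rounds guards against a page that never stops producing items.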
