Channel: Active questions tagged selenium - Stack Overflow

Extract content from a continuously updating web page, using Python


I am trying to extract table data from the following page:

http://www.mfinante.gov.ro/patrims.html?adbAdbId=4283

The problem is that the page keeps adding rows dynamically, so using requests returns only the initial HTML without the table. I also tried Selenium, waiting for the page to load fully (the number of rows is finite), but Selenium keeps waiting while the page loads until the browser runs out of memory and crashes (at about 100K rows).

My question is: how do I get the content being sent to the page, perhaps in chunks, and save it? Is there a way to simulate the call the browser is making?
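What I imagine is something like the sketch below: open the browser's developer tools, watch the Network tab (XHR/Fetch filter) while the table grows, find the request the rows come from, and replay it page by page. The endpoint URL and parameter names here are guesses on my part, not the site's real API:

```python
def paginate(fetch, limit=1000):
    # Yield successive chunks of rows until the server returns an empty page.
    offset = 0
    while True:
        chunk = fetch(offset, limit)
        if not chunk:
            break
        yield chunk
        offset += len(chunk)

def fetch_rows(offset, limit, adb_id=4283):
    # HYPOTHETICAL endpoint and parameter names: the real URL and params
    # must be copied from the request seen in the browser's Network tab.
    import requests  # third-party: pip install requests
    resp = requests.get(
        "http://www.mfinante.gov.ro/patrims-data",  # placeholder URL
        params={"adbAdbId": adb_id, "start": offset, "count": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# for chunk in paginate(fetch_rows, limit=1000):
#     ... save each chunk as it arrives instead of keeping everything in memory
```

Is this roughly the right approach, and how would I discover the real request?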

Here is what I have managed with Selenium, which works for smaller samples (e.g. adbAdbId=30):

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Generous timeout (seconds) for the table element to appear
delay = 800

options = webdriver.ChromeOptions()
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path="chromedriver.exe")
driver.set_page_load_timeout(1000)
url = 'http://www.mfinante.gov.ro/patrims.html?adbAdbId=30'
driver.get(url)

try:
    WebDriverWait(driver, delay).until(
        EC.presence_of_element_located((By.ID, 'patrims')))
    print("Page is ready!")
except TimeoutException:
    print("Loading took too much time!")


rows = driver.find_elements_by_xpath("//table[@id='patrims']/tbody/tr")
print(len(rows))

listofdicts = []

def builder(outputlist, inputlist):
    """Parse each table row's inner HTML and append a dict of its fields."""
    for row in inputlist:
        soup = BeautifulSoup(row.get_attribute('innerHTML'), 'html.parser')
        td = soup.find_all('td')

        # Leading cells are indexed from the front, trailing cells from the
        # end of the row, since the cell count can vary in between.
        d = {
            "Legend": soup.find("legend").get_text().strip(),
            "Localitatea": td[2].get_text().strip(),
            "Strada": td[4].get_text().strip(),
            "Descriere Tehnica": td[6].get_text().strip(),
            "Cod de identificare": td[-7].get_text().strip(),
            "Anul dobandirii sau darii in folosinta": td[-6].get_text().strip(),
            "Valoare": td[-5].get_text().strip(),
            "Situatie juridica": td[-4].get_text().strip(),
            "Situatie juridica actuala": td[-3].get_text().strip(),
            "Tip bun": td[-2].get_text().strip(),
            "Stare bun": td[-1].get_text().strip(),
        }
        outputlist.append(d)
    print('done!')



builder(listofdicts, rows)

print('writing result')
frame = pd.DataFrame(listofdicts)
frame.to_csv(r'output30.csv')
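I suspect part of the memory problem is my own script holding all the parsed dicts in a list before building the DataFrame. One change I am considering (a sketch, using only the stdlib csv module and the same column names as my code above) is appending each parsed batch straight to the output file:

```python
import csv

FIELDNAMES = [
    "Legend", "Localitatea", "Strada", "Descriere Tehnica",
    "Cod de identificare", "Anul dobandirii sau darii in folosinta",
    "Valoare", "Situatie juridica", "Situatie juridica actuala",
    "Tip bun", "Stare bun",
]

def append_batch(path, batch, write_header=False):
    # Append a batch of parsed row dicts to the CSV; nothing accumulates
    # in memory between batches.
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerows(batch)
```

That way I could call `append_batch` every few thousand rows instead of building one giant list, but I would still need the rows to stop arriving at some point.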

