I'm trying to iteratively crawl the table from each page of this website. With the code below I can only extract the first page:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = 'http://bjjs.zjw.beijing.gov.cn/eportal/ui?pageId=308894'
website_url = requests.get(url).text
#soup = BeautifulSoup(website_url, 'lxml')
soup = BeautifulSoup(website_url, 'html.parser')
table = soup.find('table', {'class': 'gridview'})
#https://stackoverflow.com/questions/51090632/python-excel-export
df = pd.read_html(str(table))[0]
print(df.head(5))
Output:
序号 ... 竣工备案日期
0 1 ... 2020-01-23
1 2 ... 2020-01-23
2 3 ... 2020-01-23
3 4 ... 2020-01-23
4 5 ... 2020-01-23
[5 rows x 9 columns]
Any ideas on how I could get each page's content, as if clicking the "next page" button on the site? Thank you.
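In case it helps frame the question, here is a sketch of the kind of loop I have in mind. It assumes (which I have not confirmed) that the "next page" button just re-requests the same URL with a page-number query parameter; the parameter name `currentPage` below is a placeholder that would need to be checked in the browser's DevTools network tab, since the button may instead trigger a JavaScript POST:

```python
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = 'http://bjjs.zjw.beijing.gov.cn/eportal/ui?pageId=308894'


def parse_table(html: str) -> pd.DataFrame:
    """Extract the 'gridview' table from one page of HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'class': 'gridview'})
    if table is None:
        raise ValueError('no gridview table found on this page')
    return pd.read_html(StringIO(str(table)))[0]


def crawl(pages: int) -> pd.DataFrame:
    """Fetch `pages` pages and concatenate their tables."""
    frames = []
    for page in range(1, pages + 1):
        # 'currentPage' is a hypothetical parameter name -- the real one
        # must be taken from the request the next-page button actually sends.
        resp = requests.get(BASE_URL, params={'currentPage': page})
        resp.raise_for_status()
        frames.append(parse_table(resp.text))
    return pd.concat(frames, ignore_index=True)


if __name__ == '__main__':
    df = crawl(3)
    print(df.shape)
```

If the button turns out to submit a form (e.g. an ASP.NET-style postback), the `requests.get` call would become a `requests.post` carrying the form fields captured from DevTools, but the parse-and-concatenate structure stays the same.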
