I am trying to use Python 3 to read the content of the second div with class "eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped" from the markup below, i.e. "Starts at RM15.75":
<div class="eds-event-card-content__sub-content">
<div class="eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1
eds-event-card-content__sub--cropped">
<div class="card-text--truncated__one">Found8 KL Sentral • Kuala Lumpur, Kuala
Lumpur</div>
</div>
<div class="eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1
eds-event-card-content__sub--cropped">Starts at RM15.75</div></div>
My Python code:
url = 'https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=2'
response = get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')
# Select all the 20 event containers from a single page
event_containers = html_soup.find_all('div', class_='search-event-card-square-image')
# Getting price of ticket
price = container.find_all('div', class_= "eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped").text
print("price: ", price[1])
But my code does not work. It gives me this output:
IndexError: list index out of range
But I want: Starts at RM15.75
Can anyone help me with this? Thanks.
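I can't see any prices in the HTML source that requests returns, so my guess is that they are generated by JavaScript after the page loads.
In that case you need to use Selenium, which drives a real browser and gives you the rendered page.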
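A quick way to test that guess before reaching for Selenium is to check whether the class name or the price text occurs in the raw HTML at all. The sketch below is only an illustration: `appears_in_html` is a hypothetical helper, and the two HTML strings are made-up stand-ins for `response.text` and `driver.page_source`.

```python
def appears_in_html(html: str, marker: str) -> bool:
    """Return True if `marker` occurs anywhere in the HTML string (case-insensitive)."""
    return marker.lower() in html.lower()


# Made-up stand-in for the static HTML that requests would return
static_html = '<div id="root"></div><script src="app.js"></script>'

# Made-up stand-in for the HTML after the browser has run the scripts
rendered_html = (
    '<div id="root">'
    '<div class="eds-event-card-content__sub--cropped">Starts at RM15.75</div>'
    '</div>'
)

marker = "eds-event-card-content__sub--cropped"
in_static = appears_in_html(static_html, marker)
in_rendered = appears_in_html(rendered_html, marker)
print(in_static, in_rendered)
```

If the marker is missing from the static HTML but visible in the browser's inspector, the content is most likely rendered client-side, and a plain `requests` call will never see it.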
Code using Selenium:
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4 style: the driver path goes through Service, the options through `options=`
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)
driver.set_window_size(1024, 600)
driver.maximize_window()

url = 'https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=2'
driver.get(url)
time.sleep(4)  # crude fixed wait so the JavaScript has time to render the event cards
html_soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

# Select the list that holds all the event cards on the page
event_containers = html_soup.find('ul', class_='search-main-content__events-list')
for event in event_containers.find_all('li'):
    event_time = event.find('div', class_="eds-text-color--primary-brand eds-l-pad-bot-1 eds-text-weight--heavy eds-text-bs").text
    event_name = event.find('div', class_="eds-event-card__formatted-name--is-clamped eds-event-card__formatted-name--is-clamped-three eds-text-weight--heavy").text
    event_price_place = event.find('div', class_="eds-event-card-content__sub-content")
    # find_all('div') also returns the nested div, so index 0 is the place and index 2 is the price
    event_pp = event_price_place.find_all('div')
    event_place = event_pp[0].text
    try:
        event_price = event_pp[2].text
    except IndexError:
        # The card has no third div, i.e. no price is shown
        event_price = None
    print(f"{event_name}\n{event_time}\n{event_place}\n{event_price}\n\n")
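As an alternative to indexing into every nested div, the second div can also be picked with a CSS selector on a single class. This is a minimal sketch, assuming the rendered markup matches the snippet in the question (reflowed here so each class attribute sits on one line); the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Markup copied from the question, reflowed for readability
html = """
<div class="eds-event-card-content__sub-content">
  <div class="eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped">
    <div class="card-text--truncated__one">Found8 KL Sentral • Kuala Lumpur, Kuala Lumpur</div>
  </div>
  <div class="eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped">Starts at RM15.75</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() matches on one class regardless of the other classes or their order
subs = soup.select("div.eds-event-card-content__sub--cropped")

place = subs[0].get_text(strip=True)
# Guard the index: a card without a price has only one matching div
price = subs[1].get_text(strip=True) if len(subs) > 1 else None

print(place)
print(price)
```

Note that `find_all()` returns a list-like ResultSet, so `.text` has to be called on an individual element (for example `subs[1]`), not on the result of `find_all()` itself as in the question's code.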
Result:
KL International Flea Market 2020 / Bazaar Antarabangsa Kuala Lumpur
Mon, Oct 5, 10:00 AM
VIVA Shopping Mall • Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur
Free
FGTSD Physical Church Service
Sun, Jul 19, 9:30 AM + 105 more events
Full Gospel Tabernacle Sri Damansara • Kuala Lumpur
Free
EFE 2020 - 16th Export Furniture Exhibition Malaysia
Thu, Aug 27, 9:00 AM
Kuala Lumpur Convention Centre • Kuala Lumpur, Kuala Lumpur
Free
International Beauty Expo (IBE) 2020
Sat, Sep 12, 11:00 AM
Malaysia International Trade and Exhibition Centre • Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur
Free
Learn How To Earn USD3500 In 4 Week Using Your SmartPhone
Today at 8:00 PM + 2 more events
KL Online Event • Kuala Lumpur, Bangkok
None
Turn Customers into Raving Fans of Your Brand via Equity Crowdfunding
Thu, Aug 27, 4:00 PM
Found8 KL Sentral • Kuala Lumpur, Kuala Lumpur
Starts at RM15.75
.
.
.
.
.
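The `None` in the output above comes from a card that has no price div, which is why the positional lookup needs the `try`/`except`. A less position-dependent variant is to pick the price by what the text looks like. The sketch below is an assumption-heavy illustration: `pick_price` and `PRICE_RE` are hypothetical names, and the pattern only covers the two formats visible in the output ("Free" and "Starts at RM15.75").

```python
import re
from typing import Iterable, Optional

# Assumed price formats, taken only from the sample output: "Free" or "[Starts at ]RM<amount>"
PRICE_RE = re.compile(r"(Free|(Starts at )?RM\s?\d+(\.\d{2})?)", re.IGNORECASE)


def pick_price(texts: Iterable[str]) -> Optional[str]:
    """Return the first text that looks like a price, or None if there is none."""
    for text in texts:
        candidate = text.strip()
        if PRICE_RE.fullmatch(candidate):
            return candidate
    return None


with_price = pick_price([
    "Found8 KL Sentral • Kuala Lumpur, Kuala Lumpur",
    "Found8 KL Sentral • Kuala Lumpur, Kuala Lumpur",
    "Starts at RM15.75",
])
without_price = pick_price([
    "KL Online Event • Kuala Lumpur, Bangkok",
    "KL Online Event • Kuala Lumpur, Bangkok",
])
print(with_price)
print(without_price)
```

In the loop from the answer this could be called as `pick_price(div.text for div in event_pp)`; other currencies or wording on the site would need the pattern extended.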
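Edit:
I added the option to run the browser headless.
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4 style: pass the driver path via Service and the options via `options=`
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)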