我正在尝试从此网页(日历下方的框)中获取可用时隙的小时数:
https://magicescape.it/le-stanze/lo-studio-di-harry-houdini/
我已经阅读了其他相关问题并编写了此代码
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup
url = 'https://magicescape.it/le-stanze/lo-studio-di-harry-houdini/'
wait_time = 10
options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
driver.get(url)
driver.switch_to.frame(0)
wait = WebDriverWait(driver, wait_time)
first_result = wait.until(presence_of_element_located((By.ID, "sb_main")))
soup = BeautifulSoup(driver.page_source, 'html.parser')
print(soup)
driver.quit()
切换到包含时隙的iframe之后,我可以通过打印得到 soup
<script id="time_slots_view" type="text/html"><div class="slots-view{{#ifCond (getThemeOption 'timeline_modern_display') '==' 'as_table'}} as-table{{/ifCond}}">
<div class="timeline-wrapper">
<div class="tab-pd">
<div class="container-caption">
{{_t 'available_services_on_this_day'}}
</div>
{{#if error_message}}
<div class="alert alert-danger alert-dismissible" role="alert">
{{error_message}}
</div>
{{/if}}
{{>emptyTimePart is_empty=is_empty is_loaded=is_loaded}}
<div id="sb_time_slots_container"></div>
{{> bookingTimeLegendPart legend="only_available" time_diff=0}}
</div>
</div>
</div></script>
<script id="time_slot_view" type="text/html"><div class="slot">
<a class="sb-cell free {{#ifPluginActive 'slots_count'}}{{#if available_slots}}has-available-slot{{/if}}{{/ifPluginActive}}" href="#{{bookingStepUrl time=time date=date}}">
{{formatDateTime datetime 'time' time_diff}}
{{#ifCond (getThemeOption 'timeline_show_end_time') '==' 1}}
-<span class="end-time">
{{formatDateTime end_datetime 'time' time_diff}}
</span>
{{/ifCond}}
{{#ifPluginActive 'slots_count'}}
{{#if available_slots}}
<span class="slot--available-slot">
{{available_slots}}
{{#ifConfigParam 'slots_count_show_total' '==' true}} / {{total_slots}} {{/ifConfigParam}}
</span>
{{/if}}
{{/ifPluginActive}}
</a>
</div></script>
而从右键单击>检查网页中的元素,我得到了
<div class="slots-view">
<div class="timeline-wrapper">
<div class="tab-pd">
<div class="container-caption">
Orari d'inizio disponibili
</div>
<div id="sb_time_slots_container">
<div class="slot">
<a class="sb-cell free " href="#book/location/4/service/6/count/1/provider/6/date/2020-03-09/time/23:00:00/">
23:00
</a>
</div>
</div>
<div class="time-legend">
<div class="available">
<div class="circle">
</div>
- Disponibile
</div>
</div>
</div>
</div>
</div>
如何使用硒获取可用插槽的小时数(在此示例中为23:00)?
要获得所需的响应,您需要:
iframe
要切换到的位置(并切换到该位置)。您试图切换到frame[0]
但需要frame[1]
。以下代码消除了对索引的依赖,xpath
而是使用了。xpath
来标识div
元素的所有子元素id=sb_time_slots_container
。div
的和得到的文本属性,嵌套内的<a>
这些中div
的。对于步骤1和2,您还应该使用wait.until
以便可以加载内容。
...
driver.get(url)
wait = WebDriverWait(driver, wait_time)
# Wait until the iframe exists then switch to it
iframe_element = wait.until(presence_of_element_located((By.XPATH, '//*[@id="prenota"]//iframe')))
driver.switch_to.frame(iframe_element)
# Wait until the times exist then get an array of them
wait.until(presence_of_element_located((By.XPATH, '//*[@id="sb_time_slots_container"]/div')))
all_time_elems = driver.find_elements_by_xpath('//*[@id="sb_time_slots_container"]/div')
# Iterate over each element and print the time out
for elem in all_time_elems:
print(elem.find_element_by_tag_name("a").text)
driver.quit()
本文收集自互联网,转载请注明来源。
如有侵权,请联系[email protected] 删除。
我来说两句