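I'm trying to build a quick scraper for a popular car website. I can get the result for one car, but I can't figure out how to return all the cars on the page. findAll() throws an error. Any help would be appreciated.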

from bs4 import BeautifulSoup
import requests

#search = input('Enter car to search: ')
url = 'https://www.donedeal.ie/cars?words=bmw' #+ search
site = requests.get(url)
page = site.content
soup = BeautifulSoup(page, 'html.parser')
print("URL: ", site.url)

if site.status_code == 200:
    print("HTTP Status: ", site.status_code, "\n")
else:
    print("Bad HTTP response", "\n")

cars = soup.find('div', attrs={'class': 'top-info'})
county = soup.find('span', attrs={'class': 'county-disp icon-pin'})

span = cars.find('span')

for result in span:
    for result2 in county:
        print(result, "-", result2)

I'm not sure which information you want to extract. Assuming you need the car type and the county, findAll() can be used like this:
>>> cars = soup.findAll('div', attrs={'class': 'top-info'})
>>> for car in cars:
...     loc = car.find('span', attrs={'class': 'county-disp icon-pin'})
...     if loc:
...         print('type:', car.text, 'location:', loc.text)
...     else:
...         print('type:', car.text)
...
type: Bmw 320 CdTipperary location: Tipperary
type: Bmw 520d MsportDonegal location: Donegal
type: BMW2004
type: BMW2010
type: Bmw2010
type: Bmw2000
type: Bmw2001
type: Bmw2004
type: Bmw2004
type: bmw2003
type: BMW2009
type: Bmw2010
type: Bmw1990
type: BMW2004
type: BMW2012
type: Bmw2000
type: bmw2001
type: BMW2004
type: BMW2008
type: BMW2005
type: Bmw2006
type: Bmw2002
type: BMW2004
type: Bmw2000
type: BMW2003
type: BMW2011
type: BMW2001
type: Bmw2000
type: Bmw2002
type: BMW2007
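
In the output above the title and county run together (for example "Bmw 320 CdTipperary") because car.text concatenates the text of every nested tag. Here is a rough sketch of pulling the two fields apart. The sample markup and the extract_cars helper are hypothetical and only stand in for the real page, whose structure may differ; it also assumes the first span in each block holds the title.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<div class="top-info"><span>Bmw 320 Cd</span><span class="county-disp icon-pin">Tipperary</span></div>
<div class="top-info"><span>BMW</span><span>2004</span></div>
"""


def extract_cars(html):
    """Return a list of (title, county) tuples; county is None when absent."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for car in soup.find_all('div', class_='top-info'):
        # Assumption: the first <span> in the block is the title.
        title = car.find('span')
        # Matching on a single class is enough to find the county span.
        loc = car.find('span', class_='county-disp')
        results.append((
            title.get_text(strip=True) if title else None,
            loc.get_text(strip=True) if loc else None,
        ))
    return results


for title, county in extract_cars(SAMPLE_HTML):
    print('type:', title, 'location:', county)
```

Returning tuples instead of printing inside the loop makes it easier to reuse the same function on each page you fetch.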
Note that this covers only one page. You will have to request the URLs of the other pages yourself.
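
One possible way to generate the URLs for further pages is sketched below. The offset parameter name ("start") and the page size of 30 are assumptions, not something confirmed for this site; check the "next page" link in your browser to find the real parameter and step. The page_urls helper is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.donedeal.ie/cars'


def page_urls(words, pages, page_size=30, offset_param='start'):
    """Yield one search URL per results page.

    offset_param and page_size are hypothetical defaults; replace them
    with whatever the site's own pagination links actually use.
    """
    for page in range(pages):
        query = {'words': words}
        if page > 0:
            # Later pages are addressed by a result offset (assumed).
            query[offset_param] = page * page_size
        yield BASE_URL + '?' + urlencode(query)


for url in page_urls('bmw', 3):
    print(url)
```

Each URL can then be passed to requests.get() and parsed with the same findAll() loop as above.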