I'm new to web scraping. I'm using Beautiful Soup to scrape the Google Play Store, but I'm stuck on retrieving text from a div tag. The div tag looks like this:
a = `<div class="LVQB0b"><div class="QoPmEb"></div><div><span class="X43Kjb">Education.com</span><span class="p2TkOb">August 15, 2019</span></div>Thanks for your feedback. We are sorry to hear you're having trouble with the app. This is a known issue and our team has fixed it. Please restart the app and let us know at [email protected] if you have any further trouble. Thanks!</div>`
I want to retrieve only the text starting from "Thanks for your feedback". I'm using the following code to retrieve the text:
response = a.find('div',{'class':'LVQB0b'}).get_text()
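However, this also returns unwanted text, such as "Education.com" and the date. I'm not sure how to retrieve text from a div tag that has no class name, as in the example above. I'd appreciate your guidance.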
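Use find(text=True, recursive=False). text=True matches text nodes rather than tags, and recursive=False restricts the search to direct children of the div, so the text inside the nested span tags is skipped. (Recent Beautiful Soup releases prefer the name string for this argument; text is the older spelling.)
For example: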
from bs4 import BeautifulSoup
s = '''<div class="LVQB0b"><div class="QoPmEb"></div><div><span class="X43Kjb">Education.com</span><span class="p2TkOb">August 15, 2019</span></div>Thanks for your feedback. We are sorry to hear you're having trouble with the app. This is a known issue and our team has fixed it. Please restart the app and let us know at [email protected] if you have any further trouble. Thanks!</div>'''
html = BeautifulSoup(s, 'html.parser')
print(html.find('div',{'class':'LVQB0b'}).find(text=True, recursive=False))
Output:
Thanks for your feedback. We are sorry to hear you're having trouble with the app. This is a known issue and our team has fixed it. Please restart the app and let us know at [email protected] if you have any further trouble. Thanks!
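Note that find() returns only the first matching text node. If the reply is split into several direct text nodes (for instance by a <br> tag), a variation like the following may help. It is only a sketch: the helper name direct_text and the shortened sample HTML are made up for illustration and do not come from the original question.

```python
from bs4 import BeautifulSoup, NavigableString


def direct_text(tag):
    """Join the text nodes that are direct children of `tag` (hypothetical helper)."""
    pieces = []
    for child in tag.children:
        # Keep bare strings only; nested tags such as <div> or <span> are skipped.
        if isinstance(child, NavigableString):
            stripped = child.strip()
            if stripped:
                pieces.append(stripped)
    return " ".join(pieces)


# Shortened, made-up sample in the same shape as the markup in the question.
sample = (
    '<div class="LVQB0b"><div><span>Education.com</span>'
    '<span>August 15, 2019</span></div>'
    'Thanks for your feedback.<br>Please restart the app.</div>'
)
soup = BeautifulSoup(sample, "html.parser")
reply = soup.find("div", {"class": "LVQB0b"})
print(direct_text(reply))
```

With this sample, the developer name and date are left out and the two text fragments on either side of the <br> are joined with a space.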