I wrote a program to extract links from a string.
Please check the code below.
def next_target(page):
    if (page.find('<a href') == -1):
        return 0,0
    link = (page.find('<a href='))
    first_quote = (page.find('"', link))
    second_quote = (page.find('"', first_quote + 1))
    url = page[first_quote + 1 : second_quote]
    return url, second_quote

def get_all_links(page):
    while True:
        url, endpos = next_target(page)
        if (url):
            print (url)
            page = page[endpos :]
        else:
            break

print (get_all_links('<div class="no-padding locale-column col-md-3"><a href="https://ee.000webhost.com/" class=""><span class="flag flag-ee"></span><span class="region">Eesti</span> <span class="language">Eesti</span></a><a href="https://es.000webhost.com/" class=""><span class="flag flag-es"></span><span class="region">España</span> <span class="language">Español</span></a><a href="https://fi.000webhost.com/" class=""><span class="flag flag-fi"></span><span class="region">Suomi</span> <span class="language">Suomi</span></a><a href="https://fr.000webhost.com/" class=""><span class="flag flag-fr"></span><span class="region">France</span> <span class="language">Français</span></a><a href="https://gr.000webhost.com/" class=""><span class="flag flag-gr"></span><span class="region">Ελλάδα</span> <span class="language">Ελληνικά</span></a><a href="https://hr.000webhost.com/" class=""><span class="flag flag-hr"></span><span class="region">Hrvatska</span> <span class="language">Hrvatski</span></a><a href="https://hu.000webhost.com/" class=""><span class="flag flag-hu"></span><span class="region">Magyarország</span> <span class="language">Magyar</span></a><a href="https://in.000webhost.com/" class=""><span class="flag flag-en-in"></span><span class="region">India</span> <span class="language">English</span></a><a href="https://th.000webhost.com/" class=""><span class="flag flag-th"></span><span class="region">ประเทศไทย</span> <span class="language">ไทย</span></a></div>'))
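The program extracts all the links. But along with the links, a None value is also printed.
How can I stop None from appearing after the links are extracted?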
I wrote another program to do exactly the same thing. However, it also prints a None value.
Please check the following code:
def page_pro(page):
    end_quote = 0
    if (page.find('<a href') == -1):
        return None
    else:
        while (page.find('<a href', end_quote) != -1):
            start_link = (page.find('<a href=', end_quote))
            first_quote = (page.find('"', start_link))
            end_quote = (page.find('"', first_quote + 1))
            url = page[first_quote + 1 : end_quote]
            print (url)

print (page_pro('<div class="no-padding locale-column col-md-3"><a href="https://ee.000webhost.com/" class=""><span class="flag flag-ee"></span><span class="region">Eesti</span> <span class="language">Eesti</span></a><a href="https://es.000webhost.com/" class=""><span class="flag flag-es"></span><span class="region">España</span> <span class="language">Español</span></a><a href="https://fi.000webhost.com/" class=""><span class="flag flag-fi"></span><span class="region">Suomi</span> <span class="language">Suomi</span></a><a href="https://fr.000webhost.com/" class=""><span class="flag flag-fr"></span><span class="region">France</span> <span class="language">Français</span></a><a href="https://gr.000webhost.com/" class=""><span class="flag flag-gr"></span><span class="region">Ελλάδα</span> <span class="language">Ελληνικά</span></a><a href="https://hr.000webhost.com/" class=""><span class="flag flag-hr"></span><span class="region">Hrvatska</span> <span class="language">Hrvatski</span></a><a href="https://hu.000webhost.com/" class=""><span class="flag flag-hu"></span><span class="region">Magyarország</span> <span class="language">Magyar</span></a><a href="https://in.000webhost.com/" class=""><span class="flag flag-en-in"></span><span class="region">India</span> <span class="language">English</span></a><a href="https://th.000webhost.com/" class=""><span class="flag flag-th"></span><span class="region">ประเทศไทย</span> <span class="language">ไทย</span></a></div>'))
Here:
print (get_all_links('<div class="no-padding locale-column col-md-3"><a href="https://ee.000webhost.com/" class=""><span class="flag flag-ee"></span><span class="region">Eesti</span> <span class="language">Eesti</span></a><a href="https://es.000webhost.com/" class=""><span class="flag flag-es"></span><span class="region">España</span> <span class="language">Español</span></a><a href="https://fi.000webhost.com/" class=""><span class="flag flag-fi"></span><span class="region">Suomi</span> <span class="language">Suomi</span></a><a href="https://fr.000webhost.com/" class=""><span class="flag flag-fr"></span><span class="region">France</span> <span class="language">Français</span></a><a href="https://gr.000webhost.com/" class=""><span class="flag flag-gr"></span><span class="region">Ελλάδα</span> <span class="language">Ελληνικά</span></a><a href="https://hr.000webhost.com/" class=""><span class="flag flag-hr"></span><span class="region">Hrvatska</span> <span class="language">Hrvatski</span></a><a href="https://hu.000webhost.com/" class=""><span class="flag flag-hu"></span><span class="region">Magyarország</span> <span class="language">Magyar</span></a><a href="https://in.000webhost.com/" class=""><span class="flag flag-en-in"></span><span class="region">India</span> <span class="language">English</span></a><a href="https://th.000webhost.com/" class=""><span class="flag flag-th"></span><span class="region">ประเทศไทย</span> <span class="language">ไทย</span></a></div>'))
or here:
print (page_pro('<div class="no-padding locale-column col-md-3"><a href="https://ee.000webhost.com/" class=""><span class="flag flag-ee"></span><span class="region">Eesti</span> <span class="language">Eesti</span></a><a href="https://es.000webhost.com/" class=""><span class="flag flag-es"></span><span class="region">España</span> <span class="language">Español</span></a><a href="https://fi.000webhost.com/" class=""><span class="flag flag-fi"></span><span class="region">Suomi</span> <span class="language">Suomi</span></a><a href="https://fr.000webhost.com/" class=""><span class="flag flag-fr"></span><span class="region">France</span> <span class="language">Français</span></a><a href="https://gr.000webhost.com/" class=""><span class="flag flag-gr"></span><span class="region">Ελλάδα</span> <span class="language">Ελληνικά</span></a><a href="https://hr.000webhost.com/" class=""><span class="flag flag-hr"></span><span class="region">Hrvatska</span> <span class="language">Hrvatski</span></a><a href="https://hu.000webhost.com/" class=""><span class="flag flag-hu"></span><span class="region">Magyarország</span> <span class="language">Magyar</span></a><a href="https://in.000webhost.com/" class=""><span class="flag flag-en-in"></span><span class="region">India</span> <span class="language">English</span></a><a href="https://th.000webhost.com/" class=""><span class="flag flag-th"></span><span class="region">ประเทศไทย</span> <span class="language">ไทย</span></a></div>'))
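You are printing the return value of a function that does not return anything. A Python function that finishes without an explicit return value returns None, and that is what the outer print() displays after the links. Just omit the print() and call
get_all_links(…)
or
page_pro(…)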
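
To illustrate the point above, here is a minimal sketch, not the asker's exact code. It uses a shortened sample string, and the names print_all_links and collect_links are hypothetical. The first function only prints and therefore returns None; the second returns the URLs in a list, which is one possible alternative if you do want to print the result of the call.

```python
def next_target(page):
    # Same search logic as in the question: locate the next '<a href=' and
    # return the quoted URL together with the index of its closing quote.
    link = page.find('<a href=')
    if link == -1:
        return None, 0
    first_quote = page.find('"', link)
    second_quote = page.find('"', first_quote + 1)
    return page[first_quote + 1:second_quote], second_quote


def print_all_links(page):
    # Prints each URL. There is no return statement, so the call returns None.
    while True:
        url, endpos = next_target(page)
        if not url:
            break
        print(url)
        page = page[endpos:]


def collect_links(page):
    # Hypothetical variant: gather the URLs and return them to the caller.
    links = []
    while True:
        url, endpos = next_target(page)
        if not url:
            break
        links.append(url)
        page = page[endpos:]
    return links


# Shortened sample input (an assumption, not the full string from the question).
sample = ('<a href="https://ee.000webhost.com/">Eesti</a>'
          '<a href="https://es.000webhost.com/">España</a>')

result = print_all_links(sample)  # prints the two URLs
print(result is None)             # the function itself returned nothing
print(collect_links(sample))      # printing a returned list is meaningful
```

Calling print_all_links(sample) on its own, without wrapping it in print(), shows only the links. Wrapping it in print() adds a trailing None, which is the behaviour described in the question.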