Web scraping with bs4 in Python: how to display football fixtures

aabdall

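I'm a Python beginner, and I'm trying to write a program that scrapes the football/soccer schedule from skysports.com and sends it to my phone by SMS via Twilio. I've left out the SMS code because I already have that figured out, so here is the web scraping code I'm stuck on so far: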

import requests
from bs4 import BeautifulSoup

URL = "https://www.skysports.com/football-fixtures"
page = requests.get(URL)

results = BeautifulSoup(page.content, "html.parser")

d = defaultdict(list)

comp = results.find('h5', {"class": "fixres__header3"})
team1 = results.find('span', {"class": "matches__item-col matches__participant matches__participant--side1"})
date = results.find('span', {"class": "matches__date"})
team2 = results.find('span', {"class": "matches__item-col matches__participant matches__participant--side2"})

for ind in range(len(d)):
    d['comp'].append(comp[ind].text)
    d['team1'].append(team1[ind].text)
    d['date'].append(date[ind].text)
    d['team2'].append(team2[ind].text)

島

The following should do the trick for you:

    import requests
    from bs4 import BeautifulSoup

    response = requests.get('https://www.skysports.com/football-fixtures')
    soup = BeautifulSoup(response.text, features="html.parser")

    teams = []
    # One pass is enough: looping over every date heading as well would append the same teams repeatedly
    for i in soup.find_all(class_="swap-text--bp30")[1:]:  # skips the first one because that's a heading
        teams.append(i.text)

    date = soup.find(class_="fixres__header2").text  # first date heading on the page
    print(date)
    teams = [i.strip('\n') for i in teams]
    # Teams come in pairs; stop one short so teams[x + 1] always exists
    for x in range(0, len(teams) - 1, 2):
        print(teams[x] + " vs " + teams[x + 1])

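Let me explain what I did in more detail. Every football team name on the page has this class name: swap-text--bp30

So we can use find_all to extract every element with that class name.

Once we have the results, we put them into an array, "teams = []", by appending each one in a for loop with "teams.append(i.text)". ".text" strips the HTML and leaves only the text.

Then we strip the "\n" characters from each entry in the array and print the strings two at a time. That gives the final output.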
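
The steps above can be tried without hitting the network. This is a minimal sketch, assuming a simplified HTML fragment: the SAMPLE_HTML markup, the team names, and the extract_fixtures helper are made up for illustration, and only the class name swap-text--bp30 comes from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page is assumed to be more deeply nested.
SAMPLE_HTML = """
<span class="swap-text--bp30">Fixtures</span>
<span class="swap-text--bp30">
Team A
</span>
<span class="swap-text--bp30">
Team B
</span>
<span class="swap-text--bp30">
Team C
</span>
<span class="swap-text--bp30">
Team D
</span>
"""


def extract_fixtures(html):
    """Return 'home vs away' strings built from swap-text--bp30 elements."""
    soup = BeautifulSoup(html, "html.parser")
    # Skip the first match, which the answer treats as a heading.
    tags = soup.find_all(class_="swap-text--bp30")[1:]
    # .text drops the tags; strip("\n") drops the surrounding newlines.
    teams = [tag.text.strip("\n") for tag in tags]
    # Pair consecutive entries: (0, 1), (2, 3), ...
    return [f"{home} vs {away}" for home, away in zip(teams[0::2], teams[1::2])]


fixtures = extract_fixtures(SAMPLE_HTML)
for line in fixtures:
    print(line)
```

Pairing with zip instead of indexing means an odd leftover entry is dropped quietly rather than raising an IndexError.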

Edit: to scrape the league titles, we do almost the same thing:

league = []
# One pass over the league headings; nothing needs to be skipped here
for i in soup.find_all(class_="fixres__header3"):
    league.append(i.text)

Strip the newlines from the array and create another array:

league = [i.strip('\n') for i in league]
final = []

Then add this last piece of code, which basically just prints the league and then the two teams, over and over:

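# Teams come in pairs, so step through the list two at a time (as in the first snippet)
for x in range(0, len(teams) - 1, 2):
    final.append(teams[x] + " vs " + teams[x + 1])

# Note: this prints the full fixture list under every league heading
for name in league:
    print(name)
    for fixture in final:
        print(fixture)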
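
The nested loop above prints every fixture under every league heading. If each fixture should appear only under its own league, one option is to walk the headings and team names in document order. This is a minimal sketch, assuming that each league heading (fixres__header3) comes before its fixtures in the page source. The sample markup, league names, and the group_by_league helper are hypothetical, and the sample has no leading heading to skip.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each league heading is followed by its team names.
SAMPLE_HTML = """
<h5 class="fixres__header3">League X</h5>
<span class="swap-text--bp30">Team A</span>
<span class="swap-text--bp30">Team B</span>
<h5 class="fixres__header3">League Y</h5>
<span class="swap-text--bp30">Team C</span>
<span class="swap-text--bp30">Team D</span>
"""


def group_by_league(html):
    """Map each league heading to the 'home vs away' strings that follow it."""
    soup = BeautifulSoup(html, "html.parser")
    names_by_league = {}
    current = None
    # select() returns the matching elements in document order.
    for tag in soup.select(".fixres__header3, .swap-text--bp30"):
        text = tag.get_text(strip=True)
        if "fixres__header3" in tag.get("class", []):
            current = text
            names_by_league[current] = []
        elif current is not None:
            names_by_league[current].append(text)
    # Pair consecutive team names within each league.
    return {
        league: [f"{home} vs {away}" for home, away in zip(names[0::2], names[1::2])]
        for league, names in names_by_league.items()
    }


grouped = group_by_league(SAMPLE_HTML)
for league_name, matches in grouped.items():
    print(league_name)
    for match in matches:
        print("  " + match)
```

Whether the live page keeps headings and fixtures in this order is an assumption worth checking against the real HTML before relying on it.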


Edited 2021-06-14
