Python3 Parse more than 30 videos at a time from youtube

Ellis

I recently decided to get into parsing with python, i made up a project where i need to get data from all of a youtubers videos. I decided it would be easy to just go to the video tab in their channel and parse it for all if its links. However when i do parse it i can only get 30 videos at a time. I was wondering why this is because the link never seems to change when you load more. As well as if there was a way around it. Here is my code

import bs4 as bs
import requests

page = requests.get("/run/media/morpheous/PORTEUS/Workspace/Python/Parsing/parse.py")
soup = bs.BeautifulSoup(page.text, 'html.parser')
soup.find_all("a", "watch-view-count")
k = soup.find_all("div", "yt-uix-sessionlink yt-uix-tile-link  spf-link  yt-ui-ellipsis yt-ui-ellipsis-2")
storage = open('data.csv', 'a')
storage.write(k.get('href')
storage.close()

Any help is appreciated, thanks

Taylan Aydinli

I should first say that I agree with @jonrsharpe. Using the YouTube API is the more sensible choice.

However, if you must do this by scraping, here's a suggestion.

Let's take MKBHD's videos page as an example. The Load more button at the bottom of the page has a button tag with this attribute (You can use your browser's 'inspect element' feature to see this value):

data-uix-load-more-href="/browse_ajax?action_continuation=1&continuation=4qmFsgJAEhhVQ0JKeWNzbWR1dllFTDgzUl9VNEpyaVEaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk1yZ0JBQSUzRCUzRA%253D%253D"

When you click the Load more button, it makes an AJAX request to this /browse_ajax url. The response is a JSON object that looks like this:

{
    content_html: "the html for the videos",
    load_more_widget_html: "      \n\n\n\n    \u003cbutton class=\"yt-uix-button yt-uix-button-size-default yt-uix-button-default load-more-button yt-uix-load-more browse-items-load-more-button\" type=\"button\" onclick=\";return false;\" aria-label=\"Load more\n\" data-uix-load-more-href=\"\/browse_ajax?action_continuation=1\u0026amp;continuation=4qmFsgJAEhhVQ0JKeWNzbWR1dllFTDgzUl9VNEpyaVEaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk03Z0JBQSUzRCUzRA%253D%253D\" data-uix-load-more-target-id=\"channels-browse-content-grid\"\u003e\u003cspan class=\"yt-uix-button-content\"\u003e  \u003cspan class=\"load-more-loading hid\"\u003e\n      \u003cspan class=\"yt-spinner\"\u003e\n      \u003cspan class=\"yt-spinner-img  yt-sprite\" title=\"Loading icon\"\u003e\u003c\/span\u003e\n\nLoading...\n  \u003c\/span\u003e\n\n  \u003c\/span\u003e\n  \u003cspan class=\"load-more-text\"\u003e\n    Load more\n\n  \u003c\/span\u003e\n\u003c\/span\u003e\u003c\/button\u003e\n\n\n"
}

The content_html contains the html for the new page of videos. You can parse that to get the videos in that page. To get to the next page, you need to use the load_more_widget_html value and extract the url which again looks like:

data-uix-load-more-href="/browse_ajax?action_continuation=1&continuation=4qmFsgJAEhhVQ0JKeWNzbWR1dllFTDgzUl9VNEpyaVEaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk1yZ0JBQSUzRCUzRA%253D%253D"

The only thing in that url that changes is the value of the continuation parameter. You can keep making requests to this 'continuation' url, until the returning JSON object does not have the load_more_widget_html.

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

Python3 Parse more than 30 videos at a time from youtube

From Dev

How can I get a list of youtube videos that have more than an X amount of likes from a specific category

From Dev

How can I get a list of youtube videos that have more than an X amount of likes from a specific category

From Dev

youtube api v3 fetch all videos from a channel that are newer than a video

From Dev

Wrong time calculated having more than 30min

From Dev

download videos from youtube

From Dev

Why is this script taking more time in python3 than python2?

From Dev

Retrieve videos from youtube playlist

From Dev

Youtube api v3 - Get all of videos from the chanel

From Dev

Youtube API v3 - Related videos from specific channel

From Dev

Youtube API v3 - Related videos from specific channel

From Dev

Is there any way to merge more than 2 videos?

From Dev

Retrieve all videos from youtube playlist using youtube v3 API

From Dev

Get videos from youtube api from channel

From Dev

Get videos from youtube api from channel

From Dev

Curl from php takes more time than curl via putty

From Dev

Insert records into a table from more than one users at same time

From Dev

How to parse same argument twice or more than once to python script?

From Dev

python parse and get more input from line

From Dev

Get all youtube videos from a channel (some videos are missing)

From Dev

Check if more than 30 minutes has passed

From Dev

Show more than 30 attachments in document?

From Dev

Show more than 30 attachments in document?

From Dev

Number of player objects more than 30

From Dev

Application.run with more than 30 arguments

From Dev

Displaying Youtube Videos From Specific Channel

From Dev

Cannot pull videos from youtube uploads

From Dev

Prevent embedded YouTube videos from autoplaying?

From Dev

How to download videos from YouTube with subtitles?

Related Related

  1. 1

    Python3 Parse more than 30 videos at a time from youtube

  2. 2

    How can I get a list of youtube videos that have more than an X amount of likes from a specific category

  3. 3

    How can I get a list of youtube videos that have more than an X amount of likes from a specific category

  4. 4

    youtube api v3 fetch all videos from a channel that are newer than a video

  5. 5

    Wrong time calculated having more than 30min

  6. 6

    download videos from youtube

  7. 7

    Why is this script taking more time in python3 than python2?

  8. 8

    Retrieve videos from youtube playlist

  9. 9

    Youtube api v3 - Get all of videos from the chanel

  10. 10

    Youtube API v3 - Related videos from specific channel

  11. 11

    Youtube API v3 - Related videos from specific channel

  12. 12

    Is there any way to merge more than 2 videos?

  13. 13

    Retrieve all videos from youtube playlist using youtube v3 API

  14. 14

    Get videos from youtube api from channel

  15. 15

    Get videos from youtube api from channel

  16. 16

    Curl from php takes more time than curl via putty

  17. 17

    Insert records into a table from more than one users at same time

  18. 18

    How to parse same argument twice or more than once to python script?

  19. 19

    python parse and get more input from line

  20. 20

    Get all youtube videos from a channel (some videos are missing)

  21. 21

    Check if more than 30 minutes has passed

  22. 22

    Show more than 30 attachments in document?

  23. 23

    Show more than 30 attachments in document?

  24. 24

    Number of player objects more than 30

  25. 25

    Application.run with more than 30 arguments

  26. 26

    Displaying Youtube Videos From Specific Channel

  27. 27

    Cannot pull videos from youtube uploads

  28. 28

    Prevent embedded YouTube videos from autoplaying?

  29. 29

    How to download videos from YouTube with subtitles?

HotTag

Archive