Need to find the requests equivalent of urlopen() from urllib2

Tom

I am currently trying to modify a script to use the requests library instead of urllib2. I haven't really used requests before, and I am looking for the equivalent of urlopen("http://www.example.org").read(), so I tried requests.get("http://www.example.org").text.

This works fine with ordinary HTML, but when I fetch from this URL (https://gtfsrt.api.translink.com.au/Feed/SEQ) it doesn't seem to work.

So I wrote the code below to save the responses from the same URL using both the requests and urllib2 libraries.

import urllib2
import requests

#urllib2 request
request = urllib2.Request("https://gtfsrt.api.translink.com.au/Feed/SEQ")
result = urllib2.urlopen(request)

#requests request
result2 = requests.get("https://gtfsrt.api.translink.com.au/Feed/SEQ")
print result2.encoding

#urllib2 write to text
open("Output.txt", 'w').close()
text_file = open("Output.txt", "w")
text_file.write(result.read())
text_file.close()

open("Output2.txt", 'w').close()
text_file = open("Output2.txt", "w")
text_file.write(result2.text)
text_file.close()

urlopen().read() works fine, but requests.get().text doesn't work for this URL. I suspect it has something to do with encoding, but I don't know what. Any thoughts?

Note: The supplied URL is a feed in the Google protocol buffer format. Once I receive the message, I pass the feed to a Google library that interprets it.

Lukas Graf

Your issue is that you're making the requests module interpret binary content in a response as text.

A response from the requests library has two main ways to access the body of the response: Response.content and Response.text.

Since protocol buffers are a binary format, you should use result2.content in your code instead of result2.text.
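
As a minimal sketch of that change, the helper below writes the raw bytes to a file opened in binary mode. The names save_body and fetch_feed are made up for illustration and are not part of requests; the code is written so that it should run on Python 3 as well as Python 2.

```python
def save_body(response, path):
    """Write the raw response body to `path` without decoding it."""
    # "wb" opens the file in binary mode, so the bytes are stored unchanged.
    with open(path, "wb") as out_file:
        out_file.write(response.content)


def fetch_feed(url, path):
    """Download `url` with requests and save the raw bytes to `path`."""
    # Imported here so that save_body() can be used without requests installed.
    import requests

    response = requests.get(url)
    response.raise_for_status()  # raise an exception on HTTP error statuses
    save_body(response, path)
```

Calling fetch_feed("https://gtfsrt.api.translink.com.au/Feed/SEQ", "Output2.bin") should then give a file with the same bytes that urlopen(...).read() returns, which is what a protocol buffer parser expects.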


Response.content returns the body of the response as-is, in bytes. For binary content this is exactly what you want. For text content that contains non-ASCII characters, the server must have encoded the content into a bytestring using a particular encoding, which is indicated by either an HTTP header or a <meta charset="..." /> tag. To make sense of those bytes, they need to be decoded after receiving, using that charset.

Response.text is a convenience property that does exactly this for you. It assumes the response body is text, looks at the response headers to find the encoding, and decodes the body for you, returning unicode.

But if your response doesn't contain text, this is the wrong attribute to use. Binary content doesn't contain characters, because it's not text, so the whole concept of character encoding does not make any sense for binary content; it's only applicable to text composed of characters. (That's also why you're seeing response.encoding is None: it's just bytes, and there is no character encoding involved.)
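
To illustrate the decoding step without touching the network, here is a rough sketch using plain bytes as stand-ins. The function decode_body and the sample values are assumptions for illustration; requests' own implementation differs in detail (for example, it guesses an encoding when the headers declare none).

```python
# -*- coding: utf-8 -*-

def decode_body(content, encoding):
    """Decode raw body bytes into text using the declared encoding."""
    # Bytes that are invalid in this encoding become U+FFFD instead of raising.
    return content.decode(encoding, "replace")


# Stand-ins for Response.content and the charset taken from the headers.
raw_body = u"Z\u00fcrich".encode("utf-8")
declared_encoding = "utf-8"

text_body = decode_body(raw_body, declared_encoding)
print(len(raw_body))   # 7 bytes: the u-umlaut takes two bytes in UTF-8
print(len(text_body))  # 6 characters
```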
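
The damage this does to binary data can be shown with a small round trip. This is only a sketch: the byte string below is invented and is not a real feed message, and survives_text_round_trip is a hypothetical helper name.

```python
def survives_text_round_trip(raw, encoding="utf-8"):
    """Return True if decoding `raw` as text and re-encoding yields the same bytes."""
    # Invalid byte sequences are replaced with U+FFFD during decoding.
    decoded = raw.decode(encoding, "replace")
    return decoded.encode(encoding) == raw


text_bytes = b"plain ASCII text"
binary_bytes = b"\x0a\x03\x80\xff\x01"  # made-up bytes, not a real feed message

print(survives_text_round_trip(text_bytes))    # True
print(survives_text_round_trip(binary_bytes))  # False
```

The bytes 0x80 and 0xff are not valid UTF-8 on their own, so they are replaced during decoding and the original data cannot be recovered from the decoded text.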

See Response Content and Binary Response Content in the requests documentation for more details.
