The internet suggested that some sites have anti-scraping protections and may throttle or block requests, as it uses HTTP1.0. On this API httpx does seem to be much faster, but nowhere near as fast as curl or the browser.
Some examples - this Python snippet takes ca. 4 seconds:
import httpx
client = httpx.Client(http2=True)
response = client.get('
https://v6.db.transport.rest/stations?query=berlin')
https://v6.db.transport.rest/stations?query=berlin')
/> print(response.text)
This takes up to a minute:
import requests
response = requests.get('
https://v6.db.transport.rest/stations?query=berlin')
print(response.text)
This returns almost immediately:
import subprocess
command = 'curl \'
https://v6.db.transport.rest/stations?query=berlin\' -s'
result = subprocess.run(command, capture_output=True, shell=True, text=True)
print(result.stdout)
print(result.stderr)
What's the magic here? Thanks in advance for any hints.
0 comments:
Post a Comment
Thanks