I'm trying to do web scraping with Python 3.10 and the requests-HTML 0.10.0 library.
Here is the code:
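from requests_html import HTMLSession

url = 'https://bodysolid-europe.com/collections/all'

s = HTMLSession()
r = s.get(url)
# Render the JavaScript-driven page; wait 1 second after the initial load
r.html.render(sleep=1)
# Select the container that holds the product listing
products = r.html.xpath('/html/body/div[2]/div[2]/div', first=True)
for item in products.absolute_links:
    # Fetch each product page and print its header text
    r = s.get(item)
    print(r.html.find('header.product-header', first=True).text)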
When I try to extract information from the URL using XPath, the console shows the following output:
[D:urllib3.connectionpool] Starting new HTTPS connection (%d): %s:%s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:asyncio] Using proactor: %s
[D:websockets.client] = connection is CONNECTING
[D:websockets.client] > GET %s HTTP/1.1
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] > %s: %s
[D:websockets.client] < HTTP/1.1 %d %s
It doesn't show all the information for the items, only part of it, like this:
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Body-Solid Europe
Best Fitness Dumbbell Rack BFDR10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Bench BFFID10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Mountain Climber BFMC10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Body-Solid Europe
Best Fitness Multi-Station Gym BFMG30
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Center Drive Elliptical BFE1
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Olympic Bench BFOB10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Functional Trainer BFFT10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Leg Developer and Preacher Curl Attachment BFPL10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Best Fitness
Best Fitness Inversion Table BFINVER10
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
[D:urllib3.connectionpool] %s://%s:%s "%s %s %s" %s %s
Body-Solid Europe
Most of the output consists only of lines like these:
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
[D:websockets.client] < %s
I don't know what the problem is. I already installed pyppeteer==1.0.0, because previously I was getting this error:
NoSuchKey. The specified key does not exist. No such object: chromium-browser-snapshots/Win_x64/1181205/chrome-win.zip
but now the output is filled with lines like:
[D:websockets.client] < %s
[D:websockets.client] < %s
I need to fix this output problem so that I can get the information from the URL by web scraping.
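For reference, the "[D:...]" lines look like DEBUG-level records from Python's standard logging module, with the logger name after the colon. Assuming that is the case, one possible way to hide them is to raise the level of those loggers before running the scraper. This is only a sketch: the helper name quiet_loggers is made up for illustration, "pyppeteer" is added to the list as a guess, and it does not explain why the message placeholders (%s) are left unformatted or why some items are missing.

```python
import logging

# Logger names taken from the "[D:<name>]" prefixes in the output above.
# "pyppeteer" is an assumption: it does not appear in the output shown.
NOISY_LOGGERS = ("urllib3", "websockets", "asyncio", "pyppeteer")


def quiet_loggers(names=NOISY_LOGGERS, level=logging.WARNING):
    """Raise the threshold of each named logger so DEBUG/INFO records are dropped.

    Child loggers such as "urllib3.connectionpool" inherit this level
    unless something else has set a level on them explicitly.
    """
    for name in names:
        logging.getLogger(name).setLevel(level)


# Call this once, before creating the HTMLSession.
quiet_loggers()
```

If the lines still appear after this, something else in the environment is probably configuring logging (for example a framework that installs its own root handler), and that configuration would be the place to look.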
Python Web scraping [D:websockets.client] > GET %s HTTP/1.1 [D:websockets.client] > %s: %s doesn't show all the results
Programing Coderfunda
September 06, 2024