
soup.find() returning 'None', resulting in AttributeError: 'NoneType' object has no attribute 'get_text' #20

Open
CoderStylus opened this issue Apr 30, 2024 · 2 comments

Comments

@CoderStylus

CoderStylus commented Apr 30, 2024

Traceback (most recent call last):
  File "/workspaces/pcpartpickertest/index.py", line 7, in <module>
    parts = pcpp.part_search("i7")
  File "/home/codespace/.python/current/lib/python3.10/site-packages/pypartpicker/scraper.py", line 232, in part_search
    soup = self.__make_soup(f"{search_link}&page={i + 1}")
  File "/home/codespace/.python/current/lib/python3.10/site-packages/pypartpicker/scraper.py", line 95, in __make_soup
    if "Verification" in soup.find(class_="pageTitle").get_text():
AttributeError: 'NoneType' object has no attribute 'get_text'

The error is returned by this example code:

from pypartpicker import Scraper

pcpp = Scraper()
parts = pcpp.part_search("i7")


for part in parts:
    print(part.name)

first_product_url = parts[0].url
product = pcpp.fetch_product(first_product_url)
print(product.specs)

The last frame of the traceback tells us that the error occurs at scraper.py, line 95, in __make_soup.

The key is in the statement:

if "Verification" in soup.find(class_="pageTitle").get_text():

The call to soup.find is expected to return a tag object on which get_text can be called to retrieve the element's text. But when soup.find does not match anything, it returns None instead, so there is no object to call get_text on. That is what produces the error message: AttributeError: 'NoneType' object has no attribute 'get_text'

This may be a problem with the scraper.py code or with BeautifulSoup itself.
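A defensive version of that check would avoid the crash by guarding against None before calling get_text(). The sketch below uses small stand-in classes (not BeautifulSoup itself, which is assumed here) to mimic find() returning None when nothing matches:

```python
class FakeTag:
    """Stand-in for a BeautifulSoup Tag."""
    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text

class FakeSoup:
    """Stand-in for a BeautifulSoup document."""
    def __init__(self, tag=None):
        self._tag = tag

    def find(self, class_=None):
        # Like BeautifulSoup, find() returns None when nothing matches
        return self._tag

def is_verification_page(soup):
    title = soup.find(class_="pageTitle")
    # Guard against None before calling get_text()
    return title is not None and "Verification" in title.get_text()

print(is_verification_page(FakeSoup(FakeTag("Verification Required"))))  # True
print(is_verification_page(FakeSoup(None)))                              # False
```

With this pattern, a missing pageTitle element simply yields False instead of raising AttributeError.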

@thefakequake
Owner

This issue is likely occurring due to a Cloudflare bot verification check when the library makes the request to pcpartpicker.
This can be solved by using custom HTTP headers, the same as the ones your browser uses, in order to bypass the check.

When creating the instance of the Scraper class, pass in a headers dictionary containing the same HTTP headers your browser sends.
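A minimal sketch of that suggestion, assuming the Scraper constructor accepts a headers keyword as described above; copy the actual values from your browser's network inspector:

```python
# Browser-like headers; replace the placeholder values with the ones
# your own browser sends (visible in the network tab of its dev tools).
browser_headers = {
    "User-Agent": "Mozilla/5.0 ...",        # your browser's UA string
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}

# Then, per the maintainer's note (keyword name assumed from the
# comment above; requires pypartpicker to be installed):
# from pypartpicker import Scraper
# pcpp = Scraper(headers=browser_headers)
# parts = pcpp.part_search("i7")
print(sorted(browser_headers))
```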

@thefakequake
Owner

I will think about adding a new CloudflareCheck error to the library to make it clearer when this happens, as the current error is confusing.
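A hypothetical sketch of what such an error could look like; the class name and where the check lives are assumptions, not current pypartpicker API:

```python
class CloudflareCheckError(Exception):
    """Raised when pcpartpicker serves a Cloudflare verification page."""

def ensure_not_cloudflare(page_title):
    # page_title is the text of the "pageTitle" element,
    # or None if the element was not found in the response.
    if page_title is not None and "Verification" in page_title:
        raise CloudflareCheckError(
            "Cloudflare verification page detected; "
            "try passing browser-like HTTP headers to Scraper."
        )

ensure_not_cloudflare("Product Search")  # a normal page passes silently
```

Raising a named exception like this would make the failure mode self-describing instead of surfacing as an AttributeError deep inside __make_soup.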
