MaxTriesExceededException: Cannot Fetch from Google Scholar, even though print(success) prints True #509

Open
XianZhi1022 opened this issue Jul 29, 2023 · 0 comments
Describe the bug
scholarly.search_pubs raises "MaxTriesExceededException: Cannot Fetch from Google Scholar." even though the proxy setup succeeds and print(success) prints True.

To Reproduce

from scholarly import scholarly, ProxyGenerator

test_t = "Scholarly editions in print and on the screen: A theoretical comparison"

# Route all requests through the local Clash proxy
pg = ProxyGenerator()
success = pg.SingleProxy(http="http://127.0.0.1:7458", https="http://127.0.0.1:7458")
print(success)  # prints True
scholarly.use_proxy(pg, pg)

# The exception is raised by search_pubs, before next() is reached
search_query = scholarly.search_pubs(test_t)
article = next(search_query)
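
To separate a proxy problem from a scholarly problem, it may help to request Google Scholar through the same proxy without scholarly involved. The following is a minimal sketch using only the standard library. The helper name `fetch_status_via_proxy` and the User-Agent string are illustrative assumptions and are not part of scholarly.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status_via_proxy(url, proxy_url, timeout=10.0):
    """Return the HTTP status code seen through the proxy, or None if the connection fails.

    Hypothetical helper for diagnosis only; it does not use scholarly.
    """
    # Send both http and https traffic through the given proxy.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 403 or 429).
        return err.code
    except (OSError, http.client.HTTPException):
        # Proxy unreachable, connection refused, timeout, or a malformed reply.
        return None
```

Calling `fetch_status_via_proxy("https://scholar.google.com", "http://127.0.0.1:7458")` with the proxy address from this report should give a status code if the proxy can reach Google Scholar at all. A result of None would point at the proxy itself, while an error status such as 429 would suggest that Google Scholar is rejecting requests coming from that exit IP.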

The error output is below:

MaxTriesExceededException                 Traceback (most recent call last)
Input In [2], in <cell line: 1>()
----> 1 search_query = scholarly.search_pubs(test_t)
      2 article = next(search_query)
      4 print(article.citedby) # citation count


File G:\Anaconda\lib\site-packages\scholarly\_scholarly.py:160, in _Scholarly.search_pubs(self, query, patents, citations, year_low, year_high, sort_by, include_last_year, start_index)
     97 """Searches by query and returns a generator of Publication objects
     98 
     99 :param query: terms to be searched
   (...)
    155 
    156 """
    157 url = self._construct_url(_PUBSEARCH.format(requests.utils.quote(query)), patents=patents,
    158                           citations=citations, year_low=year_low, year_high=year_high,
    159                           sort_by=sort_by, include_last_year=include_last_year, start_index=start_index)
--> 160 return self.__nav.search_publications(url)

File G:\Anaconda\lib\site-packages\scholarly\_navigator.py:296, in Navigator.search_publications(self, url)
    288 def search_publications(self, url: str) -> _SearchScholarIterator:
    289     """Returns a Publication Generator given a url
    290 
    291     :param url: the url where publications can be found.
   (...)
    294     :rtype: {_SearchScholarIterator}
    295     """
--> 296     return _SearchScholarIterator(self, url)

File G:\Anaconda\lib\site-packages\scholarly\publication_parser.py:53, in _SearchScholarIterator.__init__(self, nav, url)
     51 self._pubtype = PublicationSource.PUBLICATION_SEARCH_SNIPPET if "/scholar?" in url else PublicationSource.JOURNAL_CITATION_LIST
     52 self._nav = nav
---> 53 self._load_url(url)
     54 self.total_results = self._get_total_results()
     55 self.pub_parser = PublicationParser(self._nav)

File G:\Anaconda\lib\site-packages\scholarly\publication_parser.py:59, in _SearchScholarIterator._load_url(self, url)
     57 def _load_url(self, url: str):
     58     # this is temporary until setup json file
---> 59     self._soup = self._nav._get_soup(url)
     60     self._pos = 0
     61     self._rows = self._soup.find_all('div', class_='gs_r gs_or gs_scl') + self._soup.find_all('div', class_='gsc_mpat_ttl')

File G:\Anaconda\lib\site-packages\scholarly\_navigator.py:239, in Navigator._get_soup(self, url)
    237 def _get_soup(self, url: str) -> BeautifulSoup:
    238     """Return the BeautifulSoup for a page on scholar.google.com"""
--> 239     html = self._get_page('https://scholar.google.com{0}'.format(url))
    240     html = html.replace(u'\xa0', u' ')
    241     res = BeautifulSoup(html, 'html.parser')

File G:\Anaconda\lib\site-packages\scholarly\_navigator.py:190, in Navigator._get_page(self, pagerequest, premium)
    188     return self._get_page(pagerequest, True)
    189 else:
--> 190     raise MaxTriesExceededException("Cannot Fetch from Google Scholar.")

MaxTriesExceededException: Cannot Fetch from Google Scholar
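
The traceback shows that scholarly gives up inside `_get_page` after its own retries. If the block is temporary, waiting and retrying the whole search from the caller's side may be worth trying. The following is a minimal sketch of a generic retry wrapper with exponential backoff. The name `call_with_retries` and its parameters are hypothetical, and nothing here changes how scholarly itself retries.

```python
import time


def call_with_retries(func, retriable, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exception type(s) with exponential backoff.

    Hypothetical helper: `retriable` is an exception class or a tuple of classes.
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retriable:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

With the code from this report, usage could look like `call_with_retries(lambda: next(scholarly.search_pubs(test_t)), MaxTriesExceededException, base_delay=30.0)`, assuming `MaxTriesExceededException` can be imported from the `scholarly` package. If every attempt fails, the proxy's exit IP is probably blocked and retrying will not help.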

Expected behavior
I want to solve this problem so that the search returns results instead of raising MaxTriesExceededException.


Desktop (please complete the following information):

  • Proxy service: Clash for Windows
  • Python version: 3.9.12
  • OS: Windows 10
  • Version [e.g. 1.5]
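
The version field above is still the template placeholder. A minimal sketch for collecting these details, assuming the library is installed under the distribution name `scholarly`; the helper names `installed_version` and `environment_report` are illustrative only.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def environment_report(packages=("scholarly",)):
    """Build a short plain-text summary of the Python, OS, and package versions."""
    lines = [f"Python: {platform.python_version()}", f"OS: {platform.platform()}"]
    for name in packages:
        lines.append(f"{name}: {installed_version(name) or 'not installed'}")
    return "\n".join(lines)
```

Running `print(environment_report())` in the same environment as the failing code gives text that can be pasted into the list above.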

Do you plan on contributing?
Your response below will clarify whether the maintainers can expect you to fix the bug you reported.

  • Yes, I will create a Pull Request with the bugfix.