Scrape Yelp utilising Python, Selenium & Residential Proxies

Smartproxy/yelp-selenium-scraper

## Dependencies
- BeautifulSoup
- webdriver_manager
- selenium
- extension (the local module, extension.py)

## Authentication

You can create, edit, and delete proxy users on our Dashboard > Residential > Proxy setup page.

## Yelp Selenium Scraper

This Python script uses the selenium and BeautifulSoup libraries to scrape listings from Yelp.

The script drives the Chrome browser through selenium to navigate to the site and retrieve the page source. It then parses that source with BeautifulSoup to extract specific information about each listing on the page.
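As a rough illustration of the navigation step, the sketch below opens a page in Chrome and returns its source. It is not the repository's actual code: the helper names `build_search_url` and `fetch_page_source` are hypothetical, and the `find_desc` / `find_loc` query parameters are an assumption about how a Yelp search URL is formed.

```python
from urllib.parse import urlencode


def build_search_url(term, location):
    """Build a search URL (hypothetical helper; query parameter names are assumed)."""
    query = urlencode({"find_desc": term, "find_loc": location})
    return f"https://www.yelp.com/search?{query}"


def fetch_page_source(url):
    """Open `url` in Chrome and return the rendered page source."""
    # Imported lazily so build_search_url can be used without the browser stack.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = webdriver.ChromeOptions()
    # webdriver_manager downloads a matching ChromeDriver and returns its path.
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if navigation fails.
        driver.quit()
```

A call such as `fetch_page_source(build_search_url("pizza", "New York"))` would return the HTML that the parsing step works on.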

To route traffic through a proxy, the script uses "webdriver_manager" to install ChromeDriver and the local "extension" module (extension.py) to configure Chrome to use a proxy server. The proxy credentials (username, password, endpoint, and port) are passed as arguments to the "proxies" function in extension.py.
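The contents of extension.py are not reproduced in this README. One common way to give Chrome an authenticated proxy is to pack a small browser extension that sets the proxy and answers the authentication prompt. The sketch below illustrates that approach under stated assumptions: `build_proxy_extension` is a hypothetical name, not the repository's "proxies" function, and the Manifest V2 layout it uses may need adapting for newer Chrome releases.

```python
import json
import zipfile


def build_proxy_extension(username, password, endpoint, port,
                          path="proxy_extension.zip"):
    """Pack a minimal Chrome extension that applies an authenticated proxy.

    Hypothetical helper: an assumption about what a module like extension.py
    might do, not its actual implementation.
    """
    manifest = {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Proxy auth helper",
        "permissions": ["proxy", "webRequest", "webRequestBlocking", "<all_urls>"],
        "background": {"scripts": ["background.js"]},
    }
    # json.dumps produces safely quoted JavaScript string literals.
    background_js = """
var config = {
  mode: "fixed_servers",
  rules: {
    singleProxy: {scheme: "http", host: %s, port: %d},
    bypassList: ["localhost"]
  }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
  function(details) {
    return {authCredentials: {username: %s, password: %s}};
  },
  {urls: ["<all_urls>"]},
  ["blocking"]
);
""" % (json.dumps(endpoint), int(port), json.dumps(username), json.dumps(password))

    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("manifest.json", json.dumps(manifest))
        archive.writestr("background.js", background_js)
    return path
```

The resulting archive could then be loaded with selenium's `ChromeOptions.add_extension(path)` before the driver is created.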

The script then uses BeautifulSoup to search the page source for the specific tags that contain the listing information. Each listing's information is stored in a dictionary and appended to a list named "data".
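The parsing step could look roughly like the sketch below. Yelp's real markup is not shown here and changes over time, so the tag and class names (`div.listing`, `a.name`, `span.rating`), the sample HTML, and the `parse_listings` helper are all placeholders chosen for illustration.

```python
# Placeholder markup standing in for a real page source (assumed structure).
SAMPLE_HTML = """
<div class="listing"><a class="name" href="/biz/a">Cafe A</a><span class="rating">4.5</span></div>
<div class="listing"><a class="name" href="/biz/b">Cafe B</a></div>
"""


def parse_listings(page_source):
    """Return one dictionary per listing found in `page_source`."""
    # Imported lazily so the sample data can be inspected without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(page_source, "html.parser")
    data = []
    for card in soup.select("div.listing"):
        name = card.select_one("a.name")
        rating = card.select_one("span.rating")
        # Missing elements become None instead of raising AttributeError.
        data.append({
            "name": name.get_text(strip=True) if name else None,
            "url": name.get("href") if name else None,
            "rating": rating.get_text(strip=True) if rating else None,
        })
    return data
```

Guarding each lookup keeps a single incomplete listing from aborting the whole run, which matters when page layouts vary between results.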

Finally, the script saves the "data" list to a JSON file named "data.json".
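A minimal sketch of the save step, assuming "data" is a list of plain dictionaries; the `save_listings` helper name is hypothetical.

```python
import json


def save_listings(data, path="data.json"):
    """Write the list of listing dictionaries to a JSON file and return its path."""
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented business names readable in the file.
        json.dump(data, fh, ensure_ascii=False, indent=2)
    return path
```

Calling `save_listings(data)` with the default path produces the "data.json" file described above.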
