Early in a project we often need to collect a large amount of data for the other parts of the project to use. This is a web crawler for www.yelp.com that collects reviews of restaurants near New York. Many websites today are dynamic, so the browsing steps needed to reach the content have to be automated.
- Python - programming language
- Selenium - to automate the browser tasks
- Beautiful Soup - to parse the HTML of the response
Install all the libraries used, along with their dependencies. You can use pip for that. Just type: pip install selenium, then pip install bs4, and so on for any other library the code imports.
Download the Chrome driver (chromedriver), which Selenium needs in order to control Chrome, and paste its path into the source code.
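As a rough illustration of where the driver path ends up, here is a minimal sketch, assuming Selenium 4 and the bs4 package. The names CHROMEDRIVER_PATH, check_driver_path, fetch_page_html and page_text are hypothetical and not taken from this project's source; the path shown is a placeholder you would replace with your own.

```python
from pathlib import Path

# Hypothetical placeholder: replace with the location where you saved chromedriver.
CHROMEDRIVER_PATH = "/path/to/chromedriver"


def check_driver_path(path):
    """Return the path as a Path object, or raise if no file exists there."""
    driver_path = Path(path)
    if not driver_path.is_file():
        raise FileNotFoundError(f"chromedriver not found at {driver_path}")
    return driver_path


def fetch_page_html(url, driver_path=CHROMEDRIVER_PATH):
    """Open url in Chrome through Selenium and return the rendered HTML."""
    # Imported here so the path check above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=str(check_driver_path(driver_path)))
    driver = webdriver.Chrome(service=service)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


def page_text(html):
    """Parse HTML with Beautiful Soup and return its visible text."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(" ", strip=True)
```

Checking the path before starting the browser gives a clear error message when the driver file is missing or the path was pasted incorrectly.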
Note: this code only covers restaurants in New York. If you want to scrape data for other things on the website, you need to edit the loop in the code that takes care of hotel_urls.
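The loop itself is not reproduced here, so the following is only a sketch of one way such a URL list could be built for a different category or city. It assumes that Yelp search pages take find_desc, find_loc and start query parameters and show 10 results per page; the function build_search_urls and the constant SEARCH_BASE are hypothetical names, and the real code may fill hotel_urls differently.

```python
from urllib.parse import urlencode

# Assumed search endpoint; check it against the URL your browser shows.
SEARCH_BASE = "https://www.yelp.com/search"


def build_search_urls(term, location, pages=1, per_page=10):
    """Build one search URL per result page for the given term and location."""
    urls = []
    for page in range(pages):
        query = urlencode({
            "find_desc": term,         # what to search for, e.g. "Restaurants"
            "find_loc": location,      # where to search, e.g. "New York, NY"
            "start": page * per_page,  # offset of the first result on this page
        })
        urls.append(f"{SEARCH_BASE}?{query}")
    return urls


# Example: two result pages of hotels in Boston instead of restaurants in New York.
hotel_urls = build_search_urls("Hotels", "Boston, MA", pages=2)
for url in hotel_urls:
    print(url)
```

Keeping the search term and location as parameters means switching to another category only requires changing the arguments, not the loop body.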