Alibaba Selenium crawler
- Python 3.6
- geckodriver
- Firefox
- Install Python. Add the Python folder and python_folder/Scripts to the PATH environment variable (you can do this by enabling the "Add Python 3.6 to PATH" checkbox during Python installation).
- Download geckodriver from https://github.com/mozilla/geckodriver/releases and unzip it.
- Install Firefox.
- Run CMD and go to the crawler folder:
  cd C:/path/to/alibaba-crawler
- Install the dependencies:
  pip install -r requirements.txt
- Edit the [CRAWLER] section of the config.ini file. Set driver_path = <path to your geckodriver.exe>. (Alternatively, copy geckodriver.exe into the crawler root folder, and you won't need to edit the driver_path field.) If you want to use a proxy, set proxy = https://host:port
- Edit the [MAIN] section of the config.ini file. min_price and max_price are the filter parameters; input and output are the input and output files. Select a category in the category section. If the category is empty, all products will be crawled.
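To sanity-check a config.ini before running the crawler, the settings described above can be read with Python's standard configparser module. The following is only a minimal sketch: the sample values (driver location, prices, file names), the load_settings helper, the numeric type of the prices, and the fallback to geckodriver.exe in the current folder are all assumptions for illustration, not necessarily how app.py reads its configuration.

```python
import configparser

# Hypothetical sample config; every value here is a placeholder.
SAMPLE_CONFIG = """
[CRAWLER]
driver_path = C:/tools/geckodriver.exe
proxy = https://host:port

[MAIN]
min_price = 1
max_price = 100
input = input.txt
output = output.csv
"""


def load_settings(text):
    """Parse config text and return the documented fields as a dict.

    load_settings is a hypothetical helper, not part of the crawler.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    crawler = parser["CRAWLER"]
    main = parser["MAIN"]
    return {
        # Assumption: if driver_path is missing, look for geckodriver.exe
        # in the current (crawler root) folder.
        "driver_path": crawler.get("driver_path", fallback="geckodriver.exe"),
        # The proxy is optional; None means "no proxy".
        "proxy": crawler.get("proxy", fallback=None),
        # Assumption: prices are numeric and can be parsed as floats.
        "min_price": main.getfloat("min_price"),
        "max_price": main.getfloat("max_price"),
        "input": main.get("input"),
        "output": main.get("output"),
    }


settings = load_settings(SAMPLE_CONFIG)
for key, value in settings.items():
    print(key, "=", value)
```

To check a real file instead of the sample text, pass the contents of config.ini to load_settings, for example load_settings(open("config.ini").read()).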
- Run the crawler:
  python app.py