The project reads item URLs from an Excel spreadsheet and extracts information about each product from the website.

yerbol-akhmetov/web_scraping_and_excel_management

Test exercise on data retrieval using URLs


Task

The task is to read the contents of an Excel file and retrieve the name and URL of each item. Each URL is then used to determine the item's code and price from the website. Finally, the results are stored in an output Excel file.
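The flow described above can be sketched as a small loop that is independent of how a page is actually scraped. This is only an illustration, not the repository's code: `collect_item_details`, the `fetch` callable, and `fake_fetch` are hypothetical names, and the example URL and values are made up. In the real project `fetch` would be backed by a browser session.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical contract: given an item URL, return (code, price).
Fetcher = Callable[[str], Tuple[str, float]]


def collect_item_details(
    items: Iterable[Tuple[str, str]],
    fetch: Fetcher,
) -> List[Dict[str, object]]:
    """Look up code and price for every (name, url) pair."""
    results: List[Dict[str, object]] = []
    for name, url in items:
        try:
            code, price = fetch(url)
        except Exception:
            # Keep the row with empty values so output rows match input rows.
            code, price = None, None
        results.append({"name": name, "url": url, "code": code, "price": price})
    return results


def fake_fetch(url: str) -> Tuple[str, float]:
    """Stand-in for a real scraper; raises KeyError for unknown URLs."""
    catalogue = {"https://example.com/kettle": ("K-100", 24.5)}
    return catalogue[url]


rows = collect_item_details(
    [
        ("Kettle", "https://example.com/kettle"),
        ("Toaster", "https://example.com/toaster"),
    ],
    fake_fetch,
)
```

Passing the scraper in as a function keeps the loop testable without a browser, and a failed lookup leaves a gap in the output instead of stopping the whole run.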

Description

To accomplish the task, it was divided into several sub-tasks: a module for reading and writing Excel files, web scraping, and locating the headers within the Excel content. The selenium and pandas libraries were used for this purpose.
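Locating headers matters when the header row is not the first row of the sheet. The sketch below shows one possible approach, under the assumption that the sheet has been loaded as a list of rows (for example via `pandas.read_excel(path, header=None).values.tolist()`). The function name `find_header_cells` and the header labels "Name" and "URL" are hypothetical; the repository may do this differently.

```python
from typing import Dict, Optional, Sequence, Tuple


def find_header_cells(
    rows: Sequence[Sequence[object]],
    wanted: Sequence[str],
) -> Dict[str, Optional[Tuple[int, int]]]:
    """Return the (row, column) of the first cell matching each wanted header."""
    # Normalise labels once so the comparison ignores case and padding.
    targets = {name.strip().lower(): name for name in wanted}
    found: Dict[str, Optional[Tuple[int, int]]] = {name: None for name in wanted}

    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if not isinstance(cell, str):
                continue  # skip numbers, NaN and empty cells
            key = cell.strip().lower()
            if key in targets and found[targets[key]] is None:
                found[targets[key]] = (r, c)
    return found


# Hypothetical sheet: a title row, a blank row, then the real header row.
sheet = [
    ["Price list", None, None],
    [None, None, None],
    [None, "Name", "URL"],
    [None, "Kettle", "https://example.com/kettle"],
]
positions = find_header_cells(sheet, ["Name", "URL"])
```

Here `positions` maps each label to its zero-based row and column, so the data rows are the ones below the returned row index. A label that never appears maps to `None`, which lets the caller report a malformed sheet.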

Installation

To install the dependencies, run "pip install -r requirements.txt" from the root directory of the repository. Once the dependencies are installed, "main.py" can be executed.

Usage

The code can be used to look up information about items on the Internet and write the results to an Excel sheet.
