Test exercise on data retrieval using URLs

Task

The task is to read an Excel file and retrieve the name and URL of each item. Using those URLs, the code and price of each item are then looked up on the Internet. Finally, the results are stored in an output Excel file.
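The flow described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: `ItemResult`, `collect_results`, and `fake_lookup` are hypothetical names, and the lookup step is replaced by an in-memory stand-in so the sketch runs without a browser. In the real project that step would presumably be the selenium-based scraper.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class ItemResult:
    """One output row: the input name and URL plus the looked-up values."""
    name: str
    url: str
    code: Optional[str] = None
    price: Optional[float] = None
    error: Optional[str] = None


def collect_results(
    items: Iterable[Tuple[str, str]],
    lookup: Callable[[str], Tuple[str, float]],
) -> List[ItemResult]:
    """Call lookup(url) for every (name, url) pair.

    A failed lookup is recorded in the row instead of aborting the run,
    so one bad URL does not lose the results for the other items.
    """
    results = []
    for name, url in items:
        try:
            code, price = lookup(url)
            results.append(ItemResult(name, url, code, price))
        except Exception as exc:
            results.append(ItemResult(name, url, error=str(exc)))
    return results


def fake_lookup(url: str) -> Tuple[str, float]:
    """Hypothetical stand-in for the scraper: returns (code, price)."""
    catalog = {"https://example.com/widget": ("W-100", 19.99)}
    if url not in catalog:
        raise LookupError(f"no data for {url}")
    return catalog[url]


results = collect_results(
    [
        ("Widget", "https://example.com/widget"),
        ("Gadget", "https://example.com/gadget"),
    ],
    fake_lookup,
)
for row in results:
    print(row)
```

Passing the lookup in as a function keeps the Excel handling testable on its own, independent of any website.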

Description

To accomplish the task, it was divided into several sub-tasks: modules for reading and writing Excel files, web scraping, and locating the headers in the Excel content. The selenium and pandas libraries were used for this purpose.
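As an illustration of the header-location step, here is a minimal sketch that assumes the sheet has already been loaded as a grid of cell values (a list of rows). The function names and the header labels "Name" and "URL" are assumptions made for this example and may differ from what header_coord_extractor.py actually does.

```python
from typing import Dict, List, Sequence, Tuple


def find_header_coords(
    rows: Sequence[Sequence[object]], wanted: Sequence[str]
) -> Dict[str, Tuple[int, int]]:
    """Return {header: (row_index, col_index)} for the first match of each header."""
    targets = {w.strip().lower(): w for w in wanted}
    found: Dict[str, Tuple[int, int]] = {}
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if not isinstance(cell, str):
                continue  # skip empty cells and numeric values
            key = cell.strip().lower()
            if key in targets and targets[key] not in found:
                found[targets[key]] = (r, c)
    missing = [w for w in wanted if w not in found]
    if missing:
        raise ValueError(f"headers not found: {missing}")
    return found


def read_items(
    rows: Sequence[Sequence[object]],
    name_header: str = "Name",
    url_header: str = "URL",
) -> List[Tuple[str, str]]:
    """Collect (name, url) pairs from the rows below the header cells."""
    coords = find_header_coords(rows, [name_header, url_header])
    name_row, name_col = coords[name_header]
    url_row, url_col = coords[url_header]
    start = max(name_row, url_row) + 1  # data begins below the lower header
    items = []
    for row in rows[start:]:
        name = row[name_col] if name_col < len(row) else None
        url = row[url_col] if url_col < len(row) else None
        # Keep only rows where both cells hold non-empty text.
        if isinstance(name, str) and isinstance(url, str):
            if name.strip() and url.strip():
                items.append((name.strip(), url.strip()))
    return items


# A sheet whose table does not start in the top-left corner.
sheet = [
    [None, None, None],
    [None, "Price list", None],
    [None, "Name", "URL"],
    [None, "Widget", "https://example.com/widget"],
    [None, None, None],
    [None, "Gadget", "https://example.com/gadget"],
]
items = read_items(sheet)
print(items)
```

Searching for the header cells instead of hard-coding their positions lets the same code handle sheets where the table is offset by title rows or empty columns.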

Installation

To install all necessary dependencies, run "pip install -r requirements.txt" from the root directory. Afterwards, the "main.py" file can be executed.

Usage

The code can be used to look up information about items on the Internet and write the results to an Excel sheet.


yerbol-akhmetov/web_scraping_and_excel_management
