A tutorial on data collection and web scraping for a financial news site, as part 1 of an NLP pipeline series

Elamraoui-Sohayb/Ethical_Scrapper


Ethical_Scrapper:

A Jupyter notebook tutorial on data collection and web scraping for a financial news site, as part 1 of an NLP pipeline series

  1. Ethical Scraping:

  2. Efficient Scraping:

  3. Pre-Code Analysis:

    1. Examining the Source
    2. Examining the HTML
  4. Code:

    1. Environment and Setup
    2. Imports
    3. Making a request to a single page
    4. Code Structure
    5. Getting the details of a single Article
    6. Getting the details of a single Page: (list of Articles)
    7. Saving to CSV
    8. Looping over the Pages of the Category: (the General function)
  5. Checking the resulting dataset

  6. Future Improvements

  7. Up next: Starting our NLP pipeline for this dataset

  8. Resources
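
The notebook's own code is not reproduced in this outline. As a rough sketch of what the "Ethical Scraping" and "Saving to CSV" steps could look like, the example below uses only Python's standard library. The user agent string, the sample robots.txt rules, the `title`/`url` fields, and every function name are assumptions made for illustration, not taken from the notebook.

```python
import csv
import io
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration; not taken from the notebook.
USER_AGENT = "ethical-scraper-tutorial"
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robot_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string, so no network access is needed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_delay(parser, user_agent=USER_AGENT, default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def articles_to_csv(articles, fieldnames=("title", "url")):
    """Serialize a list of article dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(articles)
    return buffer.getvalue()
```

In a real run, the rules would be loaded from the site's own robots.txt (for example with `RobotFileParser.set_url` and `read`), and the scraper would call `time.sleep(polite_delay(parser))` between requests to the allowed URLs.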
