Web scraping
Web scraping repository
-
https://github.com/hicala/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Check the Scrapy homepage at https://scrapy.org for more information, including a list of features.
-
https://github.com/hicala/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
This project automates web scraping to make it easy. It takes a URL or the HTML content of a web page, plus a list of sample data that you want to scrape from that page. The samples can be text, URLs, or any HTML tag value on the page. It learns the scraping rules and returns the similar elements. You can then use the learned object with new URLs to get similar content, or the exact same element, from those pages.
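The learn-from-samples idea can be illustrated with a small standard-library sketch. This is not AutoScraper's actual algorithm or API: the helper names (`collect`, `learn_rules`, `get_similar`), the rule format (a tag name plus its `class` attribute), and the sample HTML are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _TextCollector(HTMLParser):
    """Record each text node with the tag and class of its enclosing element."""

    def __init__(self):
        super().__init__()
        self._stack = []  # open elements as (tag, class) pairs
        self.nodes = []   # (tag, class, text) for every non-empty text node

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed elements.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            tag, cls = self._stack[-1]
            self.nodes.append((tag, cls, text))


def collect(html):
    """Parse an HTML string and return its (tag, class, text) nodes."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.nodes


def learn_rules(html, wanted):
    """Return the (tag, class) pairs of elements whose text is in `wanted`."""
    return {(tag, cls) for tag, cls, text in collect(html) if text in wanted}


def get_similar(html, rules):
    """Return the text of every element matching a learned (tag, class) rule."""
    return [text for tag, cls, text in collect(html) if (tag, cls) in rules]


# Invented sample pages (hypothetical content, for illustration only).
TRAIN = """
<ul>
  <li><a class="title" href="/a">First post</a> <span class="date">2020-01-01</span></li>
  <li><a class="title" href="/b">Second post</a> <span class="date">2020-01-02</span></li>
</ul>
"""
NEW_PAGE = """
<ul>
  <li><a class="title" href="/c">Third post</a> <span class="date">2020-02-01</span></li>
</ul>
"""

rules = learn_rules(TRAIN, ["First post"])
similar_on_training_page = get_similar(TRAIN, rules)
similar_on_new_page = get_similar(NEW_PAGE, rules)
print(similar_on_training_page, similar_on_new_page)
```

One sample ("First post") is enough here to learn the rule `("a", "title")`, which then matches the other titles on the same page and on a new page with the same layout. A real tool has to cope with far messier markup than this sketch does.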
-
https://github.com/hicala/scraping-workshop
[Spanish] Scraping workshop: documentation and scripts
Workshop on automated data extraction from web pages
Web scraping is a technique that uses various technologies to extract data or information from a web page. It is used to collect unstructured data and convert it into structured data that can later be processed in databases or spreadsheets. The workshop is a hands-on introduction to scraping, aimed at enabling attendees to process information that is useful for their own projects.
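As a rough illustration of turning page markup into structured data ready for a spreadsheet, the sketch below converts an HTML table into CSV text using only the Python standard library. It is not taken from the workshop materials: the `TableParser` class, the `table_to_csv` helper, and the sample table values are invented for this example.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the rows of the HTML tables in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Invented sample data (hypothetical values, for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Notebook</td><td>1,250</td></tr>
  <tr><td>Pen</td><td>15</td></tr>
</table>
"""

csv_text = table_to_csv(SAMPLE_HTML)
print(csv_text)
```

The `csv` module quotes any field that contains a comma, so a value such as `1,250` stays in a single column when the file is opened in a spreadsheet.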
© Hila Calderon 2020