IIIF

Web scraping repositories

https://github.com/hicala/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Overview

Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Check the Scrapy homepage at https://scrapy.org for more information, including a list of features.

https://github.com/hicala/autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Overview

This project automates web scraping to make it easy. It takes a URL or the HTML content of a web page, plus a list of sample data to scrape from that page. The samples can be text, a URL, or any HTML tag value on the page. It learns the scraping rules and returns similar elements. You can then use the learned object with new URLs to get similar content, or the exact same element, from those pages.

https://github.com/hicala/scraping-workshop

[Spanish] Scraping workshop: documentation and scripts

Overview

Workshop on automated data extraction from web pages

Web scraping is a technique that uses various technologies to extract data or information from a web page. It is used to collect unstructured data and convert it into structured data that can later be processed in databases or spreadsheets. The workshop is a hands-on introduction to scraping, intended to let attendees process information useful for their own projects.
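To make the "unstructured to structured" step concrete, the following sketch turns an HTML table into CSV text that a spreadsheet or database import could read. It uses only the Python standard library and is not taken from the workshop materials; the `TableParser` class, the `html_table_to_csv` helper, and the sample table are all made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample page fragment.
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Notebook</td><td>4.50</td></tr>
  <tr><td>Pen</td><td>1.20</td></tr>
</table>
"""


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

For real pages, dedicated tools such as the Scrapy and AutoScraper projects listed above handle fetching, malformed markup, and pagination, which this sketch deliberately leaves out.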
