akshayphilar/scrapy-kafka

scrapy-kafka

Kafka-based components for Scrapy. There are two components:

  • A custom Spider that reads the HTML responses to crawl from a Kafka topic. When the topic has no more messages to read, the Spider stays idle.
  • A custom ItemPipeline component that serializes each Item to JSON and writes it to another Kafka topic.
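The idle behaviour of the Spider can be illustrated with a small, dependency-free sketch. This is not the code from this repository: `FakeConsumer` and `IdleAwareReader` are hypothetical names, and the in-memory queue stands in for a real Kafka consumer. In an actual Scrapy spider, this kind of check is commonly hooked to the `spider_idle` signal, with `DontCloseSpider` raised to keep the spider alive; whether this project does exactly that is an assumption.

```python
from collections import deque


class FakeConsumer:
    """Hypothetical in-memory stand-in for a Kafka consumer."""

    def __init__(self, messages=()):
        self._queue = deque(messages)

    def poll(self, max_records):
        """Return up to max_records pending messages (possibly none)."""
        batch = []
        while self._queue and len(batch) < max_records:
            batch.append(self._queue.popleft())
        return batch

    def push(self, message):
        """Simulate a new message arriving on the topic."""
        self._queue.append(message)


class IdleAwareReader:
    """Hypothetical helper: fetches work when the crawler runs dry."""

    def __init__(self, consumer, batch_size=10):
        self.consumer = consumer
        self.batch_size = batch_size
        self.idle = False

    def on_idle(self):
        """Poll the topic once.

        An empty result means there is nothing to crawl right now, so the
        caller should stay open and idle rather than shut down.
        """
        batch = self.consumer.poll(self.batch_size)
        self.idle = not batch
        return batch


consumer = FakeConsumer([b"<html>a</html>", b"<html>b</html>"])
reader = IdleAwareReader(consumer, batch_size=1)
first = reader.on_idle()    # one message is available
second = reader.on_idle()   # the second message
third = reader.on_idle()    # topic drained: reader is now idle
```

Because the reader only flips to idle and never terminates, pushing a new message onto the topic later is enough for the next `on_idle()` call to resume work.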

See the example directory for usage.
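Independently of that directory, the ItemPipeline idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's implementation: the class name `KafkaItemPipeline`, the topic name, and `RecordingProducer` are hypothetical. The producer is assumed to expose a `send(topic, value)` method, which is the shape offered by kafka-python's `KafkaProducer`; `process_item(item, spider)` is the standard Scrapy pipeline hook.

```python
import json


class KafkaItemPipeline:
    """Hypothetical pipeline that publishes each item as a JSON message."""

    def __init__(self, producer, topic):
        # Assumption: producer exposes send(topic, value) taking bytes.
        self.producer = producer
        self.topic = topic

    def process_item(self, item, spider):
        # Convert the item to a plain dict, then to UTF-8 encoded JSON.
        payload = json.dumps(dict(item), sort_keys=True).encode("utf-8")
        self.producer.send(self.topic, payload)
        # Return the item so later pipeline stages still receive it.
        return item


class RecordingProducer:
    """Test double that records messages instead of contacting Kafka."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value):
        self.sent.append((topic, value))


producer = RecordingProducer()
pipeline = KafkaItemPipeline(producer, "scraped-items")
item = {"url": "http://example.com", "title": "Example"}
result = pipeline.process_item(item, spider=None)
```

Passing the producer in through the constructor keeps the sketch testable without a running broker; a real deployment would construct the producer from the project's settings instead.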

Contributors

Contributors to scrapy-kafka, listed alphabetically:
