UrbanDictionary multi-threaded / asynchronous crawler

guzdy/UrbanDictSpider

UrbanDictionarySpider

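This is a crawler for UrbanDictionary. It draws on a ProxyPool previously saved to a database to avoid IP bans and similar problems that slow a crawl down. Redis serves as the proxy queue, and multithreading combined with asyncio keeps the crawl fast.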
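
The proxy-queue idea can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: `InMemoryProxyQueue`, `crawl`, `demo_fetch`, and all parameter names are hypothetical, the queue is an in-memory stand-in so the example runs without a Redis server, and the thread layer is left out. It assumes the proxies live in a FIFO list; with a Redis list, `get` and `put` would presumably map onto `LPOP` and `RPUSH`.

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable, Dict, Iterable, Optional


class InMemoryProxyQueue:
    """Hypothetical stand-in for a Redis list used as a FIFO proxy queue.

    Assumption: with a real Redis list, get() would be an LPOP and
    put() an RPUSH on the same key.
    """

    def __init__(self, proxies: Iterable[str]) -> None:
        self._items = deque(proxies)

    def get(self) -> Optional[str]:
        # Take the proxy at the head of the queue, or None if it is empty.
        return self._items.popleft() if self._items else None

    def put(self, proxy: str) -> None:
        # Return a working proxy to the tail so proxies rotate.
        self._items.append(proxy)

    def __len__(self) -> int:
        return len(self._items)


async def crawl(
    urls: Iterable[str],
    proxies: InMemoryProxyQueue,
    fetch: Callable[[str, str], Awaitable[str]],
    concurrency: int = 2,
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Fetch every URL through a rotating proxy, dropping proxies that fail."""
    results: Dict[str, str] = {}
    todo: "asyncio.Queue[str]" = asyncio.Queue()
    for url in urls:
        todo.put_nowait(url)

    async def worker() -> None:
        while True:
            try:
                url = todo.get_nowait()
            except asyncio.QueueEmpty:
                return  # no URLs left for this worker
            for _ in range(max_attempts):
                proxy = proxies.get()
                if proxy is None:
                    break  # pool exhausted; give up on this URL
                try:
                    results[url] = await fetch(url, proxy)
                except Exception:
                    continue  # failing proxy is not returned to the queue
                proxies.put(proxy)  # healthy proxy goes back for reuse
                break

    await asyncio.gather(*(worker() for _ in range(concurrency)))
    return results


async def demo_fetch(url: str, proxy: str) -> str:
    """Hypothetical fetcher: pretends the proxy named 'bad' is banned."""
    await asyncio.sleep(0)  # yield to the event loop like real I/O would
    if proxy == "bad":
        raise ConnectionError("proxy refused")
    return f"{url} via {proxy}"


if __name__ == "__main__":
    pool = InMemoryProxyQueue(["bad", "good1", "good2"])
    pages = asyncio.run(crawl(["a", "b", "c"], pool, demo_fetch))
    print(pages)
```

In a real crawl, `demo_fetch` would be replaced by an HTTP request sent through the given proxy, and several threads could each run their own event loop over a share of the URLs.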

The crawl itself is complete (more than 80,000 documents in about an hour and a half), but the code details still need optimization.
