reddit-crawl

Welcome to REDDIT-CRAWL, a tool for crawling Reddit yourself.

The data this tool crawls comes from three types of URL on https://www.reddit.com/.

Usage

To crawl a subreddit's overview page:

python Reddit_crawler.py https://www.reddit.com/r/{subreddit} <storage_folder>

To crawl a user's submitted page:

python Reddit_user_crawler.py <username> <storage_folder>

To crawl a large amount of data at once, use the batch scripts:

chmod +x RedditDigger*.sh
./RedditDigger.sh <subreddit_list> 
./RedditDiggerUsers.sh <user_list> [<name_expr_pattern>]
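
The batch scripts themselves are not reproduced here. As a rough illustration of the same idea in Python, the sketch below builds one Reddit_crawler.py command per subreddit. It assumes (this is not confirmed by the scripts) that the list file holds one subreddit name per line and that all output goes to a single storage folder; `build_commands` and `run_all` are hypothetical helper names, not part of this repository.

```python
import subprocess
import sys

# URL template taken from the usage line above.
BASE_URL = "https://www.reddit.com/r/"


def build_commands(names, storage_folder):
    """Return one Reddit_crawler.py command line per non-empty name.

    Assumption: each entry in `names` is a bare subreddit name, e.g. "jokes".
    """
    commands = []
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # skip blank lines in the list file
        commands.append(
            [sys.executable, "Reddit_crawler.py", BASE_URL + name, storage_folder]
        )
    return commands


def run_all(list_path, storage_folder):
    """Run the crawler once per subreddit listed in `list_path`."""
    with open(list_path, encoding="utf-8") as handle:
        commands = build_commands(handle, storage_folder)
    for command in commands:
        # check=False: one failing subreddit does not stop the whole batch.
        subprocess.run(command, check=False)
```

Calling `run_all("reddit_subreddit.txt", "data/")` would then crawl each listed subreddit in turn.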

For example:

python Reddit_crawler.py https://www.reddit.com/r/jokes/ data/
python Reddit_user_crawler.py zzz0404 data/
./RedditDigger.sh reddit_subreddit.txt
./RedditDiggerUsers.sh reddit_users.txt ^zzz     # crawls only the user names matching '^zzz', e.g. zzz0404
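
To make the optional name pattern concrete, here is a minimal Python sketch of how a pattern such as '^zzz' could select names from a user list. It assumes grep-style matching (the pattern may match anywhere unless anchored); how RedditDiggerUsers.sh actually applies the pattern is not shown in this README, and `filter_users` is a hypothetical helper.

```python
import re


def filter_users(user_names, pattern=None):
    """Keep the user names that match `pattern`; keep all if no pattern is given."""
    if pattern is None:
        return list(user_names)
    regex = re.compile(pattern)
    # re.search mimics grep: the pattern matches anywhere unless anchored with ^ or $.
    return [name for name in user_names if regex.search(name)]


selected = filter_users(["zzz0404", "abc", "xzzz"], "^zzz")
```

With the anchor '^', only names that start with "zzz" are kept, so "xzzz" is excluded.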

API

You can also use the API in Reddit_crawler.py directly:

python
>>> from Reddit_crawler import *
>>>

The module provides two classes, RedditObj and RedditUser. Use help(RedditObj) and help(RedditUser) for more information:

>>> help(RedditObj)
>>> help(RedditUser)
>>>
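
Since the methods of RedditObj and RedditUser are not documented in this README, a quick way to get an overview without reading the full help() output is to list a class's public methods with the standard inspect module. The sketch below is generic and assumes Python 3; `public_methods` is a hypothetical helper, not part of Reddit_crawler.py.

```python
import inspect


def public_methods(cls):
    """Map each public method name of `cls` to the first line of its docstring."""
    summary = {}
    for name, member in inspect.getmembers(cls, predicate=inspect.isroutine):
        if name.startswith("_"):
            continue  # skip private and dunder methods
        doc = inspect.getdoc(member) or ""
        summary[name] = doc.splitlines()[0] if doc else ""
    return summary
```

After `from Reddit_crawler import RedditObj`, calling `public_methods(RedditObj)` would return whatever public methods that class defines.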
