Update README.md
lumyjuwon authored Jan 2, 2021
1 parent c49c779 commit b67bf84
Showing 1 changed file with 7 additions and 23 deletions.
30 changes: 7 additions & 23 deletions README.md
@@ -4,10 +4,9 @@
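The article categories that can be crawled are politics, economy, life/culture, IT/science, society, world, and opinion.
For sports articles, the categories are world baseball, world soccer, Korean baseball, Korean soccer, basketball, volleyball, golf, general sports, and e-sports.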

## User Python Installation
* **KoreaNewsCrawler**

``` pip install KoreaNewsCrawler ```
## How to install
pip install KoreaNewsCrawler

## Method

* **set_category(category_name)**
@@ -25,7 +24,7 @@

This method runs the crawl.

## Example
## Article News Crawler Example
```
from korea_news_crawler.articlecrawler import ArticleCrawler
@@ -48,13 +47,6 @@ Spt_crawler.start()
```
Crawls Korean baseball and Korean soccer news from January 2017 to April 2018 in parallel, using multiple processes.


## Multi Process Information
Testing on an Intel i5 9600 CPU showed an average CPU usage of **8%** per category.
Adjust the number of categories to the specifications of the machine running the crawler, or use a loop.

![ex_screenshot](./img/multi_process.PNG)

## Results
![ex_screenshot](./img/article_result.PNG)
![ex_screenshot](./img/sport_resultimg.PNG)
@@ -77,10 +69,8 @@ In the case of sports articles, that include korea baseball, korea soccer, world
**The sports article crawler is currently unusable because the HTML format of the sports pages has changed. I will update the sports article crawler
as soon as possible.**

## User Python Installation
* **KoreaNewsCrawler**

``` pip install KoreaNewsCrawler ```
## How to install
pip install KoreaNewsCrawler

## Method

@@ -99,7 +89,7 @@ as soon as possible.**

This method runs the crawl.

## Example
## Article News Crawler Example
```
from korea_news_crawler.articlecrawler import ArticleCrawler
@@ -121,12 +111,6 @@ Spt_crawler.set_date_range(2017,1,2018,4)
Spt_crawler.start()
```
Crawls Korean baseball and Korean soccer news from January 2017 to April 2018 in parallel, using multiple processes.
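
To make the date range concrete, the sketch below lists the months that a call such as `set_date_range(2017,1,2018,4)` would span if both endpoints are inclusive. This is an illustration only, not the library's implementation: `months_in_range` is a hypothetical helper, and the inclusive end month is an assumption based on the "to April 2018" wording.

```python
def months_in_range(start_year, start_month, end_year, end_month):
    """Return every (year, month) pair from start to end, both inclusive.

    Hypothetical helper for illustration; not part of KoreaNewsCrawler.
    """
    if (start_year, start_month) > (end_year, end_month):
        raise ValueError("start of range must not be after its end")
    months = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        months.append((year, month))
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return months


span = months_in_range(2017, 1, 2018, 4)
print(len(span), span[0], span[-1])  # 16 (2017, 1) (2018, 4)
```

Under that assumption the example covers 16 months of news per category, which gives a rough sense of how much work each process has to do.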

## Multi Process Information
Testing on an Intel i5 9600 CPU showed an average CPU usage of **8%** per category.
Adjust the number of categories to the specifications of the machine running the crawler, or use a loop.
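
One way to follow the advice above is to split the categories into small batches and crawl one batch at a time. The sketch below is a minimal illustration, not part of KoreaNewsCrawler: `batch_categories` and the batch size of 2 are hypothetical, and the category strings are placeholders rather than the exact names the library accepts.

```python
def batch_categories(categories, max_parallel):
    """Split categories into consecutive batches of at most max_parallel items.

    Hypothetical helper for illustration; not part of KoreaNewsCrawler.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    return [
        categories[i:i + max_parallel]
        for i in range(0, len(categories), max_parallel)
    ]


# Placeholder category names; check the library for the accepted values.
categories = ["politics", "economy", "society", "world", "opinion"]

for batch in batch_categories(categories, 2):
    # In a real run, one would create a crawler here, give it the
    # categories in `batch`, set the date range, and call start()
    # before moving on to the next batch.
    print(batch)
```

Choosing a smaller batch size keeps fewer processes running at once, at the cost of a longer total crawl time.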

![ex_screenshot](./img/multi_process.PNG)

## Results
![ex_screenshot](./img/article_result.PNG)