balancy/parse_library

Library parser

This script parses Sci Fi books from tululu.org. For each book it extracts the title, author, genres, cover image, comments and text, when they are present.

How to install

Python 3 and Git must already be installed.

  1. Clone the repository with the command:
git clone https://github.com/balancy/parse_library
  2. Go into the cloned repository and create a virtual environment with the command:
python -m venv env
  3. Activate the virtual environment. For Linux-based OS:
source env/bin/activate

    For Windows:

env\scripts\activate
  4. Install the dependencies:
pip install -r requirements.txt

How to use

The general way to run the script is with the command:

python main.py

Complete list of script arguments:

--start_page start

    where start is the page to start downloading books from. The default value is 1.

--end_page end

    where end is the page to finish downloading books at. The default value is the last page in the category.

--books_folder folder

    where folder is the folder for saving text versions of books. The default is 'books/'.

--imgs_folder folder

    where folder is the folder for saving cover images of books. The default is 'images/'.

--json_path folder

    where folder is the folder for saving all downloaded library info in JSON format. The default is the root folder.

--skip_images

    If this flag is given, the script skips downloading book cover images.

--skip_txt

    If this flag is given, the script skips downloading text versions of books.
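
The arguments above could be declared with argparse roughly as follows. This is only a sketch built from the descriptions in this README, not the actual contents of main.py: the function name build_parser, the help strings, the use of None to mean "last page in category" and the use of '.' for the root folder are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments documented in this README.

    Hypothetical helper: the real main.py may organise this differently.
    """
    parser = argparse.ArgumentParser(
        description="Download Sci Fi books from tululu.org",
    )
    parser.add_argument("--start_page", type=int, default=1,
                        help="page to start downloading books from")
    # Assumption: None stands for "the last page in the category".
    parser.add_argument("--end_page", type=int, default=None,
                        help="page to finish downloading books at")
    parser.add_argument("--books_folder", default="books/",
                        help="folder for text versions of books")
    parser.add_argument("--imgs_folder", default="images/",
                        help="folder for cover images of books")
    # Assumption: '.' stands for the root folder of the project.
    parser.add_argument("--json_path", default=".",
                        help="folder for the JSON file with library info")
    parser.add_argument("--skip_images", action="store_true",
                        help="skip downloading cover images")
    parser.add_argument("--skip_txt", action="store_true",
                        help="skip downloading text versions of books")
    return parser


# Example: pages 5 to 10, without text versions of books.
args = build_parser().parse_args(
    ["--start_page", "5", "--end_page", "10", "--skip_txt"]
)
print(args.start_page, args.end_page, args.skip_txt)
```

Both skip flags use store_true, so they are off unless passed, which matches the "flag enabled" wording above.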

You can always view the script's help with the command:

python main.py -h
