Telegram bot, which search memes in inline mode. Includes web crawler for memepedia.ru and bot itself.
Made with pyTelegramBotAPI, Elasticsearch (Python Elasticsearch Client + elasticsearch-dsl) and flavored with Scrapy.
- Download and setup Elasticsearch (7.X branch).
- Create virtual environment.
- Install via
pip
packages listed inrequirements.txt
. - If you don't already have a Telegram bot, create one and obtain authorization token.
- Paste your Telegram bot's token in
bot/.env
file:
BOT_TOKEN="TOKEN:GOES_HERE"
Alternatively you can paste it in docker-compose.yaml
in environment
section for bot service (Docker only).
Bot and crawler (except Scrapy) settings are available in bot/settings.py
and crawler/crawler_settings.py
.
Replace example.com
with your domain:
STATIC_PROTO = "https://"
STATIC_DOMAIN = "example.com"
Elasticsearch connection settings are also available here:
ES_INDEX_NAME = "memeus"
ES_HOST = "localhost"
ES_PORT = 9200
ES_DOC_TYPE = "_doc"
You can also check Scrapy configuration (rate-limits, etc) on crawler/settings.py
or change DB connection settings on crawler/spider_bot.py
(SQLAlchemy).
Run this command to start crawler
in the top directory of project:
scrapy crawl memepedia
- On Linux
python bot/main.py
- On Windows
python bot\main.py
You can use Dockerfiles from this project for running Memeusbot project in Docker. Dockerfiles are available separately for all instances: bot, crawler and nginx for serving static files
Before running any container, you should create docker volume (in these examples named bot_storage
).
It will be mounted on every container to /var/www/memeusbot
:
docker volume create bot_storage
This is the simplest way to deploy the bot:
docker-compose up -d
Or you can start services separately via docker-compose run
.
However, in this scenario you should name elasticsearch
container as memeusbot_elastic
via commandline arguments.
Make Docker image via Dockerfile
and run container based on it:
docker image build -t memeusbot_bot bot/
docker container run -d --name memeusbot_bot -v bot_storage:/var/www/memeusbot memeusbot_bot
Make Docker image via Dockerfile
and run container based on it:
docker image build -t memeusbot_crawler crawler/
docker container run -d --name memeusbot_crawler -v bot_storage:/var/www/memeusbot memeusbot_crawler
Make Docker image via Dockerfile
and run container based on it:
docker image build -t memeusbot_elastic elasticsearch/
docker container run -d --name memeusbot_elastic -p 127.0.0.1:9200:9200 -e 'discovery.type=single-node' memeusbot_elastic
Nginx is optional and needed only if you want to serve images from your server.
By default it's configured to be used as proxy_pass
backend. So nginx on your host is required.
Make Docker image via Dockerfile
and run container based on it:
docker image build -t memeusbot_nginx nginx/
docker container run -d --name memeusbot_nginx -v bot_storage:/var/www/memeusbot -p 127.0.0.1:8081:80 memeusbot_nginx
Unfortunately on Telegram for Windows search results sometimes don't show with bot setting IS_SERVERLESS = False
.
+--- bot # bot directory
| +--- .env
| +--- main.py # Main bot script
| +--- requirements.txt # Python pip requirements
| +--- search.py # Searching in memeus ES index
| +--- settings.py # Bot's settings
+--- crawler # Scrapy crawler directory
| +--- crawler_settings.py # Crawler settings
| +--- importer.py # Importing data to memeus ES index
| +--- items.py
| +--- models.py # DB tables models
| +--- pipelines.py # Saving scrapped data
| +--- requirements.txt # Python pip requirements
| +--- settings.py # Scrapy settings
| +--- spiders
| | +--- memepedia.py # Spider for memepedia.ru
| +--- spider_db.py # DB config
+--- elasticsearch # Dockerfile and configs for dockered Elasticsearch
+--- nginx # Dockerfile and configs for dockered nginx
+--- www # WWW templates
| +--- index.html
+--- docker-compose.yaml # Docker-compose config file
+--- scrapy.cfg # Scrapy cfg-file