
Commit a1e9477

Release 1.0.2
2 parents 49e74be + 9ff98b6

File tree

  README.md
  deeppavlov/_meta.py
  docs/index.rst
  docs/intro/python.ipynb
  requirements.txt
  utils/prepare/upload.py

6 files changed (+178, -31 lines)


README.md (+3)

@@ -181,6 +181,9 @@ from deeppavlov import evaluate_model
 model = evaluate_model(<config_path>, install=True, download=True)
 ```
 
+DeepPavlov also [allows](https://docs.deeppavlov.ai/en/master/features/python.html) building a model from components for
+inference using Python.
+
 ## License
 
 DeepPavlov is Apache 2.0 - licensed.

deeppavlov/_meta.py (+1, -1)

@@ -1,4 +1,4 @@
-__version__ = '1.0.1'
+__version__ = '1.0.2'
 __author__ = 'Neural Networks and Deep Learning lab, MIPT'
 __description__ = 'An open source library for building end-to-end dialog systems and training chatbots.'
 __keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot']

docs/index.rst (+1)

@@ -9,6 +9,7 @@ Welcome to DeepPavlov's documentation!
    QuickStart <intro/quick_start>
    General concepts <intro/overview>
    Configuration file <intro/configuration>
+   Python pipelines <intro/python.ipynb>
    Models overview <features/overview>

docs/intro/python.ipynb (+141, new file)

The commit adds a new notebook whose cells render as follows.

#### Python pipelines

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deeppavlov/DeepPavlov/blob/master/docs/intro/python.ipynb)

Python models can be used without .json configuration files.

The code below is an alternative to building the [insults_kaggle_bert](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/classifiers/insults_kaggle_bert.json) model and using it with

```python
from deeppavlov import build_model

model = build_model('insults_kaggle_bert', download=True)
```

First, define variables for the model components and download the model data.

```python
from deeppavlov.core.commands.utils import expand_path
from deeppavlov.download import download_resource


classifiers_path = expand_path('~/.deeppavlov/models/classifiers')
model_path = classifiers_path / 'insults_kaggle_torch_bert'
transformer_name = 'bert-base-uncased'

download_resource(
    'http://files.deeppavlov.ai/deeppavlov_data/classifiers/insults_kaggle_torch_bert_v5.tar.gz',
    {classifiers_path}
)
```

Then, initialize the model components.

```python
from deeppavlov.core.data.simple_vocab import SimpleVocabulary
from deeppavlov.models.classifiers.proba2labels import Proba2Labels
from deeppavlov.models.preprocessors.torch_transformers_preprocessor import TorchTransformersPreprocessor
from deeppavlov.models.torch_bert.torch_transformers_classifier import TorchTransformersClassifierModel


preprocessor = TorchTransformersPreprocessor(
    vocab_file=transformer_name,
    max_seq_length=64
)

classes_vocab = SimpleVocabulary(
    load_path=model_path/'classes.dict',
    save_path=model_path/'classes.dict'
)

classifier = TorchTransformersClassifierModel(
    n_classes=classes_vocab.len,
    return_probas=True,
    pretrained_bert=transformer_name,
    save_path=model_path/'model',
    optimizer_parameters={'lr': 1e-05}
)

proba2labels = Proba2Labels(max_proba=True)
```

Finally, create the model from the components. ``Element`` is a wrapper for a component: it receives the component and the names of its incoming and outgoing arguments. ``Model`` combines ``Element``s into a pipeline.

```python
from deeppavlov import Element, Model

model = Model(
    x=['x'],
    out=['y_pred_labels'],
    pipe=[
        Element(component=preprocessor, x=['x'], out=['bert_features']),
        Element(component=classifier, x=['bert_features'], out=['y_pred_probas']),
        Element(component=proba2labels, x=['y_pred_probas'], out=['y_pred_ids']),
        Element(component=classes_vocab, x=['y_pred_ids'], out=['y_pred_labels'])
    ]
)

model(['you are stupid', 'you are smart'])
```
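Since ``Model`` wires components together purely by variable names, the same components can be recombined. The sketch below is not part of the commit; it assumes, based on the API shown above, that ``out`` may reference any variable produced inside ``pipe``, and requests the classifier's raw probabilities instead of the final labels:

```python
# Sketch only, reusing the objects defined in the notebook above.
# Assumption: Model's `out` can point at any intermediate variable from `pipe`.
proba_model = Model(
    x=['x'],
    out=['y_pred_probas'],  # class probabilities instead of vocabulary labels
    pipe=[
        Element(component=preprocessor, x=['x'], out=['bert_features']),
        Element(component=classifier, x=['bert_features'], out=['y_pred_probas'])
    ]
)

proba_model(['you are stupid', 'you are smart'])
```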

requirements.txt (+1, -1)

@@ -2,7 +2,7 @@ aio-pika>=3.2.2,<6.9.0
 fastapi>=0.47.0,<0.78.0
 filelock>=3.0.0,<3.8.0
 nltk>=3.2.5,<3.8.0
-numpy
+numpy<1.24
 overrides==4.1.2
 pandas>=1.0.0,<1.5.0
 prometheus-client>=0.13.0,<0.15.0
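The commit does not state why the ``numpy`` pin was added; a likely but unconfirmed reason is that NumPy 1.24 removed long-deprecated aliases such as ``np.bool`` and ``np.int``, which some dependencies still reference. A minimal illustration of that breaking change (not code from this repository):

```python
import numpy as np

# Works on numpy<1.24 (with a DeprecationWarning since 1.20);
# raises AttributeError on numpy>=1.24 because the np.bool alias was removed.
mask = np.zeros(3, dtype=np.bool)
```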

utils/prepare/upload.py (+31, -29)

@@ -13,8 +13,7 @@
 # limitations under the License.
 
 import argparse
-import os
-import shutil
+import pathlib
 import tarfile
 from pathlib import Path
 
@@ -23,45 +22,48 @@
 from hashes import main
 
 
-def upload(config_in_file):
+def upload(config_in_file: str, tar_name: str, tar_output_dir: Path):
+    if not tar_output_dir.exists():
+        raise RuntimeError(f'A folder {tar_output_dir} does not exist')
+
+    print(f'Config: {config_in_file}')
+    if not Path(config_in_file).exists():
+        raise RuntimeError(f'A config {config_in_file} does not exist')
 
-    print(config_in_file)
     config_in = parse_config(config_in_file)
     config_in_file = find_config(config_in_file)
 
     model_path = Path(config_in['metadata']['variables']['MODEL_PATH']).expanduser()
-    models_path = Path(config_in['metadata']['variables']['MODELS_PATH']).expanduser()
     model_name, class_name = config_in_file.stem, config_in_file.parent.name
-
-    if str(model_name) not in str(model_path):
-        raise(f'{model_name} is not the path of the {model_path}')
-
-    arcname = str(model_path).split("models/")[1]
-    tar_path = models_path/model_name
-    tmp_folder = f'/tmp/'
-    tmp_tar = tmp_folder + f'{model_name}.tar.gz'
 
-    print("model_path", model_path)
-    print("class_name", class_name)
-    print("model_name", model_name)
-
-    print("Start tarring")
-    archive = tarfile.open(tmp_tar, "w|gz")
-    archive.add(model_path, arcname=arcname)
-    archive.close()
+    if tar_name is None:
+        tar_name = f'{model_name}'
+        print(f'tar_name set to {tar_name}')
+
+    full_tar_name = tar_output_dir / f'{tar_name}.tar.gz'
+    if Path(full_tar_name).exists():
+        raise RuntimeError(f'An archive {Path(full_tar_name)} already exists')
+
+    print(f'model_path: {model_path}')
+    print(f'class_name: {class_name}')
+    print(f'model_name: {model_name}')
+    print(f'Start tarring to {full_tar_name}')
+    with tarfile.open(str(full_tar_name), "w|gz") as archive:
+        archive.add(model_path, arcname=pathlib.os.sep)
+
     print("Stop tarring")
+    print(f'Tar archive: {Path(full_tar_name)} has been created')
 
     print("Calculating hash")
-    main(tmp_tar)
-
-    print("tmp_tar", tmp_tar)
-    command = f'scp -r {tmp_folder}{model_name}* share.ipavlov.mipt.ru:/home/export/v1/{class_name}'
-    donwload_url = f'http://files.deeppavlov.ai/v1/{class_name}/{model_name}.tar.gz'
-    print(command, donwload_url, sep='\n')
+    main(full_tar_name)
 
 
 if __name__ == '__main__':
     parser = argparse.ArgumentParser()
-    parser.add_argument("config_in", help="path to a config", type=str)
+    parser.add_argument('-c', '--config_in', help='path to a config', type=str)
+    parser.add_argument('-n', '--tar_name', help='name of the tar archive (without tar.gz extension)',
+                        default=None, required=False, type=str)
+    parser.add_argument('-o', '--tar_output_dir', help='dir to save a tar archive', default='./',
+                        required=False, type=Path)
     args = parser.parse_args()
-    upload(args.config_in)
+    upload(args.config_in, args.tar_name, args.tar_output_dir)
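For reference, a minimal usage sketch of the reworked script. The flags and the new ``upload()`` signature come from the diff above; the config path, archive name, and output directory are illustrative, and the sketch assumes DeepPavlov is installed and the command is run from the repository root:

```python
# Sketch only: run the uploader with the new -c/-n/-o options via subprocess.
# Paths and the archive name are examples, not values from the commit.
import subprocess

subprocess.run(
    [
        'python', 'utils/prepare/upload.py',
        '-c', 'deeppavlov/configs/classifiers/insults_kaggle_bert.json',  # --config_in
        '-n', 'insults_kaggle_torch_bert',  # --tar_name, without the .tar.gz extension
        '-o', '/tmp',                       # --tar_output_dir; the directory must already exist
    ],
    check=True,
)
```

Omitting ``-n`` makes the script fall back to the config stem as the archive name, and it refuses to overwrite an archive that already exists in the output directory.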
