Commit 1595917: "readme" (author: github, 0 parents)

File tree: 181 files changed, +81052 -0 lines

Errorchcker_config.yml (+108)

@@ -0,0 +1,108 @@
!!python/object/new:easydict.EasyDict
dictitems:
  dataset: &id002 !!python/object/new:easydict.EasyDict
    dictitems:
      id_end: 1
      id_pad: 3
      id_start: 0
      id_unk: 2
      prepared_folder: &id001
      - ./data/errorchecker_dataset/prepared
      vocabulary_file: ./data/errorchecker_dataset/prepared/properties.npy
    state:
      id_end: 1
      id_pad: 3
      id_start: 0
      id_unk: 2
      prepared_folder: *id001
      vocabulary_file: ./data/errorchecker_dataset/prepared/properties.npy
  model: &id003 !!python/object/new:easydict.EasyDict
    dictitems:
      MaxPredictLength: 200
      att_dim: 256
      batch_size: 16
      beam_size: 5
      ckpt_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/ckpt
      ckpt_name: ErrorCheck
      clip_value: 5
      decoding: beams_search
      display_iter: 100
      div_gamma: 1
      div_prob: 0
      droupout: 0.3
      errche_decoder_name: DecoderAtt_errche
      errche_embeding_dims_source: 128
      errche_embeding_dims_target: 128
      errche_encoder_name: Encode_errche
      errche_encoder_type: Prenet
      errche_rnn_decoder_dim: 256
      errche_rnn_encoder_dim: 128
      eval_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/eval
      gpu_fraction: 0.48
      learning_decay_rate: 0.94
      learning_decay_step: 8000
      learning_init: 0.1
      learning_type: exponential
      log_dir: /home/xiaofeng/code/image2katex/log
      log_file_name: ErrorChecker.log
      log_name: ErrorChecker
      metric_val: perplexity
      model_saved: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet
      n_epochs: 1000
      optimizer: momentum
      save_iter: 500
      summary_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/summary
      test_batch_size: 1
    state:
      MaxPredictLength: 200
      att_dim: 256
      batch_size: 16
      beam_size: 5
      ckpt_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/ckpt
      ckpt_name: ErrorCheck
      clip_value: 5
      decoding: beams_search
      display_iter: 100
      div_gamma: 1
      div_prob: 0
      droupout: 0.3
      errche_decoder_name: DecoderAtt_errche
      errche_embeding_dims_source: 128
      errche_embeding_dims_target: 128
      errche_encoder_name: Encode_errche
      errche_encoder_type: Prenet
      errche_rnn_decoder_dim: 256
      errche_rnn_encoder_dim: 128
      eval_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/eval
      gpu_fraction: 0.48
      learning_decay_rate: 0.94
      learning_decay_step: 8000
      learning_init: 0.1
      learning_type: exponential
      log_dir: /home/xiaofeng/code/image2katex/log
      log_file_name: ErrorChecker.log
      log_name: ErrorChecker
      metric_val: perplexity
      model_saved: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet
      n_epochs: 1000
      optimizer: momentum
      save_iter: 500
      summary_dir: /home/xiaofeng/data/image2latex/ErrorCheck/model_saved/Prenet/summary
      test_batch_size: 1
  predict: &id004 !!python/object/new:easydict.EasyDict
    dictitems:
      npy_path: ./static/npy
      preprocess_dir: ./static/preprocess
      render_path: ./static/render
      temp_path: ./static
      web_path: ./templates
    state:
      npy_path: ./static/npy
      preprocess_dir: ./static/preprocess
      render_path: ./static/render
      temp_path: ./static
      web_path: ./templates
state:
  dataset: *id002
  model: *id003
  predict: *id004
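
This file is a PyYAML dump of an `easydict.EasyDict` object (note the `!!python/object/new:easydict.EasyDict` tags and the `&id...`/`*id...` anchors), so a plain `yaml.safe_load` will refuse to parse it. Below is a minimal loading sketch, assuming PyYAML >= 5.1 and `easydict` are installed; how the repository itself loads the config is not shown in this commit.

```python
# Hypothetical loader sketch; the repository may construct its config differently.
import yaml  # PyYAML >= 5.1 provides yaml.unsafe_load

CONFIG_PATH = "Errorchcker_config.yml"  # path taken from this commit's file tree

with open(CONFIG_PATH, "r", encoding="utf-8") as f:
    # unsafe_load is needed because the python/object/new tags reconstruct
    # easydict.EasyDict instances; only use it on config files you trust.
    cfg = yaml.unsafe_load(f)

# The dumped "state" restores attribute-style access on the EasyDict objects.
print(cfg.model.batch_size)  # 16
print(cfg.dataset.id_unk)    # 2
```

The same approach applies to `Im2Katex_config.yml` below.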

Im2Katex_config.yml (+115)

@@ -0,0 +1,115 @@
!!python/object/new:easydict.EasyDict
dictitems:
  dataset: &id003 !!python/object/new:easydict.EasyDict
    dictitems:
      id_end: 1
      id_pad: 3
      id_start: 0
      id_unk: 2
      image_folder: &id001
      - /home/xiaofeng/data/image2latex/handwritten/process/img_padding
      - /home/xiaofeng/data/image2latex/original/process/img_padding
      prepared_folder: &id002
      - ./data/im2latex_dataset/merged/prepared/handwritten/
      - ./data/im2latex_dataset/merged/prepared/original/
      vocabulary_file: ./data/im2latex_dataset/merged/prepared/handwritten/properties.npy
    state:
      id_end: 1
      id_pad: 3
      id_start: 0
      id_unk: 2
      image_folder: *id001
      prepared_folder: *id002
      vocabulary_file: ./data/im2latex_dataset/merged/prepared/handwritten/properties.npy
  model: &id004 !!python/object/new:easydict.EasyDict
    dictitems:
      MaxPredictLength: 200
      att_dim: 512
      batch_size: 16
      beam_size: 5
      ckpt_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/ckpt
      ckpt_name: seq2seqAtt
      clip_value: 5
      decoder_name: DecoderAtt
      decoding: beams_search
      display_iter: 100
      div_gamma: 1
      div_prob: 0
      droupout: 0.3
      embeding_dims: 80
      encoder_cnn: vanilla
      encoder_name: Encode
      encoder_type: conv
      eval_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/eval
      gpu_fraction: 0.48
      learning_decay_rate: 0.94
      learning_decay_step: 8000
      learning_init: 0.1
      learning_type: exponential
      log_dir: /home/xiaofeng/code/image2katex/log
      log_file_name: Im2Katex.log
      log_name: Im2Katex
      metric_val: perplexity
      model_saved: /home/xiaofeng/data/image2latex/merged/model_saved/conv
      n_epochs: 1000
      optimizer: momentum
      positional_embeddings: true
      rnn_decoder_dim: 512
      rnn_encoder_dim: 256
      save_iter: 500
      summary_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/summary
      test_batch_size: 1
    state:
      MaxPredictLength: 200
      att_dim: 512
      batch_size: 16
      beam_size: 5
      ckpt_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/ckpt
      ckpt_name: seq2seqAtt
      clip_value: 5
      decoder_name: DecoderAtt
      decoding: beams_search
      display_iter: 100
      div_gamma: 1
      div_prob: 0
      droupout: 0.3
      embeding_dims: 80
      encoder_cnn: vanilla
      encoder_name: Encode
      encoder_type: conv
      eval_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/eval
      gpu_fraction: 0.48
      learning_decay_rate: 0.94
      learning_decay_step: 8000
      learning_init: 0.1
      learning_type: exponential
      log_dir: /home/xiaofeng/code/image2katex/log
      log_file_name: Im2Katex.log
      log_name: Im2Katex
      metric_val: perplexity
      model_saved: /home/xiaofeng/data/image2latex/merged/model_saved/conv
      n_epochs: 1000
      optimizer: momentum
      positional_embeddings: true
      rnn_decoder_dim: 512
      rnn_encoder_dim: 256
      save_iter: 500
      summary_dir: /home/xiaofeng/data/image2latex/merged/model_saved/conv/summary
      test_batch_size: 1
  predict: &id005 !!python/object/new:easydict.EasyDict
    dictitems:
      npy_path: ./static/npy
      preprocess_dir: ./static/preprocess
      render_path: ./static/render
      temp_path: ./static
      web_path: ./templates
    state:
      npy_path: ./static/npy
      preprocess_dir: ./static/preprocess
      render_path: ./static/render
      temp_path: ./static
      web_path: ./templates
state:
  dataset: *id003
  model: *id004
  predict: *id005
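
The `dataset` section of both configs defines special token ids (`id_start: 0`, `id_end: 1`, `id_unk: 2`, `id_pad: 3`). The code that applies them is not part of this commit excerpt, so the sketch below is only an assumed illustration of how such ids are typically used to encode and pad a LaTeX token sequence.

```python
# Illustrative only: the repository's own encoding utilities are not shown here.
import numpy as np

ID_START, ID_END, ID_UNK, ID_PAD = 0, 1, 2, 3  # values from the dataset sections above

def encode_formula(tokens, vocab, max_len):
    """Map LaTeX tokens to ids, add start/end markers, and pad to max_len."""
    ids = [ID_START] + [vocab.get(t, ID_UNK) for t in tokens] + [ID_END]
    ids = ids[:max_len]
    return np.array(ids + [ID_PAD] * (max_len - len(ids)), dtype=np.int32)

toy_vocab = {"x": 5, "^": 6, "{": 7, "}": 8, "2": 9}  # hypothetical vocabulary
print(encode_formula(["x", "^", "{", "2", "}"], toy_vocab, max_len=10))
# -> [0 5 6 7 9 8 1 3 3 3]
```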

LICENSE (+21)

@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2019 xiaofeng

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README_en.md (+120)

@@ -0,0 +1,120 @@
# Convert Img To Katex

## Abstract

**Implement an attention model that takes an image of a PDF math formula and outputs the characters of the LaTeX source that generates the formula.**

This is a TensorFlow implementation of the HarvardNLP paper "What You Get Is What You See: A Visual Markup Decompiler". The model architecture is shown below:

<p align="center"><img src="http://lstm.seas.harvard.edu/latex/network.png" width="300"></p>

An example input is a rendered LaTeX formula:

<p align="center"><img src="http://lstm.seas.harvard.edu/latex/results/website/images/119b93a445-orig.png"></p>

The goal is to infer the LaTeX formula that can render such an image:

```
d s _ { 1 1 } ^ { 2 } = d x ^ { + } d x ^ { - } + l _ { p } ^ { 9 } \frac { p _ { - } } { r ^ { 7 } } \delta ( x ^ { - } ) d x ^ { - } d x ^ { - } + d x _ { 1 } ^ { 2 } + \; \cdots \; + d x _ { 9 } ^ { 2 }
```

## Prerequisites

Most of the code is written in TensorFlow, with Python for preprocessing.

#### Preprocess

The preprocessing for this dataset exactly reproduces the original Torch implementation by the HarvardNLP group.

Python packages:

- Pillow
- numpy

Optional: Node.js and KaTeX are used for preprocessing. [Installation](https://nodejs.org/en/)

##### pdflatex [Installation](https://www.tug.org/texlive/)

pdflatex is used for rendering LaTeX during evaluation.

##### ImageMagick convert [Installation](http://www.imagemagick.org/script/index.php)

`convert` is used for rendering LaTeX during evaluation.

- Linux: `sudo apt install imagemagick`
- Linux build from source: https://imagemagick.org/script/install-source.php
- macOS: `brew install imagemagick`

##### Webkit2png [Installation](http://www.paulhammond.org/webkit2png/)

Webkit2png is used for rendering HTML during evaluation.
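
Taken together, the evaluation render step looks roughly like the following sketch. It is only an illustration under assumptions (a `standalone`/`preview` LaTeX template, default ImageMagick flags); the repository's own evaluation scripts may use different templates and options.

```python
# Rough illustration of "render LaTeX, then rasterize it" for evaluation.
# Template, flags, and paths are assumptions, not the repository's actual code.
import pathlib
import subprocess
import tempfile

LATEX_TEMPLATE = r"""\documentclass[preview]{standalone}
\begin{document}
$ %s $
\end{document}
"""

def render_formula(latex: str, out_png: str) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        tex = pathlib.Path(tmp, "formula.tex")
        tex.write_text(LATEX_TEMPLATE % latex)
        # pdflatex renders the formula to formula.pdf inside the temp dir.
        subprocess.run(["pdflatex", "-interaction=nonstopmode",
                        "-output-directory", tmp, str(tex)], check=True)
        # ImageMagick's convert rasterizes the PDF to a PNG for comparison.
        subprocess.run(["convert", "-density", "200",
                        str(pathlib.Path(tmp, "formula.pdf")), out_png], check=True)

render_formula(r"\frac{p_-}{r^7}", "formula.png")
```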
## Make the dataset with your own data

The dataset-building code lives in the `data` directory:

```
cd data
```

For more details, see the readme.md in that folder.
Once the dataset is ready, save it in **npy** format: `train_buckets.npy`, `valid_buckets.npy`, and `test_buckets.npy` can be generated with the **build_imglatex_data.py** script.
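
Reading the prepared files back might look like the sketch below. It assumes the `.npy` files store pickled Python objects (the exact layout is defined by `build_imglatex_data.py` and is not shown in this commit), so treat it as a starting point for inspection rather than the repository's actual loader.

```python
# Assumption: the prepared .npy files contain pickled Python objects,
# so allow_pickle=True is required; inspect the results to see the real layout.
import numpy as np

train_buckets = np.load("train_buckets.npy", allow_pickle=True)
vocab_props = np.load(
    "./data/im2latex_dataset/merged/prepared/handwritten/properties.npy",
    allow_pickle=True)  # vocabulary file referenced by the configs above

print(type(train_buckets), getattr(train_buckets, "shape", None))
print(type(vocab_props))
```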
## Train

```
python3 train_model.py
```

Default hyperparameters used (collected in the sketch below):

* BATCH_SIZE = 32
* EMB_DIM = 80
* ENC_DIM = 256
* DEC_DIM = ENC_DIM * 2
* D = 512 (**channels in feature grid**)
* V = len(vocab) + 3 = (vocab size) + 3
* NB_EPOCHS = 50
* H = 20 (maximum height of feature grid)
* W = 50 (maximum width of feature grid)
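
A compact restatement of those defaults as Python, with the derived values computed. The names are illustrative, and `train_model.py` actually takes its settings from the YAML configs earlier in this commit.

```python
# Illustrative only: collects the defaults listed above into one place.
def default_hparams(vocab_size: int) -> dict:
    enc_dim = 256
    return {
        "batch_size": 32,
        "emb_dim": 80,
        "enc_dim": enc_dim,
        "dec_dim": enc_dim * 2,        # DEC_DIM = ENC_DIM * 2
        "channels": 512,               # D: channels in the CNN feature grid
        "vocab_size": vocab_size + 3,  # V = len(vocab) + 3
        "n_epochs": 50,
        "max_grid_h": 20,              # H: maximum height of the feature grid
        "max_grid_w": 50,              # W: maximum width of the feature grid
    }

print(default_hparams(vocab_size=500)["dec_dim"])  # 512
```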
The train NLL drops to 0.08 after 18 epochs of training on a 24 GB Nvidia M40 GPU.
## Test

```
python3 predict_to_img.py
```

## Evaluate

`attention.py` scores the training and validation sets after each epoch (reporting the mean train NLL and perplexity).
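
Perplexity here is presumably the exponential of the mean token-level negative log-likelihood, the usual definition for sequence models (the exact reduction used in `attention.py` is not shown in this commit):

$$\mathrm{perplexity} = \exp\left(\frac{1}{N}\sum_{i=1}^{N} \mathrm{NLL}_i\right)$$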
#### Scores from this implementation

![results_1](results_1.png)
![results_2](results_2.png)

## Dataset

- Printed style: https://zenodo.org/record/56198#.XA4GjfYzZZj
- Handwritten: http://lstm.seas.harvard.edu/latex/data/

## Weight files

[Google Drive](https://drive.google.com/drive/folders/0BwbIUfIM1M8sc0tEMGk1NGlKZTA?usp=sharing)

## Details of this package

- `backup_predict_to_img.py`: test script for the network structure from the original repository

## Reference

* **OpenAI's Requests for Research problem**: [OpenAI question source](https://openai.com/requests-for-research/#im2)
* [Official resolution](http://lstm.seas.harvard.edu/latex/)
* [Official repo (Torch)](https://github.com/harvardnlp/im2markup)
* [Source paper](https://arxiv.org/pdf/1609.04938v1.pdf)
* [Seq2Seq for LaTeX generation](https://guillaumegenthial.github.io/image-to-latex.html)
* [Original model repo (TensorFlow)](https://github.com/ritheshkumar95/im2latex-tensorflow)
* [Another model repo (TensorFlow)](https://github.com/baoblackcoal/RFR-solution)
* [Zhihu explanation (Chinese)](https://zhuanlan.zhihu.com/p/25031185)
* [Original dataset repo (dataset construction)](https://github.com/Miffyli/im2latex-dataset)
