Inference Issue #4
Hello, as mentioned in the guide in README.EN.md, please make sure that the width and height of the input image are both multiples of 8. If not, please pad the image. Hope this helps.
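The multiples-of-8 requirement above can be handled with a small padding helper. A minimal NumPy sketch; the function name and the choice of edge replication are illustrative, not from the repo:

```python
import numpy as np

def pad_to_multiple_of_8(img: np.ndarray) -> np.ndarray:
    """Pad the first two axes (H, W) up to the next multiple of 8.

    Edge replication is used so no hard black border is introduced;
    the model's output can be cropped back to the original size afterwards.
    """
    h, w = img.shape[:2]
    pad_h = (-h) % 8  # 0 if h is already a multiple of 8
    pad_w = (-w) % 8
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad_width, mode="edge")

img = np.zeros((75, 76, 3), dtype=np.uint8)
print(pad_to_multiple_of_8(img).shape)  # (80, 80, 3)
```

Remember to crop the output back to the original height and width after inference.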
Can we set the DPM solver to true during inference for better results? We are not getting results as good as those reported in your paper. Also, can you tell me how many steps the model was trained for, and which pre-trained model weights are provided in the Git repo?
Our DocDiff model was trained with 100 time steps. Using the DPM solver does not improve the results. Currently, we have not uploaded the jump-step sampling based on DDIM (but you can implement it from the paper; just save x_0 at each step of the sampling process). Hence, please use 100 steps for inference; otherwise you may get strange outputs due to the mismatch between noise intensity and T. As for not getting the results described in the paper, we suspect it is caused by the input data. We will soon upload some demo images and an inference demo notebook for easy reproduction. If you still cannot obtain the desired results, you can download the deblurring dataset from http://www.fit.vutbr.cz/~ihradis/CNN-Deblur/ and simply split the images into degraded inputs and ground truth, training at 128*128 resolution with a batch size of 32 (or train on your own dataset). This requires only 12 GB of GPU memory and takes only a day and a half to run 1 million iterations on a 3090 GPU. We hope this helps.
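The training setup described above (aligned random 128x128 crops from paired images, batch size 32) can be sketched as follows. This is illustrative only; the function and variable names are not DocDiff's actual data loader:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_batch(gt, degraded, patch=128, batch_size=32, rng=rng):
    """Sample aligned random patches from a ground-truth/degraded image pair.

    Both crops use the same (y, x) offset so the pair stays pixel-aligned,
    which is required for supervised restoration training.
    """
    h, w = gt.shape[:2]
    gts, degs = [], []
    for _ in range(batch_size):
        y = rng.integers(0, h - patch + 1)
        x = rng.integers(0, w - patch + 1)
        gts.append(gt[y:y + patch, x:x + patch])
        degs.append(degraded[y:y + patch, x:x + patch])
    return np.stack(gts), np.stack(degs)

gt = np.zeros((300, 400, 3), dtype=np.uint8)
blurred = np.zeros_like(gt)
b_gt, b_deg = sample_batch(gt, blurred)
print(b_gt.shape)  # (32, 128, 128, 3)
```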
Moreover, during training you can track the current performance of your model with the images in the "./Training" folder. Usually, at around 10,000 steps the model starts producing reasonable outputs; after 100,000 steps the performance becomes more stable, and the model converges at around 1 million steps.
I am using 100 steps for inference. Can you please upload the inference notebook and the demo images as soon as possible?
For sure.
Do I need to change anything in the config file for better results? I am using exactly the same code as in the Git repo.
The inference notebook has been uploaded. If you find it useful, please give it a star. Thank you.
No need to change anything.
I am certain that the issue is due to the pre-trained weights I provided being trained on the deblurring dataset (http://www.fit.vutbr.cz/~ihradis/CNN-Deblur/). This dataset contains images with mostly pure white backgrounds and black text, which has a significantly different pixel distribution from the test samples you provided (which have a grayish color). Furthermore, I did not perform any color data augmentation during training, which leads to the results you observed. I suggest two solutions:
You can refer to the method described on the ninth page of the paper "Convolutional Neural Networks for Direct Text Deblurring" (https://www.fit.vut.cz/research/publication-file/10922/hradis15CNNdeblurring.pdf) for testing on real photos.
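Regarding the missing color augmentation mentioned above, one simple option is to jitter brightness and contrast during training so that white-background data also covers grayish inputs. This is a hypothetical sketch; the parameter ranges are guesses, not values from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_color_jitter(img: np.ndarray, rng=rng) -> np.ndarray:
    """Randomly flatten contrast and shift brightness of a uint8 image.

    Pulling white backgrounds toward gray simulates camera-captured or
    compressed inputs; ranges here are illustrative starting points.
    """
    x = img.astype(np.float32) / 255.0
    contrast = rng.uniform(0.7, 1.0)      # < 1 flattens contrast toward gray
    brightness = rng.uniform(-0.15, 0.05) # small global shift
    out = np.clip(x * contrast + brightness, 0.0, 1.0)
    return (out * 255.0).astype(np.uint8)
```

Apply the same jitter to the degraded input only (not the ground truth) if you want the model to learn to normalize the background color.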
I want to resume my training from your given checkpoint. Can you please tell me the learning rate and config details? I am currently using the config below:

# model
IMAGE_SIZE : [304, 304] # load image size; in train mode it is randomly cropped to IMAGE_SIZE, in test mode it is resized to IMAGE_SIZE
CHANNEL_X : 3 # input channel
MODE : 1 # 1 Train, 0 Test
# train
PATH_GT : '' # path of ground truth
The default config is suitable for the document scenario. No need to change it. The training will be very stable.
Should I change the LR from 0.0001 to 1e-5, or something lower?
1e-4 is OK. Lower means longer training time, and performance will not be better.
I resumed training with the above config on my custom data for 281,218 steps, but I didn't get the required result. How long do I need to train the model?
Take a look at the output images of the training process.
I will try to train it from scratch, but can you please help with writing a separate module or a list of augmentations for these types of images, so that the model will predict them well?
I believe that it may be helpful for you to analyze the characteristics of your sample and consider designing additional modules to improve the overall outcome. This process typically involves a significant amount of trial and error in order to achieve the desired results. |
I don't think so. This is some kind of compression that WhatsApp applied when sending the image. Can you please help urgently?
Sure, I will do my best to assist you as soon as possible. |
Any update regarding this?
This issue occurs during the inference phase of this model for every image:
File "/home/mepluser1/rahul_hanot/try_new/DocDiff/model/DocDiff.py", line 315, in forward
x = torch.cat((x, s), dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 76 but got size 75 for tensor number 1 in the list.
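This size mismatch is the multiples-of-8 problem mentioned at the top of the thread: in a U-Net-style model, the encoder halves the spatial size at each stage while the decoder doubles it back, and a dimension that is not a multiple of 2^levels produces skip connections of different sizes at `torch.cat`. A small arithmetic sketch, assuming ceil-halving per stride-2 stage (the exact rounding may differ from DocDiff's convolutions):

```python
import math

def skip_pairs(w, levels=3):
    """Simulate encoder/decoder sizes along one spatial dimension.

    Encoder: ceil-halve `levels` times, remembering each pre-halving size
    as a skip connection. Decoder: double back up, pairing each decoder
    size with its skip. A mismatched pair triggers the torch.cat error.
    """
    skips = []
    for _ in range(levels):
        skips.append(w)
        w = math.ceil(w / 2)
    pairs = []
    for skip in reversed(skips):
        w = w * 2
        pairs.append((w, skip))
    return pairs

print(skip_pairs(600))  # every pair matches: 600 is a multiple of 8
print(skip_pairs(150))  # contains (76, 75): the mismatch in the traceback
```

Padding the input so both dimensions are multiples of 8, as suggested earlier, makes every pair match.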