Error about "Unexpected key(s) in state_dict: "w_junc", "w_heatmap" #85

GGshmily · 2023-07-11T03:11:31Z

Hi, I completed step 1 and got my trained model. For step 2, I commented out gt_source_train and gt_source_test and disabled the photometric and homographic augmentations, but an error occurred:

python -m sold2.experiment --exp_name wireframe_train --mode export --resume_path experiments/sold2_synth/ --model_config sold2/config/train_detector.yaml --dataset_config sold2/config/wireframe_dataset.yaml --checkpoint_name checkpoint-epoch199-end.tar --export_dataset_mode train --export_batch_size 4
[Info] Export mode
Output path: ./datasets/export_datasets/wireframe_train
[Info] Export predictions with homography adaptation.
Initializing dataset and dataloader
[Info] Initializing wireframe dataset...
Found filename cache wireframe_train_cache.pkl at ./datasets/wireframe
Load filename cache...
[Info] Successfully initialized dataset
Name: wireframe
Mode: train
Gt: /media/cqw/KESU/SOLD2/datasets/synthetic_shapes/synthetic_shape_train.h5
Counts: 20000

     Successfully intialized dataset and dataloader.
    --------Initializing model----------
    Model architecture: simple
    Backbone: lcnn
    Junction decoder: superpoint_decoder
    Heatmap decoder: pixel_shuffle
    -------------------------------------

Traceback (most recent call last):
File "/media/cqw/KESU/SOLD2/sold2/export.py", line 23, in restore_weights
model.load_state_dict(state_dict)
File "/home/cqw/anaconda3/envs/SOLD2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for SOLD2Net:
Unexpected key(s) in state_dict: "w_junc", "w_heatmap".

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cqw/anaconda3/envs/SOLD2/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/cqw/anaconda3/envs/SOLD2/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/media/cqw/KESU/SOLD2/sold2/experiment.py", line 227, in <module>
export_dataset_mode=args.export_dataset_mode, device=device)
File "/media/cqw/KESU/SOLD2/sold2/experiment.py", line 116, in main
export(args, dataset_cfg, model_cfg, output_path, export_dataset_mode, device=device)
File "/media/cqw/KESU/SOLD2/sold2/experiment.py", line 92, in export
export_dataset_mode, device)
File "/media/cqw/KESU/SOLD2/sold2/export.py", line 158, in export_homograpy_adaptation
model = restore_weights(model, checkpoint["model_state_dict"])
File "/media/cqw/KESU/SOLD2/sold2/export.py", line 27, in restore_weights
missing_keys = err.missing_keys
AttributeError: 'NoneType' object has no attribute 'missing_keys'

I don't know how to solve it. Can you give me some suggestions?

rpautrat · 2023-07-13T11:27:50Z

Hi, when running Step 1, did you change any parameter in the config/train_detector.yaml file? In particular, did you keep the 'dynamic' policy for the junction and heatmap losses?

GGshmily · 2023-07-13T11:51:31Z

Hi, I didn't change any parameter in the config/train_detector.yaml file. Here are the parameters in my train_detector.yaml:

# [Model parameters]
model_name: "lcnn_simple"
model_architecture: "simple"
# Backbone related config
backbone: "lcnn"
backbone_cfg:
  input_channel: 1 # Use RGB images or grayscale images.
  depth: 4
  num_stacks: 2
  num_blocks: 1
  num_classes: 5
# Junction decoder related config
junction_decoder: "superpoint_decoder"
junc_decoder_cfg:
# Heatmap decoder related config
heatmap_decoder: "pixel_shuffle"
heatmap_decoder_cfg:
# Shared configurations
grid_size: 8
keep_border_valid: True
# Threshold of junction detection
detection_thresh: 0.0153846 # 1/65
# Threshold of heatmap detection
prob_thresh: 0.5

# [Loss parameters]
weighting_policy: "dynamic"
# [Heatmap loss]
w_heatmap: 0.
w_heatmap_class: 1
heatmap_loss_func: "cross_entropy"
heatmap_loss_cfg:
  policy: "dynamic"
# [Junction loss]
w_junc: 0.
junction_loss_func: "superpoint"
junction_loss_cfg:
  policy: "dynamic"

# [Training parameters]
learning_rate: 0.0005
epochs: 200
train:
  batch_size: 6
  num_workers: 8
test:
  batch_size: 6
  num_workers: 8
disp_freq: 100
summary_freq: 200
max_ckpt: 150

So the policy for both the junction and heatmap losses is 'dynamic'.

rpautrat · 2023-07-14T14:22:39Z

What is your torch version? It might just be a compatibility issue.

Note that a quick fix will probably solve this issue: replace lines 22 to 36 of sold2/export.py with the single line model.load_state_dict(state_dict, strict=False).
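
For readers on a torch version where strict=False does not behave as expected, a possible alternative is to drop the unexpected entries from the checkpoint before loading it. The following is only a minimal sketch under the assumption that the two keys named in the error are the only mismatch. The helper name filter_state_dict and the toy keys "backbone.weight" and "backbone.bias" are hypothetical and not part of SOLD2, and plain strings stand in for tensors so that the sketch runs without torch.

```python
def filter_state_dict(state_dict, expected_keys):
    """Split a checkpoint state dict into loadable and unexpected entries.

    Hypothetical helper (not part of SOLD2): keeps only the entries whose
    key is in expected_keys and reports the keys that were dropped.
    """
    expected = set(expected_keys)
    kept = {key: value for key, value in state_dict.items() if key in expected}
    dropped = sorted(key for key in state_dict if key not in expected)
    return kept, dropped


# Toy stand-ins: a real state dict maps parameter names to tensors.
checkpoint_state = {
    "backbone.weight": "tensor-a",
    "backbone.bias": "tensor-b",
    "w_junc": "tensor-c",
    "w_heatmap": "tensor-d",
}
model_keys = ["backbone.weight", "backbone.bias"]

kept, dropped = filter_state_dict(checkpoint_state, model_keys)
print("dropped:", dropped)

# With a real model, the call would look roughly like this:
#   kept, _ = filter_state_dict(state_dict, model.state_dict().keys())
#   model.load_state_dict(kept)
```

Unlike strict=False, this filtering still lets a strict load complain about keys that are missing from the checkpoint, which can help catch a genuinely incompatible model.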

GGshmily · 2023-07-23T07:45:17Z

Hi, thank you for your reply. I retrained the model and finished step 1, and the model seems to work. But when running step 2, this error occurs:
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/media/jimlee/65F33762C14D581B/SOLD2/sold2/dataset/wireframe_dataset.py", line 953, in __getitem__
exported_label = parse_h5_data(f[file_key])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/site-packages/h5py/_hl/group.py", line 305, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object '000000' doesn't exist)"

I used the wireframe dataset you provided, and I checked './datasets/wireframe/train': there is no picture named '000000'. Can you give me some suggestions? Sorry to bother you.

rpautrat · 2023-07-25T09:26:17Z

Hi, did you keep the fields 'gt_source_train' and 'gt_source_test' commented in config/wireframe_dataset.yaml as requested in the ReadMe? This is necessary to export the pseudo ground truth.
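
A quick way to double-check this before launching the export could look like the sketch below. It assumes the dataset config has already been loaded into a plain dict (for example with a YAML parser). The helper name active_gt_sources, the "dataset_name" key, and the file name in the toy configs are hypothetical; only the two field names come from this thread.

```python
# Fields that should stay commented out while exporting the pseudo ground truth.
GT_SOURCE_FIELDS = ("gt_source_train", "gt_source_test")


def active_gt_sources(dataset_cfg):
    """Return the gt_source fields that are still set in a loaded config dict.

    A commented-out YAML field is simply absent from the dict, and a field
    left with an empty value is loaded as None, so both count as inactive.
    """
    return [name for name in GT_SOURCE_FIELDS
            if dataset_cfg.get(name) is not None]


# Toy configs standing in for a loaded wireframe_dataset.yaml.
cfg_with_labels = {"dataset_name": "wireframe",
                   "gt_source_train": "some_exported_labels.h5"}
cfg_for_export = {"dataset_name": "wireframe"}

print(active_gt_sources(cfg_with_labels))
print(active_gt_sources(cfg_for_export))
```

An empty list means neither field is active, which is the state the export step expects according to the comment above.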

GGshmily · 2023-07-30T02:24:36Z

Hi, thank you for your help. I finished step 2. But when running step 3, this error occurs:

Traceback (most recent call last):
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/jimlee/anaconda3/envs/DeepLSD/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/media/jimlee/65F33762C14D581B/SOLD2/sold2/postprocess/convert_homography_results.py", line 122, in <module>
junctions_pred_raw, heatmap_pred, device=device)
File "/media/jimlee/65F33762C14D581B/SOLD2/sold2/model/line_detection.py", line 101, in detect
self.heatmap_refine_cfg["valid_thresh"]
File "/media/jimlee/65F33762C14D581B/SOLD2/sold2/model/line_detection.py", line 262, in refine_heatmap_local
heatmap_output = torch.clamp((heatmap_output / count_map).float(),
RuntimeError: expected device cuda:0 and dtype Float but got device cuda:0 and dtype Int

My PyTorch version is 1.4.0. Sorry to bother you once again. Can you give me some suggestions?

rpautrat · 2023-07-31T07:20:30Z

Hi, I just pushed a small fix. Can you try again with the latest version of the code?

GGshmily closed this as completed Jul 13, 2023

GGshmily reopened this Jul 13, 2023
