Issue when running test.py with the provided checkpoint file #37

Open
XuchenWang opened this issue Aug 23, 2020 · 6 comments

Comments

@XuchenWang

XuchenWang commented Aug 23, 2020

I downloaded the pre-trained checkpoint from Google Drive and changed 'init_from' to the path of the .ckpt file. However, when I tried to run test.py, I got the following error:

Restoring model from /data/congzou/xuchenwang/BidirectionalAttensiveFusion/final_results/model.ckpt
Traceback (most recent call last):
  File "test.py", line 280, in <module>
    test(options)
  File "test.py", line 80, in test
    saver.restore(sess, options['init_from'])
  File "/data/congzou/xuchenwang/venv2.7t/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1428, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/data/congzou/xuchenwang/venv2.7t/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/data/congzou/xuchenwang/venv2.7t/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/data/congzou/xuchenwang/venv2.7t/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/data/congzou/xuchenwang/venv2.7t/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "caption_module/multi_rnn_cell/cell_1/lstm_cell/biases" not found in checkpoint files /data/congzou/xuchenwang/BidirectionalAttensiveFusion/final_results/model.ckpt
  [[Node: save/RestoreV2_12 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_12/tensor_names, save/RestoreV2_12/shape_and_slices)]]

I notice that the .ckpt file does contain those key values, so I wonder what is causing this error. If possible, could you please provide a guide to loading the pre-trained model?
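
One way to check whether the checkpoint really stores the name the error mentions is to list the names in the checkpoint and compare them with the names the graph requests. The sketch below is not from the repository: the function names are hypothetical, and it assumes `tf.train.list_variables` is available, which holds for later TensorFlow 1.x releases and 2.x but may not hold for the older version shown in the traceback.

```python
def list_checkpoint_names(ckpt_prefix):
    """Return the variable names stored in a TensorFlow checkpoint."""
    # Imported lazily so the comparison helper below works without TensorFlow.
    import tensorflow as tf
    return [name for name, _shape in tf.train.list_variables(ckpt_prefix)]


def names_missing_from_checkpoint(graph_names, ckpt_names):
    """Return the names the graph asks for that the checkpoint does not store."""
    stored = set(ckpt_names)
    return sorted(name for name in graph_names if name not in stored)
```

With the real model, `graph_names` could be collected from the variables that `tf.train.Saver()` is about to restore, and `ckpt_names` from `list_checkpoint_names(options['init_from'])`. An empty result would mean the names match and the problem lies elsewhere.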

@KPKWCE

KPKWCE commented Sep 1, 2020

Dear xuchenwang,

Could you please tell me which Google Drive link you downloaded the pre-trained checkpoint from?

@XuchenWang
Author

Hi,

Here is the link I downloaded the checkpoint from: https://drive.google.com/drive/folders/1qeH5r5XEabkcQDJ25unSCvEUziRleN80

@KPKWCE

KPKWCE commented Sep 2, 2020


Hey, thank you so much, XuchenWang.

I downloaded these files and added them to the checkpoints folder, but I am still getting an error when I run test.py:

Restoring model from
ValueError Traceback (most recent call last)
in
3 tf.reset_default_graph()
4 options = default_options()
----> 5 test(options)

in test(options)
20 print('Restoring model from %s'%options['init_from'])
21 saver = tf.train.Saver()
---> 22 saver.restore(sess, options['init_from'])
23
24 word2ix = options['vocab']

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py in restore(self, sess, save_path)
1280 if not checkpoint_management.checkpoint_exists_internal(checkpoint_prefix):
1281 raise ValueError("The passed save_path is not a valid checkpoint: " +
-> 1282 checkpoint_prefix)
1283
1284 logging.info("Restoring parameters from %s", checkpoint_prefix)

ValueError: The passed save_path is not a valid checkpoint:

Could you please help me with this issue?
Thanks for your time and great work!
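
In the traceback above, both "Restoring model from" and the ValueError message end without a path, which may mean that `options['init_from']` was an empty string when `saver.restore` was called. A minimal sketch for checking a checkpoint prefix before restoring follows. It assumes the common TensorFlow layout in which a prefix such as `checkpoints/model.ckpt` is accompanied by a `model.ckpt.index` file. The function name and the return labels are hypothetical, not part of the repository.

```python
import os


def describe_checkpoint_prefix(prefix):
    """Classify a checkpoint prefix without loading TensorFlow."""
    if not prefix:
        return "empty"      # nothing was passed, e.g. init_from left unset
    if os.path.exists(prefix + ".index"):
        return "v2"         # prefix with an accompanying .index file
    if os.path.isdir(prefix):
        return "directory"  # a folder was passed instead of a prefix
    if os.path.isfile(prefix):
        return "file"       # a single file, possibly an older checkpoint format
    return "missing"        # nothing on disk matches this prefix
```

Calling `describe_checkpoint_prefix(options['init_from'])` before `saver.restore` would distinguish an unset path from a path that points at the folder rather than at the `model.ckpt` prefix inside it.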

@XuchenWang
Author

Hi KPKWCE,

Which save_path did you pass to the saver? My test.py also failed at this line, but with a different error, so I'm not sure I can solve your problem.

@KPKWCE

KPKWCE commented Sep 8, 2020

Hello XuchenWang,
Actually, I was trying to run test.py directly, but there was no epoch file in the checkpoint folder, so I got an error.
So I first trained with train.py, but it only generated status and event files, and no epoch file.
Please help: why is the epoch file not generated?

@galrapo

galrapo commented Sep 26, 2020

Hi XuchenWang,

Same here!

I get the following error:
NotFoundError (see above for traceback): Tensor name "caption_module/multi_rnn_cell/cell_1/lstm_cell/weights" not found in checkpoint files checkpoints/model.ckpt

Any chance you can help with resolving it?

Thanks in advance!
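
One possible cause, stated here as an assumption rather than a confirmed diagnosis: TensorFlow 1.2 renamed RNN cell variables from "weights" and "biases" to "kernel" and "bias", so a checkpoint saved with a newer release would not contain the names an older release looks up. If that is what happened, a name map along these lines might bridge the two conventions. The function names are hypothetical.

```python
# Old-style leaf names (before TensorFlow 1.2) and their newer equivalents.
_OLD_TO_NEW = {"weights": "kernel", "biases": "bias"}


def checkpoint_name_for(graph_name):
    """Translate an old-style RNN cell variable name to the newer style.

    Only the last path component changes, and only for variables that sit
    directly under a scope ending in "_cell" (for example "lstm_cell").
    """
    head, sep, leaf = graph_name.rpartition("/")
    if leaf in _OLD_TO_NEW and head.endswith("_cell"):
        return head + sep + _OLD_TO_NEW[leaf]
    return graph_name


def build_restore_map(graph_names):
    """Map each name expected in the checkpoint to the graph name it feeds."""
    return {checkpoint_name_for(name): name for name in graph_names}
```

`tf.train.Saver` accepts a `var_list` dictionary whose keys are names in the checkpoint and whose values are the variables to restore, so the graph names on the right-hand side of this map would need to be replaced by the corresponding variable objects before passing it in. If the checkpoint and the graph turn out to use the same names, this mapping is unnecessary.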
