
Problem when running BERT #18

Open

LLawlietc opened this issue Apr 9, 2020 · 7 comments


@LLawlietc

Traceback (most recent call last):
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 41, in load_data
    word, tag = line.split()
ValueError: not enough values to unpack (expected 2, got 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 119, in <module>
    bert_data_util = BertDataUtils(tokenizer)
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 26, in __init__
    self.load_data()
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 49, in load_data
    inputs_ids = self.tokenizer.convert_tokens_to_ids(ntokens)
  File "D:\dogtime\mission\BiGRU_crf\bert_base\bert\tokenization.py", line 179, in convert_tokens_to_ids
    return convert_by_vocab(self.vocab, tokens)
  File "D:\dogtime\mission\BiGRU_crf\bert_base\bert\tokenization.py", line 140, in convert_by_vocab
    output.append(vocab[item])
KeyError: 'D'

@LLawlietc
Author

Hello, author. Following your instructions, everything runs fine without BERT, but when the BERT model is enabled the error above is raised. Debugging shows the failure happens once the loop reaches the end of the data; inputs_ids = self.tokenizer.convert_tokens_to_ids(ntokens) seems to be the offending line, but I can't find what is actually wrong.

@yanwii
Owner

yanwii commented Apr 10, 2020

The error can be localized to:

File "D:\dogtime\mission\BiGRU_crf\bert_base\bert\tokenization.py", line 140, in convert_by_vocab
    output.append(vocab[item])
KeyError: 'D'

So vocab has no key 'D'. Change the lookup to:

vocab.get(item, UNK_TOKEN_INDEX)
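
A minimal sketch of that change inside convert_by_vocab, assuming vocab is the token-to-id dict loaded from vocab.txt and that it contains "[UNK]" (as the standard BERT releases do), so the UNK_TOKEN_INDEX named above can be looked up rather than hard-coded:

```python
# Sketch of the fix suggested above, in bert_base/bert/tokenization.py.
# Assumes vocab maps token -> id and that "[UNK]" exists in vocab.txt.
def convert_by_vocab(vocab, items):
    """Convert tokens/ids using the vocab, mapping out-of-vocabulary
    items to [UNK] instead of raising KeyError."""
    unk_index = vocab["[UNK]"]  # stands in for UNK_TOKEN_INDEX above
    output = []
    for item in items:
        output.append(vocab.get(item, unk_index))
    return output
```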

@LLawlietc
Author

train data: 2757
nums of tags: 9
D:\Anaconda\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py:97: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-04-17 17:21:54.547394: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-04-17 17:21:54.680174: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2020-04-17 17:21:54.688171: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "model.py", line 503, in <module>
    model.train()
  File "model.py", line 329, in train
    ARGS.init_checkpoint)
  File "D:\dogtime\mission\论文相关\资源\BiGRU_crf\bert_base\bert\modeling.py", line 330, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "D:\Anaconda\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\training\checkpoint_utils.py", line 89, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "D:\Anaconda\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\training\checkpoint_utils.py", line 60, in load_checkpoint
    return pywrap_tensorflow.NewCheckpointReader(filename)
  File "D:\Anaconda\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 225, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "D:\Anaconda\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.DataLossError: file is too short to be an sstable

@LLawlietc
Author

Hello, author. After making the change you suggested, running now fails with "file is too short to be an sstable". I created my directory layout following your earlier issue, as shown in the screenshot below; I searched online and found no good solution. Thanks for your help.
(screenshot: directory layout)
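
"file is too short to be an sstable" typically means init_checkpoint does not point at a valid TensorFlow checkpoint, for example an incomplete download or the wrong file. A minimal sketch for sanity-checking the checkpoint prefix before training; the path below is hypothetical, not the repo's actual layout:

```python
# Hypothetical sanity check: list the variables stored in the BERT
# checkpoint before training. A valid init_checkpoint is the common
# prefix of the bert_model.ckpt.* files (note: no file extension).
import tensorflow as tf

init_checkpoint = "bert_base/chinese_L-12_H-768_A-12/bert_model.ckpt"
for name, shape in tf.train.list_variables(init_checkpoint):
    print(name, shape)
```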

@aleien95

> Hello, author. After making the change you suggested, running now fails with "file is too short to be an sstable". I created my directory layout following your earlier issue, as shown in the screenshot; I searched online and found no good solution. Thanks for your help.

Friend, this BERT vocabulary is uncased, so the uppercase letters in your training set cannot be found in the vocab. A simple fix is to convert the uppercase letters in your dataset to lowercase, as in the sketch below.
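
A minimal sketch of that workaround, assuming the training file uses one whitespace-separated "word tag" pair per line (as the word, tag = line.split() in the first traceback suggests); the file names are hypothetical:

```python
# Hypothetical preprocessing step: lowercase the word column of a
# "word tag" per-line training file so every token can be found in
# an uncased BERT vocab. File names here are examples only.
with open("data/train.txt", encoding="utf-8") as fin, \
        open("data/train_lower.txt", "w", encoding="utf-8") as fout:
    for line in fin:
        line = line.strip()
        if not line:  # keep the blank lines that separate sentences
            fout.write("\n")
            continue
        word, tag = line.split()
        fout.write("%s %s\n" % (word.lower(), tag))
```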

@baiyewww

> Hello, author. After making the change you suggested, running now fails with "file is too short to be an sstable". I created my directory layout following your earlier issue, as shown in the screenshot; I searched online and found no good solution. Thanks for your help.

Hello, have you solved the problem with running it? My run also hits some issues. What files are supposed to be in the bert_base directory?

@hexieojie

Hello, where can I get this bert_base?
