Help with workaround for issue 581, ChatML chat format causing assertion error #585

chris-cortner · 2024-01-11T17:11:01Z

chris-cortner
Jan 11, 2024

My project is currently blocked on this. I haven't found a way to use this prompt format. Can someone help me with a workaround in the short term?

Thanks!

Issue #581

Copying the text here:

I'm trying to apply dolphin mistral's prompt template format:

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{user_prompt}<|im_end|>
<|im_start|>assistant

I've tried this a couple of different ways:

quant_path = "TheBloke/dolphin-2.6-mistral-7B-AWQ"
lm = models.Transformers(quant_path, device_map="auto")
stop_char = '"'
prompt_template = '<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n'

lm2 = lm + (prompt_template.format(system_prompt="You are a helpful AI", prompt="What is the distance to mars?")

f'The distance to mars is "{gen("answer", max_tokens=500, stop=stop_char, temperature=0.7)}"')

And by using TransformersChat:

quant_path = "TheBloke/dolphin-2.6-mistral-7B-AWQ"
lm = models.TransformersChat(quant_path, device_map="auto")
stop_char = '"'

with system():
lm2 = lm + "You are a helpful AI"

with user():
lm2 += "What is the distance to mars?"

with assistant():
lm2 += 'The distance to mars is "' + gen("answer", max_tokens=500, stop=stop_char, temperature=0.8)

Both method produce the same error:

An assertion error is thrown in _cleanup_tokens in _model.py

Traceback (most recent call last):
File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 309, in add
out = lm + partial_grammar

File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 317, in add
out = lm._run_stateless(value)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 482, in _run_stateless
for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 798, in call
token_ids,token_byte_positions = self._cleanup_tokens(token_ids, token_byte_positions)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 628, in _cleanup_tokens
assert token_byte_positions[-1] == last_pos
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Seems to work fine with neural-chat, which is Mistral based but uses a different system prompt format:

"### System:\n{system_prompt}\n\n### User:\n{prompt}\n\n### Assistant:\n"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Help with workaround for issue 581, ChatML chat format causing assertion error #585

{{title}}

Replies: 0 comments

Select a reply

Help with workaround for issue 581, ChatML chat format causing assertion error #585

chris-cortner Jan 11, 2024

Replies: 0 comments

chris-cortner
Jan 11, 2024