First, let me express real respect to Stanford for developing such a meaningful AI tool for bioinfo/med. I'm looking forward to the next version matching what's shown in the demo video.
I believe many people hit this problem when using Biomni: parsing errors and getting stuck in a loop. I've also run into it several times recently.
I would say that, given the design philosophy behind how the tool is written inside Biomni, this problem basically cannot be solved with just a few lines of code.
On the website, the left side is the chat history and the right side is the process panel, which can accurately show executed code and pictures. It works because they wrote a prompt telling the LLM that every response should contain 'think', 'execute', and 'response' fields; the tool then detects these exact keyword fields and routes the pictures/code/responses to the right place. (Note: the way they normalize the answers is to ask the LLM itself to format its response.)
In my opinion, there are two problems here.
- An LLM is an LLM: sometimes it will not follow your instructions 100%, even if the prompt reminds it every single time.
- Different LLMs have different response patterns. The models Biomni's prompt was written for (or that R0 was trained on, I'm not sure) are ChatGPT/Claude, whose responses contain the 'think' keyword. When I use bigmodel's GLM 4.7, the response uses 'reasoning' instead, which confuses the identifier tool: the standard GLM 4.7 response is 'reasoning xxxx', and even if you prompt it to please answer with 'think', it will still respond with 'reasoning'. The identifier never detects the 'think' keyword, so the whole task gets stuck in a loop.
So, from my experience, I added a few lines of code so the identifier can also detect the 'reasoning' keyword.
This is definitely not the only problem that can cause loop errors. For example, when some package is not installed, the agent will also get stuck in a loop. So I added some code along these lines: when you get a missing-package error, call the API LLM (as distinct from R0) and ask it for a conda install command, then use conda list to check whether the install succeeded; if not, use pip install and check with pip show, and then re-run the failed task.
The above only applies when you use the API LLM.
For me it works well so far.
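The install-and-verify fallback can be sketched like this. Assumptions to note: in my actual patch the conda command comes from the API LLM, whereas here it is a plain `conda install`, and the function name `ensure_package` is made up for illustration.

```python
import shutil
import subprocess
import sys

def ensure_package(pkg: str, try_conda: bool = True) -> bool:
    """Try conda first, fall back to pip, verifying each attempt.
    Returns True when the package is confirmed installed, so the
    caller can re-run the task that originally failed."""
    if try_conda and shutil.which("conda"):
        subprocess.run(["conda", "install", "-y", pkg], capture_output=True)
        # Verify with `conda list` before trusting the install
        listed = subprocess.run(["conda", "list", pkg], capture_output=True, text=True)
        if pkg in listed.stdout:
            return True
    # Fallback: pip install, then verify with `pip show`
    subprocess.run([sys.executable, "-m", "pip", "install", pkg], capture_output=True)
    shown = subprocess.run([sys.executable, "-m", "pip", "show", pkg], capture_output=True)
    return shown.returncode == 0
```

The verify-after-install steps matter: without them the agent happily re-runs the task against a package that never actually installed, and you are back in the loop.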
Then today I switched to the hybrid mode (API + R0), and the problem popped up again.
It frequently gets stuck halfway through a task. Checking the logs, when some code errors out, R0 tells itself to try different code, but then actually runs the same code and hits the same error shown in the logs, which is really frustrating. I think the basic problem is that R0 is not as smart as API models like GLM-4.7 or Claude Opus 4.5; R0 cannot solve some problems by itself through self-iteration.
So currently I'm trying to add some rescue steps: if R0 is stuck or tries the same code several times (i.e., stuck in a loop), ask the API model for help.
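A minimal sketch of that rescue trigger, assuming the agent loop can show us each code attempt and its error. The `LoopGuard` class and the threshold of 3 are my own hypothetical choices, not anything from Biomni's code:

```python
from collections import Counter

class LoopGuard:
    """Detect when the local model (R0) keeps re-submitting the same
    failing code, and signal that the API model should take over."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.attempts = Counter()

    def should_escalate(self, code: str, error: str) -> bool:
        # Key on the (code, error) pair: the same code producing the
        # same error means the model is making no progress.
        self.attempts[(code, error)] += 1
        return self.attempts[(code, error)] >= self.max_repeats

guard = LoopGuard()
for _ in range(3):
    stuck = guard.should_escalate("df.head()", "NameError: name 'df' is not defined")
print(stuck)  # True after three identical failures -> hand off to the API model
```

Keying on the (code, error) pair rather than just the error lets a genuinely new attempt reset the count.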
I'm just sharing my experience for anyone who runs into a similar problem. In my opinion, when you switch to a different LLM whose response structure is different, you need to modify the code yourself.
In my repo there is a patched Biomni that works well with GLM-4.7; I haven't had time to test other models.
It would be fantastic if someone could come up with a method that fixes this once and for all!
May the Force be with you!