
[Issue][Discussion]: Use of "name" field for messages #2989

Open
marklysze opened this issue Jun 21, 2024 · 2 comments
Labels
- alt-models: pertains to using alternate, non-GPT models (e.g., local models, llama, etc.)
- dev: development experience/productivity
- group chat: group-chat-related issues
- robustness: issues/PRs related to robustness

Comments

@marklysze
Collaborator

marklysze commented Jun 21, 2024

Describe the issue

This issue is a place to discuss the impact of not being able to rely on the name field on messages, and existing or proposed solutions to address it.


The name field on messages (which contains the agent's name) is optionally supported by OpenAI for inference; however, it is not used (or cannot be included) on messages sent through other clients (such as local inference, or cloud inference through Anthropic, Mistral AI, Together.AI, Groq, etc.).
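To make this concrete, here is what the optional name field looks like in an OpenAI-style chat-completion payload (the agent names below are illustrative, not from AutoGen):

```python
# OpenAI-style chat messages: the optional "name" field identifies the
# speaker. Many non-OpenAI providers reject or silently drop this field.
messages = [
    {"role": "system", "content": "You are coordinating a group chat."},
    {"role": "user", "name": "digital_marketer", "content": "Here is a campaign draft."},
    {"role": "user", "name": "reviewer", "content": "Please tighten the headline."},
]

# A provider that ignores "name" sees only role + content, so the two
# user messages become indistinguishable by speaker.
speakers = [m.get("name") for m in messages if m["role"] == "user"]
```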

This creates a hidden challenge for AutoGen developers, who may assume that their messages carry the agent's name and build workflows/prompts that refer to it.

My experience with speaker selection in group chat is that, without the name available to the LLM, it can be difficult for the LLM to determine the next speaker when the prompt to select the next speaker depends on who has spoken already. For example, a speaker selection prompt of "Once A, E, or F have spoken select H to review and provide feedback." would be challenging to adhere to if no message carries a name.

It would be great to get your thoughts on what not having name impacts (or doesn't) and ideas to solve it!


An initial thought is to provide a simple way to optionally include the agent's name at the start of each message, e.g.

dm = ConversableAgent(
    "digital_marketer",
    add_name_to_messages=True,  # proposed parameter, does not exist yet
    ...
)
# Resulting in messages where content is "digital_marketer said: \n Some ideas to ..."

(one challenge of this approach is that the LLM treats these messages as the format it should respond in, and its responses then all follow that format!)
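The same idea can be sketched outside the agent constructor as a message transform that folds the name into the content just before sending; `inline_names` below is a hypothetical helper for illustration, not AutoGen API:

```python
def inline_names(messages):
    """Fold each message's "name" into its content, for providers that
    ignore or reject the "name" field. Hypothetical helper, not AutoGen API."""
    out = []
    for msg in messages:
        msg = dict(msg)  # shallow copy; don't mutate the caller's messages
        name = msg.pop("name", None)
        if name and msg.get("role") != "system":
            msg["content"] = f"{name} said:\n{msg['content']}"
        out.append(msg)
    return out

transformed = inline_names(
    [{"role": "user", "name": "digital_marketer", "content": "Some ideas to ..."}]
)
```

A transform like this keeps the name information visible to any provider, but it runs into exactly the challenge noted above: the LLM tends to mimic the "X said:" prefix in its own replies.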


Related issues/PRs:
#2635
#2457

Note: this issue relates to chat messages and not function/tool messages.

@marklysze marklysze added group chat group-chat-related issues robustness issues/pr related to robustness dev development experience/productivity alt-models Pertains to using alternate, non-GPT, models (e.g., local models, llama, etc.) labels Jun 21, 2024
@yonitjio

yonitjio commented Jun 21, 2024

Please keep the name field. Besides agent selection, the name field is also useful for group chat history, i.e., pausing and resuming the group chat.

Also, for future-proofing, perhaps something like multi-user chat. I have no idea what it's going to be, or how it will differ from group chat, but having more data is good.

Sure, there are other ways, but with the name field it's easier.

As for non-OpenAI, IMO, if the name field is not used and/or causes issues, it should be handled in the client module logic.
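Handling this in the client module could be as simple as sanitizing outgoing messages for providers that don't accept the name field; a minimal sketch (hypothetical helper, not any existing client's API):

```python
def sanitize_for_provider(messages, supports_name_field):
    """Drop the "name" key from each message when the target provider
    does not accept it. Hypothetical client-module helper, not an
    existing AutoGen function."""
    if supports_name_field:
        return messages
    return [{k: v for k, v in m.items() if k != "name"} for m in messages]

cleaned = sanitize_for_provider(
    [{"role": "user", "name": "reviewer", "content": "Looks good."}],
    supports_name_field=False,
)
```

This keeps the name field in core AutoGen message objects while letting each client decide how (or whether) to expose it to its provider.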

@marklysze
Collaborator Author

> Please keep the name field. Besides agent selection, the name field is also useful for group chat history, i.e., pausing and resuming the group chat.
>
> Also, for future-proofing, perhaps something like multi-user chat. I have no idea what it's going to be, or how it will differ from group chat, but having more data is good.
>
> Sure, there are other ways, but with the name field it's easier.
>
> As for non-OpenAI, IMO, if the name field is not used and/or causes issues, it should be handled in the client module logic.

Thanks for your thoughts, @yonitjio - agreed, the name field will stay in the core AutoGen code, and I don't expect it will change messaging when using OpenAI, for now.

Yes, for non-OpenAI, it requires discussion and hopefully this issue will give us some ideas on that front.
