Why is there repetitive content in the prompt #56
Comments
This is an interesting topic. This prompt is copied directly from the Microsoft/graphrag repository (https://github.com/microsoft/graphrag/blob/16b4ea5dc9c3c74ec6d97b1551db4631b40949ff/graphrag/query/structured_search/local_search/system_prompt.py#L6-L69). I don't think it is a mistake, since repeating the instruction twice may actually improve performance (https://arxiv.org/abs/2402.15449). My guess is that the repetition introduces a certain degree of 'bidirectionality' into the expression. The paper I linked focuses on embeddings, but the idea of repeating the input twice has also been discussed to some extent for general QA. (I forget the paper lol)
@rangehow Thanks for sharing. If repeating the request in a prompt really does improve the model's understanding and generation, that is quite remarkable. In the long run, though, a sufficiently capable model may not need such techniques (after all, the repetition looks a bit odd and consumes extra tokens).
If you could run this comparison with a small model on some QA benchmarks and obtain quantitative results, that would be very helpful. Otherwise, we will keep the same behavior as Microsoft for now. Once it is confirmed that the repetition brings no benefit, you are more than welcome to submit a PR to the repository to change it. 🤗
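A minimal sketch of how such a comparison could be set up, assuming exact-match scoring on short answers. The `generate` callable is hypothetical: it stands in for whatever small model is used and is not part of this repository. The only names taken from the prompt are the `{response_type}` and `{context_data}` placeholders.

```python
from typing import Callable, Dict, Iterable, Tuple

# (context, question, gold_answer) triples; the dataset itself is up to the experimenter.
Example = Tuple[str, str, str]
# Hypothetical model wrapper: takes (system_prompt, question) and returns the answer text.
Generate = Callable[[str, str], str]


def exact_match_accuracy(
    template: str, examples: Iterable[Example], generate: Generate
) -> float:
    """Fill the template for each example and score answers by exact match."""
    hits = 0
    total = 0
    for context, question, gold in examples:
        system_prompt = template.format(
            response_type="single short phrase",  # assumed setting for QA-style answers
            context_data=context,
        )
        prediction = generate(system_prompt, question)
        hits += int(prediction.strip().lower() == gold.strip().lower())
        total += 1
    return hits / total if total else 0.0


def compare_variants(
    variants: Dict[str, str], examples: Iterable[Example], generate: Generate
) -> Dict[str, float]:
    """Score every prompt variant (e.g. repeated vs. single Goal) on the same examples."""
    examples = list(examples)  # materialize once so each variant sees identical data
    return {
        name: exact_match_accuracy(template, examples, generate)
        for name, template in variants.items()
    }
```

Passing the original template and a variant with the repeated sections removed as `variants` would give one accuracy figure per variant. Exact match is a crude metric, so a real comparison would likely need a larger dataset and several runs.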
For example, the Goal and Target response length and format sections are repeated.
Is this repetition intentional, or is it some kind of error?
PROMPTS["local_rag_response"] = """---Role---
You are a helpful assistant responding to questions about data in the tables provided.
---Goal---
Generate a response of the target length and format that responds to the user's question, summarizing all information in the input data tables appropriate for the response length and format, and incorporating any relevant general knowledge.
If you don't know the answer, just say so. Do not make anything up.
Do not include information where the supporting evidence for it is not provided.
---Target response length and format---
{response_type}
---Data tables---
{context_data}
---Goal---
Generate a response of the target length and format that responds to the user's question, summarizing all information in the input data tables appropriate for the response length and format, and incorporating any relevant general knowledge.
If you don't know the answer, just say so. Do not make anything up.
Do not include information where the supporting evidence for it is not provided.
---Target response length and format---
{response_type}
Add sections and commentary to the response as appropriate for the length and format. Style the response in markdown.
"""
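To make the repetition in the template above easy to check, here is a small sketch that splits a prompt on its `---Header---` lines, reports which headers occur more than once, and builds a variant that keeps only the last occurrence of each section. The helper names and the abbreviated `SAMPLE_PROMPT` are illustrative assumptions, not code from this repository.

```python
import re
from collections import Counter

# Matches section headers of the form ---Name--- on a line of their own.
HEADER_RE = re.compile(r"^---(.+?)---[ \t]*$", re.MULTILINE)

# Abbreviated stand-in for the real template, with the same section layout.
SAMPLE_PROMPT = """---Role---
You are a helpful assistant.

---Goal---
Answer the question.

---Target response length and format---
{response_type}

---Data tables---
{context_data}

---Goal---
Answer the question.

---Target response length and format---
{response_type}
Style the response in markdown.
"""


def split_sections(prompt: str) -> list[tuple[str, str]]:
    """Return (header, body) pairs in the order they appear in the prompt."""
    matches = list(HEADER_RE.finditer(prompt))
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(prompt)
        sections.append((match.group(1).strip(), prompt[match.end():end].strip()))
    return sections


def repeated_headers(prompt: str) -> dict[str, int]:
    """Map each header that occurs more than once to its number of occurrences."""
    counts = Counter(header for header, _ in split_sections(prompt))
    return {header: n for header, n in counts.items() if n > 1}


def keep_last_occurrence(prompt: str) -> str:
    """Build a variant that keeps only the final occurrence of each section."""
    sections = split_sections(prompt)
    last_index = {header: i for i, (header, _) in enumerate(sections)}
    kept = [
        f"---{header}---\n{body}\n"
        for i, (header, body) in enumerate(sections)
        if last_index[header] == i
    ]
    return "\n".join(kept)
```

Keeping the last occurrence places the Goal and format instructions after the data tables. Keeping the first occurrence instead is an equally valid variant to test, though the final format section carries an extra styling line that the first one lacks.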