I am using a Pandas pipeline to query my dataframe. However, I would like to provide context about my data to LlamaIndex so that it can write responses in a meaningful way. For example, my data column name is 'EmpID', but I want it to be referred to as 'Employee Number' in the response instead of 'EmpID'. How and where can I provide context about all my data attributes, as well as an understanding of my data? Should it be in the response_prompt? Could you provide sample code, please?
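To provide context about your pandas dataset to LlamaIndex so that it can refer to your data attributes in a meaningful way, you can customize the DEFAULT_PANDAS_TMPL template. Here is an example of how you can modify the template to refer to 'EmpID' as 'Employee Number':
from llama_index.core.prompts import PromptTemplate, PromptType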
############################################
# Pandas
############################################
CUSTOM_PANDAS_TMPL = (
"You are working with a pandas dataframe in Python.\n"
"The name of the dataframe is `df`.\n"
"This is the result of `print(df.head())`:\n"
"{df_str}\n\n"
"Please note the following context about the dataframe columns:\n"
"'EmpID' should be referred to as 'Employee Number'.\n"
"Follow these instructions:\n"
"{instruction_str}\n"
"Query: {query_str}\n\n"
"Expression:"
)
CUSTOM_PANDAS_PROMPT = PromptTemplate(
CUSTOM_PANDAS_TMPL, prompt_type=PromptType.PANDAS
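)
By including the line "'EmpID' should be referred to as 'Employee Number'." in the template, you tell the model how to refer to that column.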
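The question asks about context for all data attributes, not just one column. A minimal sketch of one way to do that is to generate the context lines from a mapping, keeping the same three placeholders as the template above. The `COLUMN_CONTEXT` dict, the `build_pandas_template` helper, and the 'DeptCode' column are hypothetical names introduced for illustration; only 'EmpID' comes from the question.

```python
# Hypothetical mapping from raw column names to the labels the model
# should use. Only 'EmpID' appears in the question; 'DeptCode' is invented.
COLUMN_CONTEXT = {
    "EmpID": "Employee Number",
    "DeptCode": "Department Code",
}


def build_pandas_template(column_context):
    """Return a template string with one context line per column.

    The {df_str}, {instruction_str} and {query_str} placeholders are left
    unfilled so the string can still be used as a prompt template.
    Assumes column names and labels contain no curly braces.
    """
    context_lines = [
        f"'{name}' should be referred to as '{label}'.\n"
        for name, label in column_context.items()
    ]
    parts = [
        "You are working with a pandas dataframe in Python.\n",
        "The name of the dataframe is `df`.\n",
        "This is the result of `print(df.head())`:\n",
        "{df_str}\n\n",
        "Please note the following context about the dataframe columns:\n",
        *context_lines,
        "Follow these instructions:\n",
        "{instruction_str}\n",
        "Query: {query_str}\n\n",
        "Expression:",
    ]
    return "".join(parts)


CUSTOM_PANDAS_TMPL = build_pandas_template(COLUMN_CONTEXT)

# Fill the placeholders with dummy values to inspect the final prompt text.
print(CUSTOM_PANDAS_TMPL.format(
    df_str="<df.head() output>",
    instruction_str="<instructions>",
    query_str="<question>",
))
```

The resulting string can be wrapped in PromptTemplate exactly as in the example above. Passing the wrapped prompt to the query engine (for instance through a `pandas_prompt` argument of `PandasQueryEngine`) is an assumption to verify against the LlamaIndex version you have installed.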