Describe the bug
Running an evaluation with this framework against a local RAG chatbot fails with an authentication error (401 PermissionDenied: "Principal does not have access to API/Operation.") as soon as the promptflow-based evaluators call Azure OpenAI, even though the preliminary test calls to the chatbot target and to the GPT deployment both succeed. My environment uses Azure CLI / Entra ID role-based access instead of an API key, and that credential does not appear to be passed through to the promptflow metric calls.
How To Reproduce the bug
Steps to reproduce the behavior:
1. Run az login through the command line
2. Create ground truth data using this framework's utilities
3. Start a local chatbot
4. Run the evaluation using this framework's default utils (the exact call is shown just after this list)
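For reference, step 4 is the call from my notebook cell (the same cell that appears at the top of the traceback below); msftEval is just how the repository's scripts/evaluate.py module is imported in my notebook, and that import is not shown here:
import os

# Where the config_path is relative to working_dir (msftEval wraps this repository's scripts/evaluate.py)
msftEval.evaluate.run_evaluate_from_config(
    working_dir=os.getcwd(),
    config_path="./ai-rag-chat-evaluator-main/example_config.json",
    num_questions=20,
    target_url="http://127.0.0.1:50505/chat",
)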
Expected behavior
I would expect the basic run to produce evaluation results for the chatbot based on the requested metrics. The evaluator starts and passes both preliminary tests, which suggests the chatbot and GPT endpoints are accessible, but once promptflow is used to make the metric call, things trip up in a way that suggests the credentials are not properly inherited.
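For context, the kind of direct token-based call that the successful pre-flight check implies is possible looks roughly like the sketch below; this is only an illustration of my setup, and AZURE_OPENAI_SERVICE and the deployment name are placeholders:
from azure.identity import AzureCliCredential, get_bearer_token_provider
from openai import AzureOpenAI

AZURE_OPENAI_SERVICE = "my-openai-resource"  # placeholder for my resource name

# Same Entra ID / Azure CLI authentication as the rest of my setup; no API key involved
token_provider = get_bearer_token_provider(AzureCliCredential(), "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    api_version="2024-03-01-preview",
    azure_endpoint=f"https://{AZURE_OPENAI_SERVICE}.openai.azure.com",
    azure_ad_token_provider=token_provider,
)
# The framework's own test chat completion succeeds, so a call along these lines is expected to work;
# the failure only appears once the promptflow-based evaluators make their own calls.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder deployment name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)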
Screenshots
I'm pasting the full console output rather than a screenshot, since there is more than a screenshot can hold:
[19:26:20] INFO Running evaluation from config c:\tes\Use Case evaluate.py:227
46\ragEval\ai-rag-chat-evaluator-main\example_config.json
INFO Replaced results_dir in config with timestamp evaluate.py:215
INFO Using Azure OpenAI Service with Azure Developer CLI Credential service_setup.py:32
INFO Running evaluation using data from c:\tes\Use Case evaluate.py:91
46\ragEval\example_input\qa_jdTest.jsonl
INFO Limiting evaluation to 20 questions evaluate.py:94
INFO Sending a test question to the target to ensure it is running... evaluate.py:97
[19:26:26] INFO Successfully received response from target for question: "What information is evaluate.py:109
in your knowledge base?"
"answer": "The information in my knowledge base includes:
1...."
"context": "2023q3pillar3.pdf#page=4: Royal Bank of Canada (RB..."
INFO Sending a test chat completion to the GPT deployment to ensure it is running... evaluate.py:120
[19:26:28] INFO Successfully received response from GPT: "Hello! How can I help you today?" evaluate.py:127
INFO Starting evaluation... evaluate.py:130
AuthenticationError Traceback (most recent call last)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_prompty_utils.py:1191, in handle_openai_error_async.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
1190 try:
-> 1191 return await func(*args, **kwargs)
1192 except (SystemErrorException, UserErrorException) as e:
1193 # Throw inner wrapped exception directly
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_flow.py:562, in AsyncPrompty.__call__(self, *args, **kwargs)
561 timeout = kwargs.get("timeout", None)
--> 562 response = await send_request_to_llm(api_client, self._model.api, params, timeout)
563 return format_llm_response(
564 response=response,
565 api=self._model.api,
(...)
569 outputs=self._outputs,
570 )
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\resources\chat\completions.py:1633, in AsyncCompletions.create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
1632 validate_response_format(response_format)
-> 1633 return await self._post(
1634 "/chat/completions",
1635 body=await async_maybe_transform(
1636 {
1637 "messages": messages,
1638 "model": model,
1639 "audio": audio,
1640 "frequency_penalty": frequency_penalty,
1641 "function_call": function_call,
1642 "functions": functions,
1643 "logit_bias": logit_bias,
1644 "logprobs": logprobs,
1645 "max_completion_tokens": max_completion_tokens,
1646 "max_tokens": max_tokens,
1647 "metadata": metadata,
1648 "modalities": modalities,
1649 "n": n,
1650 "parallel_tool_calls": parallel_tool_calls,
1651 "presence_penalty": presence_penalty,
1652 "response_format": response_format,
1653 "seed": seed,
1654 "service_tier": service_tier,
1655 "stop": stop,
1656 "store": store,
1657 "stream": stream,
1658 "stream_options": stream_options,
1659 "temperature": temperature,
1660 "tool_choice": tool_choice,
1661 "tools": tools,
1662 "top_logprobs": top_logprobs,
1663 "top_p": top_p,
1664 "user": user,
1665 },
1666 completion_create_params.CompletionCreateParams,
1667 ),
1668 options=make_request_options(
1669 extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
1670 ),
1671 cast_to=ChatCompletion,
1672 stream=stream or False,
1673 stream_cls=AsyncStream[ChatCompletionChunk],
1674 )
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1838, in AsyncAPIClient.post(self, path, cast_to, body, files, options, stream, stream_cls)
1835 opts = FinalRequestOptions.construct(
1836 method="post", url=path, json_data=body, files=await async_to_httpx_files(files), **options
1837 )
-> 1838 return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1532, in AsyncAPIClient.request(self, cast_to, options, stream, stream_cls, remaining_retries)
1530 retries_taken = 0
-> 1532 return await self._request(
1533 cast_to=cast_to,
1534 options=options,
1535 stream=stream,
1536 stream_cls=stream_cls,
1537 retries_taken=retries_taken,
1538 )
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1633, in AsyncAPIClient._request(self, cast_to, options, stream, stream_cls, retries_taken)
1632 log.debug("Re-raising status error")
-> 1633 raise self._make_status_error_from_response(err.response) from None
1635 return await self._process_response(
1636 cast_to=cast_to,
1637 options=options,
(...)
1641 retries_taken=retries_taken,
1642 )
AuthenticationError: Error code: 401 - {'error': {'code': 'PermissionDenied', 'message': 'Principal does not have access to API/Operation.'}}
During handling of the above exception, another exception occurred:
WrappedOpenAIError Traceback (most recent call last)
Cell In[37], line 2
1 # Where the config_path is relative to working_dir
----> 2 msftEval.evaluate.run_evaluate_from_config(working_dir = os.getcwd(),
3 config_path = "./ai-rag-chat-evaluator-main/example_config.json",
4 num_questions = 20,
5 target_url = "http://127.0.0.1:50505/chat")
File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:238, in run_evaluate_from_config(working_dir, config_path, num_questions, target_url)
232 results_dir = working_dir / Path(config["results_dir"])
234 # From Dev to trouleshoot
235 # print(config)
236 # print("service Setup looks like", service_setup.get_openai_config())
--> 238 evaluation_run_complete = run_evaluation(
239 openai_config=service_setup.get_openai_config(),
240 testdata_path=working_dir / Path(config["testdata_path"]),
241 results_dir=results_dir,
242 target_url=target_url or config["target_url"],
243 target_parameters=config.get("target_parameters", {}),
244 num_questions=num_questions,
245 requested_metrics=config.get(
246 "requested_metrics",
247 ["gpt_groundedness", "gpt_relevance",
248 "gpt_coherence", "answer_length", "latency"],
249 ),
250 target_response_answer_jmespath=config.get(
251 "target_response_answer_jmespath"),
252 target_response_context_jmespath=config.get(
253 "target_response_context_jmespath"),
254 )
256 if evaluation_run_complete:
257 results_config_path = results_dir / "config.json"
File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:170, in run_evaluation(openai_config, testdata_path, results_dir, target_url, target_parameters, requested_metrics, num_questions, target_response_answer_jmespath, target_response_context_jmespath)
168 questions_with_ratings = []
169 for row in track(testdata, description="Processing..."):
--> 170 questions_with_ratings.append(evaluate_row(row))
172 logger.info(
173 "Evaluation calls have completed. Calculating overall metrics now...")
174 # Make the results directory if it doesn't exist
File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:157, in run_evaluation..evaluate_row(row)
155 print(openai_config)
156 print(help(metric.evaluator_fn))
--> 157 result = metric.evaluator_fn(openai_config=openai_config)(
158 query=row["question"],
159 response=output["answer"],
160 context=output["context"],
161 ground_truth=row["truth"],
162 )
163 output.update(result)
165 return output
File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\azureAi_evaluation_evaluators_groundedness_groundedness.py:82, in GroundednessEvaluator.__call__(self, response, context, conversation, **kwargs)
75 print(dir(self._flow))
76 # print(self._model)
77 # print(self._name)
78 # print(self._parse_prompty)
79 # print(f"response == {response}")
80 # print(f"context == {context}")
81 # print(f"conversation == {conversation}")
---> 82 tmpReturn = super().__call__(response=response, context=context,
83 conversation=conversation, **kwargs)
84 print("Post call, now return")
85 return tmpReturn
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:65, in EvaluatorBase.__call__(self, **kwargs)
54 def __call__(self, **kwargs) -> Dict:
55 """Evaluate a given input. This method serves as a wrapper and is meant to be overridden by child classes for
56 one main reason - to overwrite the method headers and docstring to include additional inputs as needed.
57 The actual behavior of this function shouldn't change beyond adding more inputs to the
(...)
63 :rtype: Dict
64 """
---> 65 return async_run_allowing_running_loop(self._async_evaluator, **kwargs)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\_utils\async_utils.py:94, in async_run_allowing_running_loop(async_func, *args, **kwargs)
92 if _has_running_loop():
93 with ThreadPoolExecutorWithContext() as executor:
---> 94 return executor.submit(lambda: asyncio.run(async_func(*args, **kwargs))).result()
95 else:
96 return asyncio.run(_invoke_async_with_sigint_handler(async_func, *args, **kwargs))
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\concurrent\futures\_base.py:456, in Future.result(self, timeout)
454 raise CancelledError()
455 elif self._state == FINISHED:
--> 456 return self.__get_result()
457 else:
458 raise TimeoutError()
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\concurrent\futures\_base.py:401, in Future.__get_result(self)
399 if self._exception:
400 try:
--> 401 raise self._exception
402 finally:
403 # Break a reference cycle with the exception in self._exception
404 self = None
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\concurrent\futures\thread.py:58, in _WorkItem.run(self)
55 return
57 try:
---> 58 result = self.fn(*self.args, **self.kwargs)
59 except BaseException as exc:
60 self.future.set_exception(exc)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\_utils\async_utils.py:94, in async_run_allowing_running_loop.<locals>.<lambda>()
92 if _has_running_loop():
93 with ThreadPoolExecutorWithContext() as executor:
---> 94 return executor.submit(lambda: asyncio.run(async_func(*args, **kwargs))).result()
95 else:
96 return asyncio.run(_invoke_async_with_sigint_handler(async_func, *args, **kwargs))
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\asyncio\runners.py:190, in run(main, debug)
186 raise RuntimeError(
187 "asyncio.run() cannot be called from a running event loop")
189 with Runner(debug=debug) as runner:
--> 190 return runner.run(main)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\asyncio\runners.py:118, in Runner.run(self, coro, context)
116 self._interrupt_count = 0
117 try:
--> 118 return self._loop.run_until_complete(task)
119 except exceptions.CancelledError:
120 if self._interrupt_count > 0:
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\asyncio\base_events.py:653, in BaseEventLoop.run_until_complete(self, future)
650 if not future.done():
651 raise RuntimeError('Event loop stopped before Future completed.')
--> 653 return future.result()
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:302, in AsyncEvaluatorBase.__call__(self, query, response, context, conversation, **kwargs)
300 if context is not None:
301 kwargs["context"] = context
--> 302 return await self._real_call(**kwargs)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:263, in EvaluatorBase._real_call(self, **kwargs)
261 # Evaluate all inputs.
262 for eval_input in eval_input_list:
--> 263 per_turn_results.append(await self._do_eval(eval_input))
264 # Return results as-is if only one result was produced.
266 if len(per_turn_results) == 1:
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_prompty_eval.py:72, in PromptyEvaluatorBase._do_eval(self, eval_input)
61 @override
62 async def _do_eval(self, eval_input: Dict) -> Dict:
63 """Do a relevance evaluation.
64
65 :param eval_input: The input to the evaluator. Expected to contain
(...)
70 :rtype: Dict
71 """
---> 72 llm_output = await self._flow(timeout=self.LLM_CALL_TIMEOUT, **eval_input)
74 score = np.nan
75 if llm_output:
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\tracing\_trace.py:488, in _traced_async.<locals>.wrapped(*args, **kwargs)
486 Tracer.push(trace)
487 enrich_span_with_input(span, trace.inputs)
--> 488 output = await func(*args, **kwargs)
489 output = handle_output(span, trace.inputs, output, trace_type)
490 except Exception as e:
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_prompty_utils.py:1219, in handle_openai_error_async.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
1217 status_code = e.response.status_code
1218 if status_code < 500 and status_code not in [429, 422]:
-> 1219 raise WrappedOpenAIError(e)
1220 if isinstance(e, RateLimitError) and getattr(e, "type", None) == "insufficient_quota":
1221 # Exit retry if this is quota insufficient error
1222 logger.error(f"{type(e).__name__} with insufficient quota. Throw user error.")
WrappedOpenAIError: OpenAI API hits AuthenticationError: Principal does not have access to API/Operation. If you are using azure openai connection, please make sure you have proper role assignment on your azure openai resource. You can refer to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control
Running Information:
Promptflow Package Version: 1.16.1
Operating System: Windows (Windows Server 2019 Datacenter)
Python Version (python --version): Python 3.11.4
Additional context
I am working in a context where an API key is not used: my company does not share API keys and instead uses role-based access with Azure CLI / Entra ID credentials, which requires that tokens be obtained from a call to Azure Cognitive Services. The following is a minimal credentials instantiation script that we run in our chatbot:
from azure.identity import (
    AzureCliCredential,
    ChainedTokenCredential,
    ManagedIdentityCredential,
    get_bearer_token_provider,
)
from openai import AsyncAzureOpenAI

# Use the current user identity to authenticate with Azure OpenAI, AI Search and Blob Storage (no secrets needed,
# just use 'az login' locally, and managed identity when deployed on Azure). If you need to use keys, use separate
# AzureKeyCredential instances with the keys for each service.
# If you encounter a blocking error during DefaultAzureCredential resolution, you can exclude the problematic
# credential with a parameter (e.g. exclude_shared_token_cache_credential=True).
credential_chain = (
    # Azure CLI first, so that the managed identity is not picked up when testing on a VM
    AzureCliCredential(),
    # Try Managed Identity second, for the Web App
    ManagedIdentityCredential(),
)
azure_credential = ChainedTokenCredential(*credential_chain)
token_provider = get_bearer_token_provider(azure_credential, "https://cognitiveservices.azure.com/.default")

# Store on app.config for later use inside requests
openai_client = AsyncAzureOpenAI(
    api_version="2024-03-01-preview",
    azure_endpoint=f"https://{AZURE_OPENAI_SERVICE}.openai.azure.com",
    azure_ad_token_provider=token_provider,
)
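For what it's worth, what I would expect (or hope) to be able to do on the evaluation side is point the evaluators at an Entra ID-only model configuration, along the lines of the sketch below. This is an assumption on my part: I am assuming azure-ai-evaluation's AzureOpenAIModelConfiguration can be used without an api_key and then falls back to DefaultAzureCredential, and the resource and deployment names are placeholders:
from azure.ai.evaluation import AzureOpenAIModelConfiguration, GroundednessEvaluator

AZURE_OPENAI_SERVICE = "my-openai-resource"  # placeholder resource name

# No api_key on purpose: the hope is that the evaluator picks up the Azure CLI / managed identity credential
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=f"https://{AZURE_OPENAI_SERVICE}.openai.azure.com",
    azure_deployment="gpt-4o-eval",  # placeholder deployment name
    api_version="2024-03-01-preview",
)
groundedness = GroundednessEvaluator(model_config)
result = groundedness(
    response="The information in my knowledge base includes ...",
    context="2023q3pillar3.pdf#page=4: Royal Bank of Canada ...",
)
print(result)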