[BUG] #3855

Open
JonathanDench opened this issue Nov 14, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
Running an evaluation with this framework fails with a 401 AuthenticationError ("Principal does not have access to API/Operation") as soon as the promptflow-based metric evaluators call the Azure OpenAI deployment, even though the preliminary target and GPT test calls succeed with the same Azure Developer CLI credential.

How To Reproduce the bug
Steps to reproduce the behavior:

  1. Run az login through the command line
  2. Create ground-truth data using this framework's utilities
  3. Start a local chatbot
  4. Run the evaluation using this framework's default utilities (a minimal sketch of this call follows the list)
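
For concreteness, here is a minimal sketch of step 4, reconstructed from the call shown in the traceback below. The import is an assumption (in this run the module is imported as msftEval.evaluate); the config path and target URL are the ones used in this report.

import os

import evaluate  # assumes the repo's scripts/ directory is on sys.path; imported as msftEval.evaluate in this run

evaluate.run_evaluate_from_config(
    working_dir=os.getcwd(),
    config_path="./ai-rag-chat-evaluator-main/example_config.json",
    num_questions=20,
    target_url="http://127.0.0.1:50505/chat",
)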

Expected behavior
I would expect the basic run functionality to produce evaluation results for the chatbot based on the requested metrics. The evaluator starts and passes both preliminary tests, which suggests the chatbot and GPT endpoints are accessible, but once promptflow is used to make the metric call, things fail in a way that suggests the credentials are not properly inherited.
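
One check that may help narrow this down (my own diagnostic sketch, not part of the framework): the log below reports "Azure Developer CLI Credential", and that credential can be exercised directly. Acquiring a token can succeed while the service still returns 401 PermissionDenied if the principal lacks a role assignment on the Azure OpenAI resource.

from azure.identity import AzureDeveloperCliCredential

# Mint a token for the Cognitive Services scope used by Azure OpenAI.
# Success here only proves authentication; RBAC on the resource can still deny the operation.
token = AzureDeveloperCliCredential().get_token("https://cognitiveservices.azure.com/.default")
print("token acquired, expires at", token.expires_on)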

Screenshots
I'm pasting the full output below, since there is more than a screenshot can capture:

[19:26:20] INFO     Running evaluation from config c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\example_config.json  evaluate.py:227
           INFO     Replaced results_dir in config with timestamp  evaluate.py:215
           INFO     Using Azure OpenAI Service with Azure Developer CLI Credential  service_setup.py:32
           INFO     Running evaluation using data from c:\tes\Use Case 46\ragEval\example_input\qa_jdTest.jsonl  evaluate.py:91
           INFO     Limiting evaluation to 20 questions  evaluate.py:94
           INFO     Sending a test question to the target to ensure it is running...  evaluate.py:97
[19:26:26] INFO     Successfully received response from target for question: "What information is in your knowledge base?"  evaluate.py:109
                    "answer": "The information in my knowledge base includes:

                                1...."
                    "context": "2023q3pillar3.pdf#page=4: Royal Bank of Canada (RB..."
           INFO     Sending a test chat completion to the GPT deployment to ensure it is running...  evaluate.py:120
[19:26:28] INFO     Successfully received response from GPT: "Hello! How can I help you today?"  evaluate.py:127
           INFO     Starting evaluation...  evaluate.py:130


AuthenticationError Traceback (most recent call last)
File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_prompty_utils.py:1191, in handle_openai_error_async.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
1190 try:
-> 1191 return await func(*args, **kwargs)
1192 except (SystemErrorException, UserErrorException) as e:
1193 # Throw inner wrapped exception directly

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_flow.py:562, in AsyncPrompty.__call__(self, *args, **kwargs)
561 timeout = kwargs.get("timeout", None)
--> 562 response = await send_request_to_llm(api_client, self._model.api, params, timeout)
563 return format_llm_response(
564 response=response,
565 api=self._model.api,
(...)
569 outputs=self._outputs,
570 )

File c:\Users\JDENCH.conda\envs\ragEval\Lib\site-packages\openai\resources\chat\completions.py:1633, in AsyncCompletions.create(self, messages, model, audio, frequency_penalty, function_call, functions, logit_bias, logprobs, max_completion_tokens, max_tokens, metadata, modalities, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, store, stream, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
1632 validate_response_format(response_format)
-> 1633 return await self._post(
1634 "/chat/completions",
1635 body=await async_maybe_transform(
1636 {
1637 "messages": messages,
1638 "model": model,
1639 "audio": audio,
1640 "frequency_penalty": frequency_penalty,
1641 "function_call": function_call,
1642 "functions": functions,
1643 "logit_bias": logit_bias,
1644 "logprobs": logprobs,
1645 "max_completion_tokens": max_completion_tokens,
1646 "max_tokens": max_tokens,
1647 "metadata": metadata,
1648 "modalities": modalities,
1649 "n": n,
1650 "parallel_tool_calls": parallel_tool_calls,
1651 "presence_penalty": presence_penalty,
1652 "response_format": response_format,
1653 "seed": seed,
1654 "service_tier": service_tier,
1655 "stop": stop,
1656 "store": store,
1657 "stream": stream,
1658 "stream_options": stream_options,
1659 "temperature": temperature,
1660 "tool_choice": tool_choice,
1661 "tools": tools,
1662 "top_logprobs": top_logprobs,
1663 "top_p": top_p,
1664 "user": user,
1665 },
1666 completion_create_params.CompletionCreateParams,
1667 ),
1668 options=make_request_options(
1669 extra_headers=extra_headers, extra_query=extra_query, extra_body=extra_body, timeout=timeout
1670 ),
1671 cast_to=ChatCompletion,
1672 stream=stream or False,
1673 stream_cls=AsyncStream[ChatCompletionChunk],
1674 )

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1838, in AsyncAPIClient.post(self, path, cast_to, body, files, options, stream, stream_cls)
1835 opts = FinalRequestOptions.construct(
1836 method="post", url=path, json_data=body, files=await async_to_httpx_files(files), **options
1837 )
-> 1838 return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1532, in AsyncAPIClient.request(self, cast_to, options, stream, stream_cls, remaining_retries)
1530 retries_taken = 0
-> 1532 return await self._request(
1533 cast_to=cast_to,
1534 options=options,
1535 stream=stream,
1536 stream_cls=stream_cls,
1537 retries_taken=retries_taken,
1538 )

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\openai\_base_client.py:1633, in AsyncAPIClient._request(self, cast_to, options, stream, stream_cls, retries_taken)
1632 log.debug("Re-raising status error")
-> 1633 raise self._make_status_error_from_response(err.response) from None
1635 return await self._process_response(
1636 cast_to=cast_to,
1637 options=options,
(...)
1641 retries_taken=retries_taken,
1642 )

AuthenticationError: Error code: 401 - {'error': {'code': 'PermissionDenied', 'message': 'Principal does not have access to API/Operation.'}}

During handling of the above exception, another exception occurred:

WrappedOpenAIError Traceback (most recent call last)
Cell In[37], line 2
1 # Where the config_path is relative to working_dir
----> 2 msftEval.evaluate.run_evaluate_from_config(working_dir = os.getcwd(),
3 config_path = "./ai-rag-chat-evaluator-main/example_config.json",
4 num_questions = 20,
5 target_url = "http://127.0.0.1:50505/chat")

File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:238, in run_evaluate_from_config(working_dir, config_path, num_questions, target_url)
232 results_dir = working_dir / Path(config["results_dir"])
234 # From Dev to trouleshoot
235 # print(config)
236 # print("service Setup looks like", service_setup.get_openai_config())
--> 238 evaluation_run_complete = run_evaluation(
239 openai_config=service_setup.get_openai_config(),
240 testdata_path=working_dir / Path(config["testdata_path"]),
241 results_dir=results_dir,
242 target_url=target_url or config["target_url"],
243 target_parameters=config.get("target_parameters", {}),
244 num_questions=num_questions,
245 requested_metrics=config.get(
246 "requested_metrics",
247 ["gpt_groundedness", "gpt_relevance",
248 "gpt_coherence", "answer_length", "latency"],
249 ),
250 target_response_answer_jmespath=config.get(
251 "target_response_answer_jmespath"),
252 target_response_context_jmespath=config.get(
253 "target_response_context_jmespath"),
254 )
256 if evaluation_run_complete:
257 results_config_path = results_dir / "config.json"

File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:170, in run_evaluation(openai_config, testdata_path, results_dir, target_url, target_parameters, requested_metrics, num_questions, target_response_answer_jmespath, target_response_context_jmespath)
168 questions_with_ratings = []
169 for row in track(testdata, description="Processing..."):
--> 170 questions_with_ratings.append(evaluate_row(row))
172 logger.info(
173 "Evaluation calls have completed. Calculating overall metrics now...")
174 # Make the results directory if it doesn't exist

File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\evaluate.py:157, in run_evaluation.<locals>.evaluate_row(row)
155 print(openai_config)
156 print(help(metric.evaluator_fn))
--> 157 result = metric.evaluator_fn(openai_config=openai_config)(
158 query=row["question"],
159 response=output["answer"],
160 context=output["context"],
161 ground_truth=row["truth"],
162 )
163 output.update(result)
165 return output

File c:\tes\Use Case 46\ragEval\ai-rag-chat-evaluator-main\scripts\azureAi_evaluation_evaluators_groundedness_groundedness.py:82, in GroundednessEvaluator.__call__(self, response, context, conversation, **kwargs)
75 print(dir(self._flow))
76 # print(self._model)
77 # print(self._name)
78 # print(self._parse_prompty)
79 # print(f"response == {response}")
80 # print(f"context == {context}")
81 # print(f"conversation == {conversation}")
---> 82 tmpReturn = super().__call__(response=response, context=context,
83 conversation=conversation, **kwargs)
84 print("Post call, now return")
85 return tmpReturn

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:65, in EvaluatorBase.__call__(self, **kwargs)
54 def __call__(self, **kwargs) -> Dict:
55 """Evaluate a given input. This method serves as a wrapper and is meant to be overridden by child classes for
56 one main reason - to overwrite the method headers and docstring to include additional inputs as needed.
57 The actual behavior of this function shouldn't change beyond adding more inputs to the
(...)
63 :rtype: Dict
64 """
---> 65 return async_run_allowing_running_loop(self._async_evaluator, **kwargs)

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\_utils\async_utils.py:94, in async_run_allowing_running_loop(async_func, *args, **kwargs)
92 if _has_running_loop():
93 with ThreadPoolExecutorWithContext() as executor:
---> 94 return executor.submit(lambda: asyncio.run(async_func(*args, **kwargs))).result()
95 else:
96 return asyncio.run(_invoke_async_with_sigint_handler(async_func, *args, **kwargs))

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\concurrent\futures\_base.py:456, in Future.result(self, timeout)
454 raise CancelledError()
455 elif self._state == FINISHED:
--> 456 return self.__get_result()
457 else:
458 raise TimeoutError()

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\concurrent\futures\_base.py:401, in Future.__get_result(self)
399 if self._exception:
400 try:
--> 401 raise self._exception
402 finally:
403 # Break a reference cycle with the exception in self._exception
404 self = None

File c:\Users\JDENCH.conda\envs\ragEval\Lib\concurrent\futures\thread.py:58, in _WorkItem.run(self)
55 return
57 try:
---> 58 result = self.fn(*self.args, **self.kwargs)
59 except BaseException as exc:
60 self.future.set_exception(exc)

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\_utils\async_utils.py:94, in async_run_allowing_running_loop.<locals>.<lambda>()
92 if _has_running_loop():
93 with ThreadPoolExecutorWithContext() as executor:
---> 94 return executor.submit(lambda: asyncio.run(async_func(*args, **kwargs))).result()
95 else:
96 return asyncio.run(_invoke_async_with_sigint_handler(async_func, *args, **kwargs))

File c:\Users\JDENCH.conda\envs\ragEval\Lib\asyncio\runners.py:190, in run(main, debug)
186 raise RuntimeError(
187 "asyncio.run() cannot be called from a running event loop")
189 with Runner(debug=debug) as runner:
--> 190 return runner.run(main)

File c:\Users\JDENCH.conda\envs\ragEval\Lib\asyncio\runners.py:118, in Runner.run(self, coro, context)
116 self._interrupt_count = 0
117 try:
--> 118 return self._loop.run_until_complete(task)
119 except exceptions.CancelledError:
120 if self._interrupt_count > 0:

File c:\Users\JDENCH.conda\envs\ragEval\Lib\asyncio\base_events.py:653, in BaseEventLoop.run_until_complete(self, future)
650 if not future.done():
651 raise RuntimeError('Event loop stopped before Future completed.')
--> 653 return future.result()

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:302, in AsyncEvaluatorBase.__call__(self, query, response, context, conversation, **kwargs)
300 if context is not None:
301 kwargs["context"] = context
--> 302 return await self._real_call(**kwargs)

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_eval.py:263, in EvaluatorBase._real_call(self, **kwargs)
261 # Evaluate all inputs.
262 for eval_input in eval_input_list:
--> 263 per_turn_results.append(await self._do_eval(eval_input))
264 # Return results as-is if only one result was produced.
266 if len(per_turn_results) == 1:

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\azure\ai\evaluation\_evaluators\_common\_base_prompty_eval.py:72, in PromptyEvaluatorBase._do_eval(self, eval_input)
61 @override
62 async def _do_eval(self, eval_input: Dict) -> Dict:
63 """Do a relevance evaluation.
64
65 :param eval_input: The input to the evaluator. Expected to contain
(...)
70 :rtype: Dict
71 """
---> 72 llm_output = await self._flow(timeout=self.LLM_CALL_TIMEOUT, **eval_input)
74 score = np.nan
75 if llm_output:

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\tracing\_trace.py:488, in _traced_async.<locals>.wrapped(*args, **kwargs)
486 Tracer.push(trace)
487 enrich_span_with_input(span, trace.inputs)
--> 488 output = await func(*args, **kwargs)
489 output = handle_output(span, trace.inputs, output, trace_type)
490 except Exception as e:

File c:\Users\JDENCH\.conda\envs\ragEval\Lib\site-packages\promptflow\core\_prompty_utils.py:1219, in handle_openai_error_async.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
1217 status_code = e.response.status_code
1218 if status_code < 500 and status_code not in [429, 422]:
-> 1219 raise WrappedOpenAIError(e)
1220 if isinstance(e, RateLimitError) and getattr(e, "type", None) == "insufficient_quota":
1221 # Exit retry if this is quota insufficient error
1222 logger.error(f"{type(e).name} with insufficient quota. Throw user error.")

WrappedOpenAIError: OpenAI API hits AuthenticationError: Principal does not have access to API/Operation. If you are using azure openai connection, please make sure you have proper role assignment on your azure openai resource. You can refer to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control

Running Information (please complete the following information):

  • Promptflow Package Version: 1.16.1
  • Operating System: Windows (Windows Server 2019 Datacenter)
  • Python Version (python --version): Python 3.11.4

Additional context
I am working in a context where an API key is not used: my company does not share the API key, but instead uses role-based access via Azure CLI / Entra ID credentials and requires that tokens be acquired for the Azure Cognitive Services scope. The following is the minimal credential-instantiation script we run in our chatbot:

import os

from azure.identity import (
    AzureCliCredential,
    ChainedTokenCredential,
    DefaultAzureCredential,
    ManagedIdentityCredential,
    get_bearer_token_provider,
)
from openai import AsyncAzureOpenAI

AZURE_OPENAI_SERVICE = os.environ["AZURE_OPENAI_SERVICE"]  # the Azure OpenAI resource name, set in our app config

# Use the current user identity to authenticate with Azure OpenAI, AI Search and Blob Storage
# (no secrets needed: just 'az login' locally, and managed identity when deployed on Azure).
# If you need to use keys, use separate AzureKeyCredential instances with the keys for each service.
# If you encounter a blocking error during DefaultAzureCredential resolution, you can exclude the
# problematic credential with a parameter (e.g. exclude_shared_token_cache_credential=True).
credential_chain = (
    # Azure CLI first, to test on the VM so that the managed identity is not picked up
    AzureCliCredential(),
    # Managed Identity second, for the Web App
    ManagedIdentityCredential(),
)
azure_credential = ChainedTokenCredential(*credential_chain)

# Note: this reassignment replaces the chained credential above
azure_credential = DefaultAzureCredential(exclude_shared_token_cache_credential=True)

token_provider = get_bearer_token_provider(azure_credential, "https://cognitiveservices.azure.com/.default")

# Store on app.config for later use inside requests
openai_client = AsyncAzureOpenAI(
    api_version="2024-03-01-preview",
    azure_endpoint=f"https://{AZURE_OPENAI_SERVICE}.openai.azure.com",
    azure_ad_token_provider=token_provider,
)
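
As a sanity check outside the evaluator (my own sketch, not part of the framework), the same principal can be pointed straight at the chat completions operation to see whether the 401 reproduces independently of promptflow. The deployment name is a placeholder and AZURE_OPENAI_SERVICE is assumed to be set in the environment as above.

import os

from azure.identity import AzureCliCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Same Entra ID token flow as above, but synchronous and standalone for a quick check.
token_provider = get_bearer_token_provider(
    AzureCliCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    api_version="2024-03-01-preview",
    azure_endpoint=f"https://{os.environ['AZURE_OPENAI_SERVICE']}.openai.azure.com",
    azure_ad_token_provider=token_provider,
)

# "<deployment-name>" is a placeholder for the evaluation GPT deployment.
response = client.chat.completions.create(
    model="<deployment-name>",
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)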
