
Refactor rlsapi v1 infer endpoint to deduplicate error handling#1249

Merged
tisnik merged 2 commits into lightspeed-core:main from major:major/rlsapi-v1-simplify-types
Mar 2, 2026
Conversation

@major (Contributor) commented Mar 2, 2026

Description

  • Cleaned up src/app/endpoints/rlsapi_v1.py so /infer is easier to follow.
  • Pulled repeated exception mapping into a helper (_map_inference_error_to_http_exception) instead of repeating nearly identical except blocks.
  • Kept failure telemetry in one place so we don’t have to update multiple paths later.
  • Added small helpers for prompt/model fallback logic to cut down on inline branching noise.
  • Kept behavior the same for known failures (context_length, connection issues, rate limits, API status errors).
  • Unknown RuntimeErrors still get re-raised (same behavior), just through a cleaner flow.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

  • Assisted-by: OpenCode
  • Generated by: openai/gpt-5.3-codex

Related Tickets & Documents

  • Related Issue # N/A
  • Closes # N/A

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • uv run pytest tests/unit/app/endpoints/test_rlsapi_v1.py
  • uv run pyright src/app/endpoints/rlsapi_v1.py
  • uv run make verify

Summary by CodeRabbit

  • New Features

    • Allow explicit model selection for inference requests and fall back to a sensible configured default.
    • Use the configured/default system prompt as a fallback when building instructions.
  • Bug Fixes

    • Unified and improved error handling for inference failures, mapping backend errors to appropriate HTTP responses.
    • Returns HTTP 503 only when no default model is configured.
  • Telemetry

    • Telemetry consistently reports the configured default model name.
  • Public API

    • retrieve_simple_response now accepts an optional model_id parameter.
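
The resolution order described above can be sketched like this. `resolve_model_id` is a hypothetical helper name, and HTTPException is a minimal stand-in for FastAPI's class; the sketch only illustrates the fallback behavior the summary describes.

```python
# Sketch: explicit model_id wins; otherwise fall back to the configured
# default; fail with 503 only when neither is available.
from typing import Optional


class HTTPException(Exception):
    """Minimal stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def resolve_model_id(model_id: Optional[str], configured_default: Optional[str]) -> str:
    """Prefer an explicit model_id; fall back to the configured default."""
    if model_id:
        return model_id
    if configured_default:
        return configured_default
    # 503 only when no default model is configured at all
    raise HTTPException(503, "No default model configured")
```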

@coderabbitai coderabbitai bot commented Mar 2, 2026

Warning

Rate limit exceeded

@major has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 8 minutes and 14 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 9241e25 and 50cc05d.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py

Walkthrough

Centralizes inference exception handling, adds helpers for base prompt and configured default model naming, changes model resolution (retrieve_simple_response accepts optional model_id and raises 503 when no default model), and routes inference failures through unified mapping and telemetry. (49 words)

Changes

  • Inference endpoint & helpers (src/app/endpoints/rlsapi_v1.py): Added _INFER_HANDLED_EXCEPTIONS, _get_base_prompt(), _get_configured_default_model_name(), and _map_inference_error_to_http_exception(); replaced verbose inline try/except in infer_endpoint with unified handling that records failures and maps backend errors to HTTPException.
  • Model resolution & retrieval (src/app/endpoints/rlsapi_v1.py): Updated the retrieve_simple_response(...) signature to accept an optional model_id (str | None) and raise HTTP 503 when no default model is configured.
  • Telemetry / event payloads (src/app/endpoints/rlsapi_v1.py): Telemetry/event queuing now uses _get_configured_default_model_name() for the model field; infer_endpoint types mcp_tools and passes the resolved model_id through to retrieval and failure recording.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as Client
  participant API as API (/infer)
  participant Builder as _build_instructions/_get_base_prompt
  participant Retriever as retrieve_simple_response
  participant Backend as ModelBackend
  participant Telemetry as _queue_splunk_event
  participant Recorder as _record_inference_failure

  Client->>API: POST /infer (question, options)
  API->>Builder: build instructions (uses _get_base_prompt)
  API->>Retriever: call retrieve_simple_response(..., model_id)
  Retriever->>Backend: invoke model inference
  alt success
    Backend-->>Retriever: model response
    Retriever-->>API: return text
    API->>Telemetry: enqueue event (uses _get_configured_default_model_name)
    API-->>Client: 200 OK (response)
  else handled error
    Backend-->>Retriever: raises exception
    Retriever-->>API: propagate error
    API->>Recorder: _record_inference_failure(request_id, model_id, error)
    API->>API: _map_inference_error_to_http_exception(error, model_id, request_id)
    API-->>Client: mapped HTTPException (e.g., 429/503/502)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
🚥 Pre-merge checks: ✅ 3 passed
  • Description Check — ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title directly and clearly summarizes the main change: refactoring the rlsapi v1 infer endpoint to deduplicate error handling, which matches the core objective of extracting repeated exception-to-HTTP mapping into a helper function.
  • Docstring Coverage — ✅ Passed: docstring coverage is 100.00%, above the required 80.00% threshold.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/app/endpoints/rlsapi_v1.py (1)

128-135: Expand new helper docstrings to the project-required format.

The new helper docstrings are brief but incomplete; they should include full Google-style sections (Parameters/Returns/Raises as applicable).

As per coding guidelines: "All functions must have complete docstrings with brief descriptions" and "Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections".

Also applies to: 220-224, 296-301

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 128 - 135, The _get_base_prompt
helper currently has a minimal docstring; expand it to a full Google-style
docstring for functions: include a one-line summary, a "Returns" section
specifying it returns a str (the resolved system prompt), a "Parameters" section
(explicitly state there are none) and a "Raises" section if applicable (or state
none), and place it directly above def _get_base_prompt so tooling and linters
pick it up; apply the same Google-style docstring expansion to the other small
helper functions in this module that the reviewer flagged so all helper
functions follow the project's docstring convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 302-304: The runtime error check that looks for "context_length"
is too narrow and should detect variants like "context length" (and be
case-insensitive) so prompt-too-long errors return 413 instead of 500; in the
block that inspects error (the branch using variables error, logger and
request_id in this file) replace the single-string check with a broader test
(e.g., normalize the message to lower-case and test for regex r"context[_
]length" or check both "context_length" and "context length" or replace
underscores with spaces before matching) and keep the existing logger.error call
and the code path that returns/raises the 413 HTTP response so both underscore
and spaced variants are handled.
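
For illustration, the broader check the comment suggests could look like this standalone sketch (not the endpoint's actual code): normalize case and match both the underscore and spaced spellings.

```python
# Matches "context_length", "context length", "Context-Length", etc.
import re

_CONTEXT_LENGTH_RE = re.compile(r"context[\s_-]?length", re.IGNORECASE)


def is_context_length_error(error: Exception) -> bool:
    """Return True when the error message indicates the prompt was too long."""
    return bool(_CONTEXT_LENGTH_RE.search(str(error)))
```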


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6528bc2 and d0aa4cd.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py

@major force-pushed the major/rlsapi-v1-simplify-types branch from d0aa4cd to 0c8bdb9 on March 2, 2026 at 15:27
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/app/endpoints/rlsapi_v1.py (1)

128-135: Add complete Google-style docstrings for the new helper functions.

Both helpers currently use brief one-liners; they should include explicit Returns sections to match project standards.

📝 Suggested docstring update
 def _get_base_prompt() -> str:
-    """Get the base system prompt with configuration fallback."""
+    """Get the base system prompt.
+
+    Returns:
+        Configured custom system prompt when available; otherwise the default
+        system prompt constant.
+    """
     if (
         configuration.customization is not None
         and configuration.customization.system_prompt is not None
     ):
         return configuration.customization.system_prompt
     return constants.DEFAULT_SYSTEM_PROMPT
@@
 def _get_configured_default_model_name() -> str:
-    """Get configured default model name for telemetry payloads."""
+    """Get configured default model name for telemetry payloads.
+
+    Returns:
+        The configured default model name, or an empty string when no inference
+        configuration/default model is present.
+    """
     if configuration.inference is None:
         return ""
     return configuration.inference.default_model or ""

As per coding guidelines, "All functions must have complete docstrings with brief descriptions" and "Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections".

Also applies to: 220-225

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 128 - 135, The helper function
_get_base_prompt and the other helper referenced at lines 220-225 need full
Google-style docstrings: replace the one-line docstrings with multi-line
docstrings that include a short description, a Returns section specifying the
return type (str) and what it contains (the system prompt, falling back to
constants.DEFAULT_SYSTEM_PROMPT), and a Raises section if applicable (or omit if
none). Locate _get_base_prompt and the other helper, update their docstrings to
follow the project convention (summary line, blank line, Returns: str:
description), and ensure formatting matches existing Google-style docstrings in
the repo.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 329-333: The call to handle_known_apistatus_errors(error,
model_id) can return None, so before calling error_response.model_dump() you
must guard against a None result: call handle_known_apistatus_errors in the
APIStatusError/OpenAIAPIStatusError branch, check if error_response is not None
and only then return HTTPException(**error_response.model_dump()); if it is
None, re-raise the original error or return a generic HTTPException (e.g., with
a 502/500 status and the original error message) so that an AttributeError is
not raised—update the code around logger.exception, request_id, error,
handle_known_apistatus_errors, and model_dump accordingly.
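
The None-guard this comment asks for can be sketched as below. ErrorResponse, handle_known_apistatus_errors, and map_apistatus_error are stand-ins or hypothetical names here; only the guard pattern itself is the point.

```python
# Sketch: guard against handle_known_apistatus_errors returning None so we
# never call .model_dump() on None; fall back to a generic 502 instead.
from dataclasses import dataclass
from typing import Optional


class HTTPException(Exception):
    """Minimal stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class ErrorResponse:
    status_code: int
    detail: str

    def model_dump(self) -> dict:
        return {"status_code": self.status_code, "detail": self.detail}


def handle_known_apistatus_errors(error: Exception, model_id: str) -> Optional[ErrorResponse]:
    """Stand-in: return a response for recognized statuses, else None."""
    if getattr(error, "status_code", None) == 429:
        return ErrorResponse(429, f"Rate limited on {model_id}")
    return None


def map_apistatus_error(error: Exception, model_id: str) -> HTTPException:
    error_response = handle_known_apistatus_errors(error, model_id)
    if error_response is not None:
        return HTTPException(**error_response.model_dump())
    # None would otherwise crash with AttributeError; fall back to 502.
    return HTTPException(502, f"Upstream error from model {model_id}: {error}")
```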


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0aa4cd and 0c8bdb9.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py

@tisnik (Contributor) commented Mar 2, 2026

@major please rebase first

Signed-off-by: Major Hayden <major@redhat.com>
@major force-pushed the major/rlsapi-v1-simplify-types branch from 0c8bdb9 to 27894f8 on March 2, 2026 at 17:05
@major (author) commented Mar 2, 2026

@tisnik Done! ✅

@major force-pushed the major/rlsapi-v1-simplify-types branch from 27894f8 to 9241e25 on March 2, 2026 at 17:08
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/endpoints/rlsapi_v1.py (1)

203-216: ⚠️ Potential issue | 🔴 Critical

Use resolved_model_id for usage extraction to avoid None and fix CI failure.

At Line 215, extract_token_usage receives model_id (str | None) instead of the resolved non-optional value. This matches the reported Pyright failure and can also produce incorrect telemetry when model_id is omitted.

🐛 Proposed fix
-    extract_token_usage(response.usage, model_id)
+    extract_token_usage(response.usage, resolved_model_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 203 - 216, The code calls
extract_token_usage(response.usage, model_id) with the optional model_id which
can be None; update the call to pass the resolved non-optional resolved_model_id
instead so usage extraction always gets the actual model string. Locate the
block where resolved_model_id is set and the client.responses.create call
(symbols: resolved_model_id, model_id, extract_token_usage) and replace the
second argument to extract_token_usage with resolved_model_id.
🧹 Nitpick comments (2)
src/app/endpoints/rlsapi_v1.py (2)

181-182: Use Optional[...] for optional annotations to match repo typing convention.

The changed signatures use str | None / HTTPException | None, while the repo guideline asks for Optional[Type] for optionals.

🔧 Proposed typing alignment
-from typing import Annotated, Any, cast
+from typing import Annotated, Any, Optional, cast
...
-    model_id: str | None = None,
+    model_id: Optional[str] = None,
...
-) -> HTTPException | None:
+) -> Optional[HTTPException]:

As per coding guidelines: "Use Optional[Type] for optional type annotations."

Also applies to: 300-302

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 181 - 182, Change all
occurrences of union-style optional annotations like "model_id: str | None =
None" and "some_exc: HTTPException | None = None" to use typing.Optional (e.g.,
"model_id: Optional[str] = None", "some_exc: Optional[HTTPException] = None");
add "from typing import Optional" to the imports if missing; update any other
signatures in the same module that use "Type | None" (per the review note also
the occurrences around the HTTPException usage) to follow this Optional[...]
convention and run type checks to ensure no further edits are needed.

132-139: Align new helper docstrings with the repository’s Google-style format.

The new helper docstrings are brief, but they omit required sections (e.g., Args/Returns/Raises) expected by the project style.

📝 Suggested docstring shape
 def _get_base_prompt() -> str:
-    """Get the base system prompt with configuration fallback."""
+    """Get the base system prompt.
+
+    Returns:
+        The configured system prompt when available; otherwise the default
+        prompt constant.
+    """

As per coding guidelines: "Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with Parameters, Returns, Raises, and Attributes sections."

Also applies to: 225-229, 300-308

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 132 - 139, The _get_base_prompt
helper uses a brief docstring that doesn't follow the project's Google-style
conventions; update its docstring to include a short summary plus explicit Args
(if any), Returns (str), and Raises (if applicable) sections per the Google
Python docstring guide, and apply the same structured format to the other new
helper functions in this module so all helper docstrings include
Parameters/Returns/Raises as required by project style.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c8bdb9 and 27894f8.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/app/endpoints/rlsapi_v1.py (2)

53-61: Add a module-level type alias for handled exception tuple.

This constant is clear, but it should use an explicit type alias for readability and consistency with the project standard.

♻️ Proposed refactor
+InferenceHandledExceptionTypes = tuple[type[Exception], ...]
+
-_INFER_HANDLED_EXCEPTIONS = (
+_INFER_HANDLED_EXCEPTIONS: InferenceHandledExceptionTypes = (
     RuntimeError,
     APIConnectionError,
     RateLimitError,
     APIStatusError,
     OpenAIAPIStatusError,
 )
As per coding guidelines: `Define type aliases at module level for clarity`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 53 - 61, Create a module-level
type alias (e.g., HandledExceptions) for the exception-tuple type and annotate
_INFER_HANDLED_EXCEPTIONS with it; add the necessary typing imports (Type and
Tuple) and declare HandledExceptions = Tuple[Type[BaseException], ...] (or
similar) at top-level, then change the _INFER_HANDLED_EXCEPTIONS declaration to
use that alias to improve readability and follow project convention.

132-139: Upgrade new helper docstrings to full Google style sections.

These new functions have docstrings, but they don’t follow the required Google format (Parameters/Returns/Raises where applicable).

As per coding guidelines: All functions require docstrings with brief descriptions and Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with Parameters, Returns, Raises, and Attributes sections.

Also applies to: 225-229, 300-308

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/rlsapi_v1.py` around lines 132 - 139, The _get_base_prompt
function's docstring must be converted to Google-style: keep the one-line
summary, add a "Returns" section specifying it returns a str (the selected
system prompt), add a "Raises" section only if the function can raise exceptions
(otherwise explicitly state "None"), and include a "Parameters" section
indicating there are no parameters (or omit if style allows but prefer explicit
"None"); apply the same Google-style docstring changes to the other new helper
functions in this file referenced by the review so each has a brief description
plus Parameters/Returns/Raises sections and any relevant type information.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 177-182: The function signature for retrieve_simple_response (and
other places in this file where the union-with-None syntax is used, e.g., the
parameter at the later occurrence around line 302) should use typing.Optional
instead of the `| None` syntax to follow project conventions; update signatures
such as `tools: list[Any] | None` to `tools: Optional[list[Any]]` and `model_id:
str | None` to `model_id: Optional[str]`, import Optional from typing at the top
if not already present, and apply the same replacement for any other parameters
in this file that use `| None`.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27894f8 and 9241e25.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py

Signed-off-by: Major Hayden <major@redhat.com>
@major force-pushed the major/rlsapi-v1-simplify-types branch from 9241e25 to 50cc05d on March 2, 2026 at 17:27
@tisnik tisnik left a comment

LGTM

@tisnik tisnik merged commit 37829bf into lightspeed-core:main Mar 2, 2026
20 of 21 checks passed
@major major deleted the major/rlsapi-v1-simplify-types branch March 2, 2026 19:10