-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
feat: add GraySwan Guardrails support #15756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
ishaan-jaff
merged 1 commit into
BerriAI:main
from
uc4w6c:feat/add_grayswan_guardrails_support
Oct 21, 2025
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,147 @@ | ||
| import Tabs from '@theme/Tabs'; | ||
| import TabItem from '@theme/TabItem'; | ||
|
|
||
| # GraySwan Cygnal Guardrail | ||
|
|
||
| Use [GraySwan Cygnal](https://docs.grayswan.ai/cygnal/monitor-requests) to continuously monitor conversations for policy violations, indirect prompt injection (IPI), jailbreak attempts, and other safety risks. | ||
|
|
||
| Cygnal returns a `violation` score between `0` and `1` (higher means more likely to violate policy), plus metadata such as violated rule indices, mutation detection, and IPI flags. LiteLLM can automatically block or monitor requests based on this signal. | ||
|
|
||
| --- | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### 1. Obtain Credentials | ||
|
|
||
| 1. Create a GraySwan account and generate a Cygnal API key. | ||
| 2. Configure environment variables for the LiteLLM proxy host: | ||
|
|
||
| ```bash | ||
| export GRAYSWAN_API_KEY="your-grayswan-key" | ||
| ``` | ||
|
|
||
| ### 2. Configure `config.yaml` | ||
|
|
||
| Add a guardrail entry that references the GraySwan integration. Below is a balanced example that monitors both input and output but only blocks once the violation score reaches the configured threshold. | ||
|
|
||
| ```yaml | ||
| model_list: | ||
| - model_name: openai/gpt-4.1-mini | ||
| litellm_params: | ||
| model: openai/gpt-4.1-mini | ||
| api_key: os.environ/OPENAI_API_KEY | ||
|
|
||
| guardrails: | ||
| - guardrail_name: "cygnal-monitor" | ||
| litellm_params: | ||
| guardrail: grayswan | ||
| mode: [pre_call, post_call] # monitor both input and output | ||
| api_key: os.environ/GRAYSWAN_API_KEY | ||
| optional_params: | ||
| on_flagged_action: monitor # or "block" | ||
| violation_threshold: 0.5 # score >= threshold is flagged | ||
| reasoning_mode: hybrid # off | hybrid | thinking | ||
| categories: | ||
| safety: "Detect jailbreaks and policy violations" | ||
| policy_id: "your-cygnal-policy-id" | ||
| default_on: true | ||
|
|
||
| general_settings: | ||
| master_key: "your-litellm-master-key" | ||
|
|
||
| litellm_settings: | ||
| set_verbose: true | ||
| ``` | ||
|
|
||
| ### 3. Launch the Proxy | ||
|
|
||
| ```bash | ||
| litellm --config config.yaml --port 4000 | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Choosing Guardrail Modes | ||
|
|
||
| GraySwan can run during `pre_call`, `during_call`, and `post_call` stages. Combine modes based on your latency and coverage requirements. | ||
|
|
||
| | Mode | When it Runs | Protects | Typical Use Case | | ||
| |--------------|-------------------|-----------------------|------------------| | ||
| | `pre_call` | Before LLM call | User input only | Block prompt injection before it reaches the model | | ||
| | `during_call`| Parallel to call | User input only | Low-latency monitoring without blocking | | ||
| | `post_call` | After response | Full conversation | Scan output for policy violations, leaked secrets, or IPI | | ||
|
|
||
| <Tabs> | ||
| <TabItem value="monitor" label="Monitor Only"> | ||
|
|
||
| ```yaml | ||
| guardrails: | ||
| - guardrail_name: "cygnal-monitor-only" | ||
| litellm_params: | ||
| guardrail: grayswan | ||
| mode: "during_call" | ||
| api_key: os.environ/GRAYSWAN_API_KEY | ||
| optional_params: | ||
| on_flagged_action: monitor | ||
| violation_threshold: 0.6 | ||
| default_on: true | ||
| ``` | ||
|
|
||
| Best for visibility without blocking. Alerts are logged via LiteLLM’s standard logging callbacks. | ||
|
|
||
| </TabItem> | ||
| <TabItem value="block-input" label="Block Input"> | ||
|
|
||
| ```yaml | ||
| guardrails: | ||
| - guardrail_name: "cygnal-block-input" | ||
| litellm_params: | ||
| guardrail: grayswan | ||
| mode: "pre_call" | ||
| api_key: os.environ/GRAYSWAN_API_KEY | ||
| optional_params: | ||
| on_flagged_action: block | ||
| violation_threshold: 0.4 | ||
| categories: | ||
| pii: "Detect sensitive data" | ||
| default_on: true | ||
| ``` | ||
|
|
||
| Stops malicious or sensitive prompts before any tokens are generated. | ||
|
|
||
| </TabItem> | ||
| <TabItem value="full-coverage" label="Full Coverage"> | ||
|
|
||
| ```yaml | ||
| guardrails: | ||
| - guardrail_name: "cygnal-full-coverage" | ||
| litellm_params: | ||
| guardrail: grayswan | ||
| mode: [pre_call, post_call] | ||
| api_key: os.environ/GRAYSWAN_API_KEY | ||
| optional_params: | ||
| on_flagged_action: block | ||
| violation_threshold: 0.5 | ||
| reasoning_mode: thinking | ||
| policy_id: "policy-id-from-grayswan" | ||
| default_on: true | ||
| ``` | ||
|
|
||
| Provides the strongest enforcement by inspecting both prompts and responses. | ||
|
|
||
| </TabItem> | ||
| </Tabs> | ||
|
|
||
| --- | ||
|
|
||
| ## Configuration Reference | ||
|
|
||
| | Parameter | Type | Description | | ||
| |---------------------------------------|-----------------|-------------| | ||
| | `api_key` | string | GraySwan Cygnal API key. Reads from `GRAYSWAN_API_KEY` if omitted. | | ||
| | `mode` | string or list | Guardrail stages (`pre_call`, `during_call`, `post_call`). | | ||
| | `optional_params.on_flagged_action` | string | `monitor` (log only) or `block` (raise `HTTPException`). | | ||
| | `.optional_params.violation_threshold`| number (0-1) | Scores at or above this value are considered violations. | | ||
| | `optional_params.reasoning_mode` | string | `off`, `hybrid`, or `thinking`. Enables Cygnal’s reasoning capabilities. | | ||
| | `optional_params.categories` | object | Map of custom category names to descriptions. | | ||
| | `optional_params.policy_id` | string | GraySwan policy identifier. | | ||
74 changes: 74 additions & 0 deletions
74
litellm/proxy/guardrails/guardrail_hooks/grayswan/__init__.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,74 @@ | ||
| """GraySwan Cygnal guardrail integration for LiteLLM.""" | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| from litellm.types.guardrails import SupportedGuardrailIntegrations | ||
|
|
||
| from .grayswan import ( | ||
| GraySwanGuardrail, | ||
| GraySwanGuardrailAPIError, | ||
| GraySwanGuardrailMissingSecrets, | ||
| ) | ||
|
|
||
| if TYPE_CHECKING: | ||
| from litellm.types.guardrails import Guardrail, LitellmParams | ||
|
|
||
|
|
||
| def initialize_guardrail( | ||
| litellm_params: "LitellmParams", guardrail: "Guardrail" | ||
| ) -> GraySwanGuardrail: | ||
| import litellm | ||
|
|
||
| guardrail_name = guardrail.get("guardrail_name") | ||
| if not guardrail_name: | ||
| raise ValueError("GraySwan guardrail requires a guardrail_name") | ||
|
|
||
| optional_params = getattr(litellm_params, "optional_params", None) | ||
|
|
||
| grayswan_guardrail = GraySwanGuardrail( | ||
| guardrail_name=guardrail_name, | ||
| api_key=litellm_params.api_key, | ||
| api_base=litellm_params.api_base, | ||
| on_flagged_action=_get_config_value( | ||
| litellm_params, optional_params, "on_flagged_action" | ||
| ), | ||
| violation_threshold=_get_config_value( | ||
| litellm_params, optional_params, "violation_threshold" | ||
| ), | ||
| reasoning_mode=_get_config_value( | ||
| litellm_params, optional_params, "reasoning_mode" | ||
| ), | ||
| categories=_get_config_value(litellm_params, optional_params, "categories"), | ||
| policy_id=_get_config_value(litellm_params, optional_params, "policy_id"), | ||
| event_hook=litellm_params.mode, | ||
| default_on=litellm_params.default_on, | ||
| ) | ||
|
|
||
| litellm.logging_callback_manager.add_litellm_callback(grayswan_guardrail) | ||
| return grayswan_guardrail | ||
|
|
||
|
|
||
| def _get_config_value(litellm_params, optional_params, attribute_name): | ||
| if optional_params is not None: | ||
| value = getattr(optional_params, attribute_name, None) | ||
| if value is not None: | ||
| return value | ||
| return getattr(litellm_params, attribute_name, None) | ||
|
|
||
|
|
||
| guardrail_initializer_registry = { | ||
| SupportedGuardrailIntegrations.GRAYSWAN.value: initialize_guardrail, | ||
| } | ||
|
|
||
|
|
||
| guardrail_class_registry = { | ||
| SupportedGuardrailIntegrations.GRAYSWAN.value: GraySwanGuardrail, | ||
| } | ||
|
|
||
|
|
||
| __all__ = [ | ||
| "GraySwanGuardrail", | ||
| "GraySwanGuardrailAPIError", | ||
| "GraySwanGuardrailMissingSecrets", | ||
| "initialize_guardrail", | ||
| ] |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use "Gray Swan" throughout the doc instead of "GraySwan"