v2 Excessive Agency #418

Closed
wants to merge 23 commits into from

Commits
3274c37
Create FUNDING.yml
SClinton Jun 6, 2024
46cbb72
Rename ISOIEC20547-4:2020.md (#401)
rot169 Sep 7, 2024
3715ce0
Create folder for the solutions landscape document to be publisheddoc…
SClinton Sep 7, 2024
f2ab5c6
Create folder for CoE guide
SClinton Sep 7, 2024
a2ac111
Create folder to contain initiative outputs and artifacts
SClinton Sep 7, 2024
c9f30d2
Merge pull request #402 from SClinton/main
virtualsteve-star Sep 13, 2024
9da9245
Added new potential Candidate
xsankar Sep 23, 2024
3059406
Merge pull request #403 from xsankar/main
virtualsteve-star Sep 23, 2024
eea6181
Create readme.md
SClinton Sep 25, 2024
264a7e2
Update readme.md
SClinton Sep 25, 2024
6a26973
Merge pull request #406 from SClinton/patch-6
virtualsteve-star Sep 25, 2024
749910e
Merge pull request #405 from SClinton/patch-4
virtualsteve-star Sep 25, 2024
064fa6f
Update LLM02_InsecureOutputHandling.md (#404)
kenhuangus Sep 26, 2024
f360f53
chore: rename to unbounded consumption (#407)
GangGreenTemperTatum Sep 26, 2024
027b0a4
chore: Ads/merge unbounded consumption v2 (#408)
GangGreenTemperTatum Sep 26, 2024
ac913eb
chore: add rachel to codeowners for pi (#409)
GangGreenTemperTatum Sep 26, 2024
734dc8f
docs: submit backdoor attacks emerging candidate (#411)
GangGreenTemperTatum Sep 27, 2024
7769865
chore: add placeholder emerging candidates (#412)
GangGreenTemperTatum Sep 27, 2024
38b0109
feat: prompt injection v2 2024 list rewrite (#413)
GangGreenTemperTatum Sep 27, 2024
9318e79
Update LLM02_InsecureOutputHandling.md (#415)
kenhuangus Sep 30, 2024
4171a24
Added more details and references (#414)
xsankar Sep 30, 2024
d57e6f8
fix: broken ref links (#416)
GangGreenTemperTatum Oct 1, 2024
55ad15b
feat: ads add system prompt leakage (#417)
GangGreenTemperTatum Oct 2, 2024
103 changes: 68 additions & 35 deletions 2_0_vulns/LLM01_PromptInjection.md

Large diffs are not rendered by default.

38 changes: 21 additions & 17 deletions 2_0_vulns/LLM02_InsecureOutputHandling.md
@@ -1,36 +1,39 @@
## LLM02: Insecure Output Handling


### Description

Insecure Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems. Since LLM-generated content can be controlled by prompt input, this behavior is similar to providing users indirect access to additional functionality.

Insecure Output Handling differs from Overreliance in that it deals with LLM-generated outputs before they are passed downstream whereas Overreliance focuses on broader concerns around overdependence on the accuracy and appropriateness of LLM outputs.

Successful exploitation of an Insecure Output Handling vulnerability can result in XSS and CSRF in web browsers as well as SSRF, privilege escalation, or remote code execution on backend systems.

The following conditions can increase the impact of this vulnerability:
- The application grants the LLM privileges beyond what is intended for end users, enabling escalation of privileges or remote code execution.
- The application is vulnerable to indirect prompt injection attacks, which could allow an attacker to gain privileged access to a target user's environment.
- Third-party plugins do not adequately validate inputs.
- Lack of proper output encoding for different contexts (e.g., HTML, JavaScript, SQL).
- Insufficient monitoring and logging of LLM outputs.
- Absence of rate limiting or anomaly detection for LLM usage.

### Common Examples of Vulnerability

- LLM output is entered directly into a system shell or similar function such as exec or eval, resulting in remote code execution (see the sketch after this list).
- JavaScript or Markdown is generated by the LLM and returned to a user. The code is then interpreted by the browser, resulting in XSS.
- LLM-generated SQL queries are executed without proper parameterization, leading to SQL injection.
- LLM output is used to construct file paths without proper sanitization, potentially resulting in path traversal vulnerabilities.
- LLM-generated content is used in email templates without proper escaping, potentially leading to phishing attacks.
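
As an illustration of the first example above, the following minimal Python sketch (the command allowlist and helper names are illustrative assumptions, not part of any particular framework) contrasts passing LLM output straight to a shell with validating it first and avoiding shell interpretation:

```python
import shlex
import subprocess

# Hypothetical allowlist of commands the application actually needs.
ALLOWED_COMMANDS = {"df", "uptime", "ls"}

def run_llm_command_unsafe(llm_output: str) -> str:
    # VULNERABLE: the model's output is executed verbatim in a shell, so a
    # prompt-injected response such as "uptime; rm -rf /" runs as written.
    return subprocess.run(llm_output, shell=True,
                          capture_output=True, text=True).stdout

def run_llm_command_safer(llm_output: str) -> str:
    # Treat the model as an untrusted user: parse its output, validate it
    # against the allowlist, and never hand the raw string to a shell.
    parts = shlex.split(llm_output)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"Command not permitted: {llm_output!r}")
    return subprocess.run(parts, shell=False,
                          capture_output=True, text=True).stdout
```

Even the safer variant should run with the minimum privileges the feature needs; the allowlist only narrows the attack surface, it does not remove it.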

### Prevention and Mitigation Strategies

- Treat the model as any other user, adopting a zero-trust approach, and apply proper input validation on responses coming from the model to backend functions.
- Follow the OWASP ASVS (Application Security Verification Standard) guidelines to ensure effective input validation and sanitization.
- Encode model output back to users to mitigate undesired code execution by JavaScript or Markdown. OWASP ASVS provides detailed guidance on output encoding.
- Implement context-aware output encoding based on where the LLM output will be used (e.g., HTML encoding for web content, SQL escaping for database queries).
- Use parameterized queries or prepared statements for all database operations involving LLM output (see the sketch after this list).
- Employ strict Content Security Policies (CSP) to mitigate the risk of XSS attacks from LLM-generated content.
- Implement robust logging and monitoring systems to detect unusual patterns in LLM outputs that might indicate exploitation attempts.
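
A minimal sketch of the context-aware encoding and parameterization points above, using only Python's standard library and treating the model's text as untrusted data (the table and column names are illustrative assumptions):

```python
import html
import sqlite3

def render_for_web(llm_output: str) -> str:
    # HTML-encode before inserting model output into a page, so an injected
    # <script> payload is displayed as text instead of being executed.
    return f"<p>{html.escape(llm_output)}</p>"

def find_articles(conn: sqlite3.Connection, llm_search_term: str) -> list:
    # Parameterized query: the model's text is bound as data and is never
    # concatenated into the SQL statement itself.
    cur = conn.execute(
        "SELECT id, title FROM articles WHERE title LIKE ?",
        (f"%{llm_search_term}%",),
    )
    return cur.fetchall()
```

Pairing output encoding with a restrictive Content Security Policy gives defense in depth if an encoding step is ever missed.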

### Example Attack Scenarios

1. An application utilizes an LLM plugin to generate responses for a chatbot feature. The plugin also offers a number of administrative functions accessible to another privileged LLM. The general purpose LLM directly passes its response, without proper output validation, to the plugin causing the plugin to shut down for maintenance.
2. A user utilizes a website summarizer tool powered by an LLM to generate a concise summary of an article. The website includes a prompt injection instructing the LLM to capture sensitive content from either the website or from the user's conversation. From there the LLM can encode the sensitive data and send it, without any output validation or filtering, to an attacker-controlled server.
3. An LLM allows users to craft SQL queries for a backend database through a chat-like feature. A user requests a query to delete all database tables. If the crafted query from the LLM is not scrutinized, then all database tables will be deleted.
4. A web app uses an LLM to generate content from user text prompts without output sanitization. An attacker could submit a crafted prompt causing the LLM to return an unsanitized JavaScript payload, leading to XSS when rendered on a victim's browser. Insufficient validation of prompts enabled this attack.
5. An LLM is used to generate dynamic email templates for a marketing campaign. An attacker manipulates the LLM to include malicious JavaScript within the email content. If the application doesn't properly sanitize the LLM output, this could lead to XSS attacks on recipients who view the email in vulnerable email clients.
6. An LLM is used to generate code from natural language inputs in a software company, aiming to streamline development tasks. While efficient, this approach risks exposing sensitive information, creating insecure data handling methods, or introducing vulnerabilities like SQL injection. The AI may also hallucinate non-existent software packages, potentially leading developers to download malware-infected resources. Thorough code review and verification of suggested packages are crucial to prevent security breaches, unauthorized access, and system compromises (a verification sketch follows these scenarios).
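
For the last scenario, one way to catch hallucinated dependencies is to confirm that an LLM-suggested package actually exists in the package index before installing it. The sketch below queries PyPI's public JSON API; treating a missing package as a hard failure is an assumed project policy, not a general requirement:

```python
import json
import urllib.error
import urllib.request

def package_exists_on_pypi(name: str) -> bool:
    # Query PyPI's JSON API; a 404 means the package the model suggested does
    # not exist and may be a hallucination (or a typosquatting opportunity).
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            json.load(resp)  # parse to confirm a well-formed response
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Existence alone is not proof of safety; reviewing a package's age, maintainers, and download history before adding it to a build remains necessary.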

### Reference Links

@@ -40,3 +43,4 @@ The following conditions can increase the impact of this vulnerability:
4. [Don’t blindly trust LLM responses. Threats to chatbots](https://embracethered.com/blog/posts/2023/ai-injections-threats-context-matters/): **Embrace The Red**
5. [Threat Modeling LLM Applications](https://aivillage.org/large%20language%20models/threat-modeling-llm/): **AI Village**
6. [OWASP ASVS - 5 Validation, Sanitization and Encoding](https://owasp-aasvs4.readthedocs.io/en/latest/V5.html#validation-sanitization-and-encoding): **OWASP AASVS**
7. [AI hallucinates software packages and devs download them – even if potentially poisoned with malware](https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/): **The Register**
49 changes: 0 additions & 49 deletions 2_0_vulns/LLM04_ModelDoS.md

This file was deleted.

@@ -1,4 +1,4 @@
## Unbounded Consumption

**Author(s):** [Ads - GangGreenTemperTatum](https://github.com/GangGreenTemperTatum)
<br>
@@ -10,9 +10,9 @@

### Description

Unbounded Consumption refers to the process where a Large Language Model (LLM) generates outputs based on input queries or prompts. Inference is a critical function of LLMs, involving the application of learned patterns and knowledge to produce relevant responses or predictions.

Unbounded Consumption occurs when a Large Language Model (LLM) application allows users to conduct excessive and uncontrolled inferences, leading to potential risks such as denial of service (DoS), economic losses, model or intellectual property theft, and degradation of service. This vulnerability is exacerbated by the high computational demands of LLMs, often deployed in cloud environments, making them susceptible to various forms of resource exploitation and unauthorized usage.
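
The mitigation section of this draft is not included in the diff, but a minimal per-client rate limiter illustrates the kind of control that bounds inference consumption. The sketch below uses an in-memory token bucket; the limits, client identifier, and the call_model stub are illustrative assumptions:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Allow up to `capacity` requests per client, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float = 20.0, rate: float = 0.5):
        self.capacity = capacity
        self.rate = rate
        self.tokens = defaultdict(lambda: capacity)
        self.last_seen = defaultdict(time.monotonic)

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_seen[client_id]
        self.last_seen[client_id] = now
        # Refill tokens based on elapsed time, capped at capacity,
        # then spend one token if available.
        self.tokens[client_id] = min(self.capacity,
                                     self.tokens[client_id] + elapsed * self.rate)
        if self.tokens[client_id] >= 1.0:
            self.tokens[client_id] -= 1.0
            return True
        return False

def call_model(prompt: str) -> str:
    # Placeholder for the real LLM inference call.
    return f"(model response to: {prompt})"

limiter = TokenBucket()

def handle_inference(client_id: str, prompt: str) -> str:
    if not limiter.allow(client_id):
        return "Rate limit exceeded; please retry later."
    return call_model(prompt)
```

A production deployment would typically back this with shared state (for example Redis) and combine it with quotas, input-size limits, and anomaly detection rather than relying on a single in-process limiter.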

### Common Examples of Vulnerability
