Commit

grammatical fixes (#211)
* fixing typo

* fixing grammar on LLMs
rossja authored Oct 15, 2023
1 parent 9083191 commit 524df95
Showing 2 changed files with 6 additions and 3 deletions.
5 changes: 4 additions & 1 deletion 1_1_vulns/InsecureOutputHandling.md
@@ -1,5 +1,8 @@
## LLM02: Insecure Output Handling


### Description

Insecure Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems. Since LLM-generated content can be controlled by prompt input, this behavior is similar to providing users indirect access to additional functionality.

Insecure Output Handling differs from Overreliance in that it deals with LLM-generated outputs before they are passed downstream whereas Overreliance focuses on broader concerns around overdependence on the accuracy and appropriateness of LLM outputs.
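For illustration only, a minimal Python sketch of what such output handling can look like before LLM text reaches a downstream component; the function names, escaping policy, and allow-list here are assumptions for the example, not something this document or commit defines:

```python
# Sketch: treat LLM output as untrusted input, the same as user input.
import html
import re

def render_llm_output_as_html(llm_output: str) -> str:
    # HTML-escape the model's text so injected <script> tags or
    # event-handler attributes render as inert text, not markup.
    return html.escape(llm_output)

def validate_for_shell(llm_output: str) -> str:
    # For shell-bound output, reject rather than sanitize: permit only
    # a conservative character set and fail closed on anything else.
    if not re.fullmatch(r"[A-Za-z0-9 _.,:/-]*", llm_output):
        raise ValueError("LLM output contains disallowed characters")
    return llm_output

untrusted = '<img src=x onerror="fetch(\'//evil.example\')">'
print(render_llm_output_as_html(untrusted))  # displayed as inert text
```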
@@ -9,7 +12,7 @@ Successful exploitation of an Insecure Output Handling vulnerability can result
The following conditions can increase the impact of this vulnerability:
* The application grants the LLM privileges beyond what is intended for end users, enabling escalation of privileges or remote code execution.
* The application is vulnerable to indirect prompt injection attacks, which could allow an attacker to gain privileged access to a target user's environment.
- * 3rtd party plugins do not adequately validate inputs.
+ * 3rd party plugins do not adequately validate inputs.

### Common Examples of Vulnerability

4 changes: 2 additions & 2 deletions 1_1_vulns/TrainingDataPoisoning.md
@@ -37,7 +37,7 @@ Data poisoning is considered an integrity attack because tampering with the trai
7. Testing and Detection, by measuring the loss during the training stage and analyzing trained models to detect signs of a poisoning attack by analyzing model behavior on specific test inputs.
1. Monitoring and alerting on number of skewed responses exceeding a threshold.
2. Use of a human loop to review responses and auditing.
- 3. Implement dedicated LLM's to benchmark against undesired consequences and train other LLM's using [reinforcement learning techniques](https://wandb.ai/ayush-thakur/Intro-RLAIF/reports/An-Introduction-to-Training-LLMs-Using-Reinforcement-Learning-From-Human-Feedback-RLHF---VmlldzozMzYyNjcy).
+ 3. Implement dedicated LLMs to benchmark against undesired consequences and train other LLMs using [reinforcement learning techniques](https://wandb.ai/ayush-thakur/Intro-RLAIF/reports/An-Introduction-to-Training-LLMs-Using-Reinforcement-Learning-From-Human-Feedback-RLHF---VmlldzozMzYyNjcy).
4. Perform LLM-based [red team exercises](https://www.anthropic.com/index/red-teaming-language-models-to-reduce-harms-methods-scaling-behaviors-and-lessons-learned) or [LLM vulnerability scanning](https://github.com/leondz/garak) into the testing phases of the LLM's lifecycle.
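As a rough illustration of the monitoring-and-alerting step in item 1 above, a minimal Python sketch; the skew detector, window size, and alert threshold are all hypothetical placeholders (in practice the detector might be a benchmark LLM or a rule set):

```python
# Sketch: flag recent responses as skewed and alert past a threshold.
from collections import deque

WINDOW = 100          # examine the last 100 responses
ALERT_THRESHOLD = 10  # alert if more than 10 of them look skewed

recent_flags: deque = deque(maxlen=WINDOW)

def looks_skewed(response: str) -> bool:
    # Placeholder detector: flag responses mentioning a brand suspected
    # of being promoted via poisoned training data.
    return "acme" in response.lower()

def record_response(response: str) -> None:
    recent_flags.append(looks_skewed(response))
    skewed = sum(recent_flags)
    if skewed > ALERT_THRESHOLD:
        # In production this would page an on-call or open an incident.
        print(f"ALERT: {skewed}/{len(recent_flags)} recent responses "
              "look skewed; possible training data poisoning")
```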

### Example Attack Scenarios
@@ -59,4 +59,4 @@ Data poisoning is considered an integrity attack because tampering with the trai
8. [FedMLSecurity:arXiv:2306.04959](https://arxiv.org/abs/2306.04959): **Arxiv White Paper**
9. [The poisoning of ChatGPT](https://softwarecrisis.dev/letters/the-poisoning-of-chatgpt/): **Software Crisis Blog**
10. [Poisoning Web-Scale Training Datasets - Nicholas Carlini | Stanford MLSys #75](https://www.youtube.com/watch?v=h9jf1ikcGyk): **YouTube Video**
- 11. [OWASP CycloneDX v1.5](https://cyclonedx.org/capabilities/mlbom/): **OWASP CycloneDX**
\ No newline at end of file
+ 11. [OWASP CycloneDX v1.5](https://cyclonedx.org/capabilities/mlbom/): **OWASP CycloneDX**
