chore: restructure for new 2024 list (#447)
GangGreenTemperTatum authored Oct 23, 2024
1 parent fe01e13 commit 56b2e82
Showing 84 changed files with 197 additions and 302 deletions.
107 changes: 50 additions & 57 deletions 2_0_vulns/LLM01_PromptInjection.md

Large diffs are not rendered by default.

File renamed without changes.
@@ -1,4 +1,4 @@
## LLM03: Data and Model Poisoning
## LLM04: Data and Model Poisoning

### Description

File renamed without changes.
File renamed without changes.
File renamed without changes.
11 changes: 7 additions & 4 deletions 2_0_vulns/LLM09_Misinformation.md
@@ -1,4 +1,4 @@
## Misinformation
## LLM09:2025 Misinformation

### Description

@@ -16,22 +16,24 @@ A related issue is overreliance. Overreliance occurs when users place excessive
4. **Unsafe Code Generation:** The model suggests insecure or non-existent code libraries, which can introduce vulnerabilities when integrated into software systems. For example, LLMs may propose insecure third-party libraries which, if trusted without verification, lead to security risks ([Lasso](https://www.lasso.security/blog/ai-package-hallucinations)).

### Prevention and Mitigation Strategies

1. **Retrieval-Augmented Generation (RAG):** Use Retrieval-Augmented Generation to enhance the reliability of model outputs by retrieving relevant and verified information from trusted external databases during response generation. This helps mitigate the risk of hallucinations and misinformation.
2. **Model Fine-Tuning:** Enhance the model with fine-tuning or embeddings to improve output quality. Techniques such as parameter-efficient tuning (PET) and chain-of-thought prompting can help reduce the incidence of misinformation.
3. **Cross-Verification and Human Oversight:** Encourage users to cross-check LLM outputs with trusted external sources to ensure the accuracy of the information. Implement human oversight and fact-checking processes, especially for critical or sensitive information. Ensure that human reviewers are properly trained to avoid overreliance on AI-generated content.
4. **Automatic Validation Mechanisms:** Implement tools and processes to automatically validate key outputs, especially outputs produced in high-stakes environments (see the sketch after this list).
5. **Risk Communication:** Identify the risks and possible harms associated with LLM-generated content, then clearly communicate these risks and limitations to users, including the potential for misinformation.
6. **Secure Coding Practices:** Establish secure coding practices to prevent the integration of vulnerabilities due to incorrect code suggestions.
7. **User Interface Design:** Design APIs and user interfaces that encourage responsible use of LLMs, such as integrating content filters, clearly labeling AI-generated content, and informing users of the limitations of reliability and accuracy. Be specific about limitations on the intended field of use.
8. **Training and Education:** Provide comprehensive training for users on the limitations of LLMs, the importance of independent verification of generated content, and the need for critical thinking. In specific contexts, offer domain-specific training to ensure users can effectively evaluate LLM outputs within their field of expertise.
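
As a minimal, hypothetical sketch of strategy 4 above (not part of the source document), the following Python snippet flags LLM-generated sentences for human review when they are not sufficiently similar to any passage in a trusted reference set. The similarity threshold, the helper names, and the example passages are illustrative assumptions.

```python
import difflib
import re


def sentence_supported(sentence: str, passages: list[str], threshold: float = 0.55) -> bool:
    """Return True if the sentence closely matches at least one trusted passage."""
    for passage in passages:
        ratio = difflib.SequenceMatcher(None, sentence.lower(), passage.lower()).ratio()
        if ratio >= threshold:
            return True
    return False


def validate_answer(answer: str, trusted_passages: list[str]) -> list[str]:
    """Return the sentences in an LLM answer that lack support and need human review."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not sentence_supported(s, trusted_passages)]


if __name__ == "__main__":
    # Hypothetical trusted passage and model answer for demonstration only.
    trusted = ["Air Canada must honour the bereavement fare policy its chatbot described."]
    answer = (
        "Air Canada must honour the bereavement fare policy its chatbot described. "
        "Refunds are always granted within 24 hours."
    )
    for claim in validate_answer(answer, trusted):
        print("Needs human review:", claim)
```

In practice the naive string similarity would be replaced with a domain-appropriate retrieval or fact-checking step; the point is only that unsupported claims are routed to a human reviewer rather than shown as-is.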

### Example Attack Scenarios

**Scenario #1:** Attackers experiment with popular coding assistants to find commonly hallucinated package names. Once they identify these frequently suggested but nonexistent libraries, they publish malicious packages with those names to widely used repositories. Developers, relying on the coding assistant's suggestions, unknowingly integrate these poisoned packages into their software. As a result, the attackers gain unauthorized access, inject malicious code, or establish backdoors, leading to significant security breaches and compromising user data.
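
As one hedged illustration (not from the source document), a build pipeline can at least confirm that every assistant-suggested dependency actually exists on the public registry before it is installed. The package names below are hypothetical; the sketch assumes PyPI's public JSON metadata endpoint.

```python
import json
import urllib.error
import urllib.request

PYPI_URL = "https://pypi.org/pypi/{name}/json"  # public PyPI JSON metadata endpoint


def package_exists(name: str) -> bool:
    """Return True if the package is published on PyPI, False if the name is unknown."""
    try:
        with urllib.request.urlopen(PYPI_URL.format(name=name), timeout=10) as resp:
            json.load(resp)  # parse to confirm a well-formed metadata response
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


if __name__ == "__main__":
    # Hypothetical names proposed by a coding assistant.
    suggested = ["requests", "totally-nonexistent-helper-lib"]
    for pkg in suggested:
        status = "exists on PyPI" if package_exists(pkg) else "NOT on PyPI; do not install blindly"
        print(f"{pkg}: {status}")
```

Existence alone is not proof of safety (in this scenario the attacker has published the package), so the check is only a first gate before reviewing metadata such as upload date, maintainer history, and download counts.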

**Scenario #2:** A company provides a chatbot for medical diagnosis without ensuring sufficient accuracy. The chatbot provides poor information, leading to harmful consequences for patients, and the company is successfully sued for damages. The safety and security breakdown arose not from a malicious attacker but from insufficient oversight and reliability of the LLM system; no active attacker is needed for the company to be exposed to reputational and financial damage.

### Reference Links

1. [AI Chatbots as Health Information Sources: Misrepresentation of Expertise](https://www.kff.org/health-misinformation-monitor/volume-05/): **KFF**
2. [Air Canada Chatbot Misinformation: What Travellers Should Know](https://www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know): **BBC**
3. [ChatGPT Fake Legal Cases: Generative AI Hallucinations](https://www.legaldive.com/news/chatgpt-fake-legal-cases-generative-ai-hallucinations/651557/): **LegalDive**
@@ -43,8 +45,9 @@ A related issue is overreliance. Overreliance occurs when users place excessive
9. [How to Reduce the Hallucinations from Large Language Models](https://thenewstack.io/how-to-reduce-the-hallucinations-from-large-language-models/): **The New Stack**
10. [Practical Steps to Reduce Hallucination](https://newsletter.victordibia.com/p/practical-steps-to-reduce-hallucination): **Victor Dibia**
11. [A Framework for Exploring the Consequences of AI-Mediated Enterprise Knowledge](https://www.microsoft.com/en-us/research/publication/a-framework-for-exploring-the-consequences-of-ai-mediated-enterprise-knowledge-access-and-identifying-risks-to-workers/): **Microsoft**

### Related Frameworks and Taxonomies

Refer to this section for comprehensive information, scenarios, and strategies relating to infrastructure deployment, applied environment controls, and other best practices.

- [AML.T0048.002 - Societal Harm](https://atlas.mitre.org/techniques/AML.T0048) **MITRE ATLAS**
File renamed without changes.
48 changes: 0 additions & 48 deletions 2_0_vulns/emerging_candidates/BackdoorAttacks.md

This file was deleted.

10 changes: 0 additions & 10 deletions 2_0_vulns/emerging_candidates/README.md

This file was deleted.
