Added more details and references (#414)
* Added more details and references

Signed-off-by: Krishna Sankar <[email protected]>

* Update 2_0_vulns/emerging_candidates/RetrievalAugmentedGeneration.md

Co-authored-by: Ads Dawson <[email protected]>
Signed-off-by: Krishna Sankar <[email protected]>

* Update 2_0_vulns/emerging_candidates/RetrievalAugmentedGeneration.md

Co-authored-by: Ads Dawson <[email protected]>
Signed-off-by: Krishna Sankar <[email protected]>

* Update 2_0_vulns/emerging_candidates/RetrievalAugmentedGeneration.md

Co-authored-by: Ads Dawson <[email protected]>
Signed-off-by: Krishna Sankar <[email protected]>

* Update 2_0_vulns/emerging_candidates/RetrievalAugmentedGeneration.md

Signed-off-by: Ads Dawson <[email protected]>

* Apply suggestions from code review

Signed-off-by: Ads Dawson <[email protected]>

---------

Signed-off-by: Krishna Sankar <[email protected]>
Signed-off-by: Ads Dawson <[email protected]>
Co-authored-by: Ads Dawson <[email protected]>
xsankar and GangGreenTemperTatum authored Sep 30, 2024
1 parent 9318e79 commit 4171a24
Showing 1 changed file with 23 additions and 3 deletions.
26 changes: 23 additions & 3 deletions 2_0_vulns/emerging_candidates/RetrievalAugmentedGeneration.md
@@ -29,6 +29,7 @@ Moreover, malicious actors could manipulate the external knowledge base by injec
11. **RAG Data Parameter Risk:** When documents are updated, they can make previous RAG parameters such as chunk size obsolete. For example, a fare table might add more tiers, making the table longer and rendering the original chunking obsolete (see the sketch after this list).
12. **Complexity:** RAG is computationally less intensive than fine-tuning, but as a technology it is not easier. Mechanisms like chunking, embedding, and indexing are still an art rather than a science, and there are many different RAG patterns, such as Graph RAG, Self-Reflection RAG, and other emerging variants. So, technically, it is much harder than fine-tuning.
13. **Legal and Compliance Risks:** Unauthorized use of copyrighted material for augmentation, or non-compliance with data usage policies, can lead to legal repercussions.
14. **RAG-based worm escalates RAG membership inference attacks:** Attackers can escalate RAG membership inference attacks and RAG entity extraction attacks into RAG document extraction attacks, forcing a more severe outcome than existing attacks (Ref #13).

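To make the data-parameter risk (item 11) concrete, here is a minimal sketch in plain Python; the constants and helper are hypothetical, not from any particular RAG framework. It flags documents whose updates have outgrown the corpus profile that the original chunking parameters were tuned for:

```python
# Hypothetical sketch: flag documents whose updates have outgrown the
# chunking parameters chosen at indexing time. Names and thresholds are
# illustrative, not from any specific RAG framework.

INDEXED_CHUNK_SIZE = 512   # tokens per chunk, chosen when the index was built
MAX_CHUNKS_PER_DOC = 20    # corpus profile assumed when tuning retrieval

def needs_rechunking(doc_token_count: int) -> bool:
    """Return True if the updated document no longer fits the indexed profile."""
    estimated_chunks = -(-doc_token_count // INDEXED_CHUNK_SIZE)  # ceiling division
    return estimated_chunks > MAX_CHUNKS_PER_DOC

# Example: a fare table that grew from 8,000 to 15,000 tokens after adding tiers.
for name, tokens in [("fare_table_v1", 8_000), ("fare_table_v2", 15_000)]:
    status = "re-chunk and re-embed" if needs_rechunking(tokens) else "within profile"
    print(f"{name}: {status}")
```

Running a check like this as part of the document-update pipeline keeps chunking parameters from silently going stale.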
While RAG is the focus of this entry, we will mention two vulnerabilities of another adaptation technique, fine-tuning:
1. Fine-tuning LLMs may break their safety and security alignment (Ref #2)
@@ -48,7 +49,7 @@ Information Classification: Tag and classify data within the knowledge base to c
8. Access Control: Adopt a mature end-to-end access control strategy that takes into account the RAG pipeline stages. Implement strict access permissions to sensitive data and ensure that the retrieval component respects these controls.
9. Fine-grained Access Control: Enforce fine-grained access control at the vector database level, or use granular partitioning with appropriate visibility (see the sketch after this list).
10. Audit Access Control: Regularly audit and update access control mechanisms.
11. Contextual Filtering: Implement filters that detect and block attempts to access sensitive data. For example, implement guardrails via structured, session-specific tags (Ref #12).
12. Output Monitoring: Use automated tools to detect and redact sensitive information from outputs.
13. Model Alignment Drift Detection: Reevaluate safety and security alignment after fine-tuning and RAG, through red teaming efforts.
14. Encryption: Use encryption that still supports nearest neighbor search to protect vectors from inversion and inference attacks. Use separate keys per partition to protect against cross-partition leakage.
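As a rough illustration of mitigations 8-11, the sketch below assumes a vector store whose query API accepts a metadata filter; the `store.search` signature and the `$in` filter syntax are hypothetical stand-ins for whatever your database actually provides. Retrieval is scoped to partitions the caller is cleared for, so out-of-scope chunks never reach the prompt, and each retrieval is logged for auditing:

```python
# Hypothetical sketch of retrieval-time access control (mitigations 8-11).
# The store.search() call and its filter syntax are illustrative; real
# vector databases expose metadata filtering under different names.
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    user_id: str
    clearances: frozenset  # e.g. frozenset({"public", "finance"})

def retrieve_for_user(store, user: User, query_embedding, k: int = 5):
    results = store.search(
        vector=query_embedding,
        top_k=k,
        # Fine-grained / contextual filtering (mitigations 9 and 11): only
        # chunks whose partition tag is in the user's clearances are returned.
        filter={"partition": {"$in": sorted(user.clearances)}},
    )
    # Audit trail (mitigation 10): record who retrieved which chunks.
    print(f"audit: {user.user_id} retrieved {[r['id'] for r in results]}")
    return results
```

The design choice worth noting is that filtering happens at retrieval time rather than on the generated output, so sensitive material is excluded before the model ever sees it.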
@@ -59,7 +60,12 @@ Information Classification: Tag and classify data within the knowledge base to c
19. Fallback Mechanisms: Develop strategies for the model to handle situations when the retrieval component fails or returns insufficient data.
20. Regular Security Assessments: Perform penetration testing and code audits to identify and remediate vulnerabilities.
21. Incident Response Plan: Develop and maintain a plan to respond promptly to security incidents.

22. The RAG-based worm attack (Ref #13) has a set of mitigations that are also good security practices (see the sketch after this list). They include:
    1. Database Access Control - Restrict the insertion of new documents to documents created by trusted parties and authorized entities.
    2. API Throttling - Restrict a user's number of probes to the system by limiting the number of queries a user can perform against a GenAI-powered application (and against the database used by the RAG).
    3. Thresholding - Restrict the data extracted at retrieval time by setting a minimum similarity score, limiting retrieval to relevant documents that cross the threshold.
    4. Content Size Limit - Restrict the length of user inputs.
    5. Automatic Input/Output Data Sanitization - Train dedicated classifiers to identify risky inputs and outputs, including adversarial self-replicating prompts.
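A minimal sketch of worm mitigations 2-4 above, with hypothetical limits and an injected `search_fn`; a real deployment would enforce throttling at the API gateway and persist budgets, but the logic is the same:

```python
# Hypothetical sketch of worm mitigations 2-4: API throttling, similarity
# thresholding, and a content size limit. All limits are illustrative.
from collections import defaultdict

MAX_QUERIES_PER_USER = 100   # mitigation 2: API throttling
SIMILARITY_THRESHOLD = 0.75  # mitigation 3: thresholding
MAX_INPUT_CHARS = 2_000      # mitigation 4: content size limit

_query_counts = defaultdict(int)

def guarded_retrieve(user_id: str, query: str, search_fn):
    """Apply the guardrails before running the underlying similarity search.

    search_fn(query) is assumed to return (document, similarity_score) pairs.
    """
    if len(query) > MAX_INPUT_CHARS:
        raise ValueError("query exceeds the content size limit")
    if _query_counts[user_id] >= MAX_QUERIES_PER_USER:
        raise PermissionError("query budget exhausted for this user")
    _query_counts[user_id] += 1
    # Drop low-similarity matches so only clearly relevant documents are
    # retrieved, shrinking the surface available to extraction attacks.
    return [(doc, score) for doc, score in search_fn(query)
            if score >= SIMILARITY_THRESHOLD]
```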

### Example Attack Scenarios

@@ -94,4 +100,18 @@ Information Classification: Tag and classify data within the knowledge base to c
8. [Information Leakage in Embedding Models](https://arxiv.org/abs/2004.00053)
9. [Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence](https://arxiv.org/pdf/2305.03010)
10. [Universal and Transferable Adversarial Attacks on Aligned Language Models](https://llm-attacks.org/)
11. [RLHF In the Spotlight: Problems and Limitations with Key AI Alignment Technique](https://www.maginative.com/article/rlhf-in-the-spotlight-problems-and-limitations-with-a-key-ai-alignment-technique/)
12. [AWS: Use guardrails](https://docs.aws.amazon.com/prescriptive-guidance/latest/llm-prompt-engineering-best-practices/best-practices.html#guardrails)
13. [Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking](https://arxiv.org/abs/2409.08045)
