From 1df32c4fb5d94f08ce683e95704ecc578b5862ea Mon Sep 17 00:00:00 2001
From: Aryaman Behera <56792979+aryaman-titan@users.noreply.github.com>
Date: Sun, 9 Jun 2024 02:57:49 +0530
Subject: [PATCH] Update KrishnaSankar_FineTuningRag.md - add examples of RAG poisoning (#360)

---
 2_0_candidates/KrishnaSankar_FineTuningRag.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/2_0_candidates/KrishnaSankar_FineTuningRag.md b/2_0_candidates/KrishnaSankar_FineTuningRag.md
index 8fa3a5f9..9296e841 100644
--- a/2_0_candidates/KrishnaSankar_FineTuningRag.md
+++ b/2_0_candidates/KrishnaSankar_FineTuningRag.md
@@ -21,13 +21,14 @@ The risks and vulnerabilities range from breaking safety and alignment to outdat
 6. RAG can bypass access controls - data from different disparate sources might find their way into a central vector db and a query might traverse all of them without regard to the access restrictions
 7. RAG- outdated data/data obsolescence risk - this is more pronounced in customer service, operating procedures and so forth. Usually people update documents and they upload to a common place for others to refer to. With RAG and VectorDB, it is not that simple - documents need to be validated, added to the embedding pipeline and follow from there. Then the system needs to be tested as a new document might trigger some unknown response from an LLM. (See Knowledge mediated Risk)
 8. RAG Data parameter risk - when documents are updated they might make the RAG parameters like chunk size obsolete. For example a fare table might add more tiers making the table longer, thus the original chunking becomes obsolete.
+9. Jailbreak through RAG poisoning - adversaries can inject malicious trigger payloads into dynamic knowledge-base that gets updated daily (like slack chat records, Github PR etc.). This can not just break the safety alignment leading LLM to blurt out harmful responses, but also disrupt the intended functionality of the application, in some cases making the rest of the knowledge-base redundant. (Ref #3)
 
 
 ### Prevention and Mitigation Strategies
 
 1. There should be processes in place to improve the quality and concurrency of RAG knowledge sources
 2. A mature end-to-end access control strategy that takes into account the RAG pipeline stages
-3. Reevaluate safety and security alignment after fine tuning and RAG
+3. Reevaluate safety and security alignment after fine tuning and RAG, through red teaming efforts.
 4. When combining data from different sources, do a thorough review of the combined dataset in the VectorDb
 5. Have fine grained access control at the VectorDb level or have granular partition and appropriate visibility
 6. For fine tuning and RAG validate all documents and data for hidden codes, data poisoning et al
@@ -39,6 +40,8 @@ Scenario #1: Resume Data Poisoning
 
 Scenario #2: Access control risk by combining data with different access restrictions in a vector db
 
+Scenario #3: Allowing UGC (user-generated content) in comment section of a webpage poisons the overall knowledge-base (Ref #3), over which the RAG is running, leading to compromise in integrity of the application.
+
 ### Reference Links
 
 
@@ -47,4 +50,4 @@ Scenario #2: Access control risk by combining data with different access restric
 3. [How RAG Poisoning Made Llama3 Racist!](https://blog.repello.ai/how-rag-poisoning-made-llama3-racist-1c5e390dd564)
 4. [Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!](https://openreview.net/forum?id=hTEGyKf0dZ)
 5. [How RAG Architecture Overcomes LLM Limitations](https://thenewstack.io/how-rag-architecture-overcomes-llm-limitations/)
-6. [What are the risks of RAG applications?](https://www.robustintelligence.com/solutions/rag-security)
\ No newline at end of file
+6. [What are the risks of RAG applications?](https://www.robustintelligence.com/solutions/rag-security)