Ads llm 03 data poisoning v2 renaming 251 (#270)

* feat: kickoff v2 0 dir and files
* docs: init v2 training data llm03 naming change
OWASP · Feb 20, 2024 · ced8198
1 parent 0e22276
commit ced8198
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/2_0_vulns/LLM03_TrainingDataPoisoning.md b/2_0_vulns/LLM03_DataModelPoisoning.md
@@ -1,10 +1,10 @@
-## LLM03: Training Data Poisoning
+## LLM03: Data and Model Poisoning
 
 ### Description
 
 The starting point of any machine learning approach is training data, simply “raw text”. To be highly capable (e.g., have linguistic and world knowledge), this text should span a broad range of domains, genres and languages. A large language model uses deep neural networks to generate outputs based on patterns learned from training data.
 
-Training data poisoning refers to manipulation of pre-training data or data involved within the fine-tuning or embedding processes to introduce vulnerabilities (which all have unique and sometimes shared attack vectors), backdoors or biases that could compromise the model’s security, effectiveness or ethical behavior. Poisoned information may be surfaced to users or create other risks like performance degradation, downstream software exploitation and reputational damage. Even if users distrust the problematic AI output, the risks remain, including impaired model capabilities and potential harm to brand reputation.
+Data and Model Poisoning refers to manipulation of pre-training data or data involved within the fine-tuning or embedding processes to introduce vulnerabilities (which all have unique and sometimes shared attack vectors), backdoors or biases that could compromise the model’s security, effectiveness or ethical behavior. Poisoned information may be surfaced to users or create other risks like performance degradation, downstream software exploitation and reputational damage. Even if users distrust the problematic AI output, the risks remain, including impaired model capabilities and potential harm to brand reputation.
 
 - Pre-training data refers to the process of training a model based on a task or dataset.
 - Fine-tuning involves taking an existing model that has already been trained and adapting it to a narrower subject or a more focused goal by training it using a curated dataset. This dataset typically includes examples of inputs and corresponding desired outputs.
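
The description above treats manipulation of fine-tuning data (inputs paired with desired outputs) as one poisoning vector. As a rough illustration only, the sketch below shows one way such manipulation could be noticed after a dataset has been reviewed: record a SHA-256 digest per example, then compare digests when the data is received again. The record layout (`input`/`output` keys) and the helper names `record_digest`, `build_manifest`, and `find_tampered` are assumptions made for this example; they are not part of the OWASP document or of any particular library.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Return a SHA-256 digest of one example in canonical JSON form."""
    # sort_keys makes the digest independent of key order in the dict.
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records: list[dict]) -> list[str]:
    """Digest every example of a dataset that has already been reviewed."""
    return [record_digest(r) for r in records]


def find_tampered(records: list[dict], manifest: list[str]) -> list[int]:
    """Return the indices of examples whose digest differs from the manifest."""
    if len(records) != len(manifest):
        raise ValueError("dataset and manifest have different lengths")
    return [
        i
        for i, (record, digest) in enumerate(zip(records, manifest))
        if record_digest(record) != digest
    ]


# Hypothetical fine-tuning examples: an input and its desired output.
trusted = [
    {"input": "Translate 'bonjour' to English.", "output": "hello"},
    {"input": "Is this review positive? 'Great product.'", "output": "positive"},
]
manifest = build_manifest(trusted)

# Simulate a poisoned copy in which one desired output was flipped.
received = [dict(r) for r in trusted]
received[1]["output"] = "negative"

print(find_tampered(received, manifest))
```

A check like this only detects changes made after the manifest was created. It says nothing about examples that were already poisoned when the dataset was reviewed, and it does not cover pre-training or embedding data.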