🎉🎉🎉 Welcome to the LLM Security Research Repository, your premier destination for the latest and most comprehensive research on LLM security!
As the field of artificial intelligence rapidly evolves, so do the security challenges associated with large models. Our repository is at the forefront of this emerging discipline, providing a comprehensive and constantly updated collection of research papers. It stands out for several key reasons:
- 🔥 Cutting-Edge Security Research: Focused on the newest and most innovative security studies targeting large language models, ensuring you stay ahead in this critical area.
- ⏰️ Real-Time Updates: Our repository is updated in real time, offering the most current and relevant research findings as they become available.
- 📚️ Comprehensive Coverage: We aim to cover all aspects of large model security, from theoretical foundations to practical applications, and our collection will continually expand to include every facet of this essential field.
- 🇨🇳 High-Quality Translated Content: Access high-quality Chinese translations of the papers, catering to a global audience and promoting cross-language collaboration.
- 🌟 Big Model Summary: Domain experts collaborate with a ChatGPT-4 agent to provide a detailed summary of each paper for a quick understanding of its logic.
Join us in exploring the cutting edge of AI security and contribute to a safer future for large models. Dive into our extensive and ever-growing collection of research papers today and stay ahead in the rapidly evolving field of AI model security!
Title | Date | Published | Tag |
---|---|---|---|
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models | 2024.2.12 | arXiv | Attack |
Title | Date | Published | Tag |
---|---|---|---|
Poisoning Retrieval Corpora by Injecting Adversarial Passages | 2023.10.29 | arXiv | Attack |
Corpus Poisoning via Approximate Greedy Gradient Descent | 2024.6.7 | arXiv | Attack |
Title | Date | Published | Tag |
---|---|---|---|
Don’t Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models | 2024.3.26 | USENIX Security 2024 | Experiment Attack |
Many-shot Jailbreaking | 2024.4.2 | USENIX Security 2024 | Method Attack |
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models | 2024.5.20 | ICML 2024 | Method Attack |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | 2024.5.14 | ICML 2024 | Method Defense |
GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis | 2024.5.29 | ACL 2024 | Method Defense |
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding | 2024.8.22 | ACL 2024 | Method Defense |
Multilingual Jailbreak Challenges in Large Language Models | 2024.3.4 | ICLR 2024 | Method Defense |
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting | 2024.5.14 | ECCV 2024 | Method Defense |
Title | Date | Published | Tag |
---|---|---|---|
TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models | 2023.12.7 | NeurIPS 2023 | Method Attack |
Title | Date | Published | Tag |
---|---|---|---|
GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis | 2023.2.9 | arXiv | Experiment Attack |
Title | Date | Published | Tag |
---|---|---|---|
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies | 2024.7.28 | arXiv | Survey Attack Defense |
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | 2024.6.5 | arXiv | Attack |
INJECAGENT: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | 2024.3.24 | arXiv | Attack Prompt-injection |
LLM Agents can Autonomously Exploit One-day Vulnerabilities | 2024.4.11 | arXiv | Application Attack |
Title | Date | Published | Tag |
---|---|---|---|
Studying Large Language Model Behaviors Under Realistic Knowledge Conflicts | 2024.4.24 | arXiv | Experiment RAG |
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts | 2024.7.28 | arXiv | Experiment RAG |
Your contributions are always welcome!
If you have any questions about this paper list, or would like to discuss academic research in the field of LLM security, please get in touch at [email protected].