# Awesome LLM Security Paper


Academic Alpaca

Languages: English | 中文

🎉🎉🎉 Welcome to the LLM Security Research Repository, your premier destination for the latest and most comprehensive research on LLM security!

As the field of artificial intelligence rapidly evolves, so do the security challenges associated with large models. Our repository is at the forefront of this emerging discipline, providing a comprehensive and constantly updated collection of research papers. It stands out for several key reasons:

- 🔥 Cutting-Edge Security Research: Focused on the newest and most innovative security studies targeting large language models, ensuring you stay ahead in this critical area.
- ⏰️ Real-Time Updates: The repository is updated in real time, offering the most current and relevant research findings as they become available.
- 📚️ Comprehensive Coverage: We aim to cover all aspects of large model security, from theoretical foundations to practical applications, and our collection will continually expand to include every facet of this essential field.
- 🇨🇳 High-Quality Translated Content: Access high-quality articles in Chinese, catering to a global audience and promoting cross-language collaboration.
- 🌟 Large Model Summaries: Domain experts collaborate with a ChatGPT-4-based agent to provide a detailed summary of each paper for a quick understanding of its logic.

Join us in exploring the cutting edge of AI security and contribute to a safer future for large models. Dive into our extensive and ever-growing collection of research papers today and stay ahead in the rapidly evolving field of AI model security!

## Prompt injection

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models | 2024.2.12 | arXiv | Attack |

## Database poisoning

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| Poisoning Retrieval Corpora by Injecting Adversarial Passages | 2023.10.29 | arXiv | Attack |
| Corpus Poisoning via Approximate Greedy Gradient Descent | 2024.6.7 | arXiv | Attack |

## Jailbreaking

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| Don’t Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models | 2023.3.26 | USENIX Security 2024 | Experiment, Attack |
| Many-shot Jailbreaking | 2024.4.2 | USENIX Security 2024 | Method, Attack |
| AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models | 2024.5.20 | ICML 2024 | Method, Attack |
| PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | 2024.5.14 | ICML 2024 | Method, Defense |
| GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis | 2024.5.29 | ACL 2024 | Method, Defense |
| SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding | 2024.8.22 | ACL 2024 | Method, Defense |
| Multilingual Jailbreak Challenges in Large Language Models | 2024.3.4 | ICML 2024 | Method, Defense |
| AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting | 2024.5.14 | ECCV 2024 | Method, Defense |

## Backdoor Attack

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models | 2023.12.7 | NeurIPS 2023 | Method, Attack |

## Data extraction & privacy

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis | 2023.2.9 | arXiv | Experiment, Attack |

## Agent

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies | 2024.7.28 | arXiv | Survey, Attack, Defense |
| BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | 2024.6.5 | arXiv | Attack |
| INJECAGENT: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | 2024.3.24 | arXiv | Attack, Prompt-injection |
| LLM Agents can Autonomously Exploit One-day Vulnerabilities | 2024.4.11 | arXiv | Application, Attack |

## Model features

| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| Studying Large Language Model Behaviors Under Realistic Knowledge Conflicts | 2024.4.24 | arXiv | Experiment, RAG |
| Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts | 2024.7.28 | arXiv | Experiment, RAG |

## Contributing

Your contributions are always welcome!

If you have any questions about this paper list, or are interested in academic research in the field of LLM security, please get in touch at [email protected].
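
To add a paper, please append a row to the relevant table above (e.g., via a pull request). A minimal sketch of the expected entry format, where the title, date, venue, and tags below are placeholders:

```markdown
| Title | Date | Published | Tag |
| --- | --- | --- | --- |
| Example Paper Title | 2024.1.1 | arXiv | Attack |
```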
