llm-security

This repository contains the code to test various security issues with LLM prompts. It has detectors to detect these issues. It shows how to invoke these detectors independantly and via LangChain.

Following detectors are available:

Google Cloud Armor product for content safety and prompt injection detection
Prompt injection using '<aprotectai/deberta-v3-base-prompt-injection-v2'
Various other issues like hate speech, profanity, harassement, firearms and weapons, public safety etc.

The detectors and LLMs are invoked via Langchain. LangSmith is used for visualization and troubleshooting.

Features

Google Cloud Model Armor for a comprehensive prompt attacks. (Feature in Private preview, not yet GA)
Huggingface DeBerta model for prompt injection detection.
LangChain and LangSmith integration to show the flow of the verifications

Models

Huggingface
Google Gemini
OpenAI GPT
Anthropic Claude

Setup

Requirements

Need a Google Cloud project that is enabled for Model Armor
Store API keys as a secrets in Google Cloud Secrets Manager

Service account

Create a service account in this Google Cloud project
Assign following roles at project level
- 'Secret Manager Secret Accessor' (this role can be assigned at secret level as well for tigher security control)
- 'Model Armor Admin' (Your project needs to be allow listed to use this preview feature, for now)

Secrets

Create following secrets in the Google Cloud project:

anthropic_api_key
google_api_key
huggingface_api_key
langsmith_api_key
openai_api_key

Input

Provide a local.env file containing following key-value pairs
- SA_KEY_PATH=<path-to-key-file>
- SA_KEY_NAME=sa-key.json
- GOOGLE_CLOUD_PROJECT=<project-id>
- MA_TEMPLATE_NAME=<template-name>
- REGION=us-central1
- PROMPT_CSV=<path>/prompts_and_labels.csv
- LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
- LANGCHAIN_TRACING_V2=true
- LANGCHAIN_PROJECT=llm-security
- TOKENIZERS_PARALLELISM="false"

Output

Provides a command line output for various prompt detection
Generates a CSV file with summary.

Critical libraries

Python==3.11.5
transformers==4.40.0
tensorflow==2.15.0
keras==2.15.0
torch==2.4.1
langsmith==0.1.117
langchain==0.2.16

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
detector		detector
langchain		langchain
modelarmor		modelarmor
models		models
security		security
util		util
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
__init__.py		__init__.py
cli_client.py		cli_client.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

llm-security

Features

Models

Setup

Requirements

Service account

Secrets

Input

Output

Critical libraries

About

Releases

Packages

Languages

prashantkul/llm-security

Folders and files

Latest commit

History

Repository files navigation

llm-security

Features

Models

Setup

Requirements

Service account

Secrets

Input

Output

Critical libraries

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages