Add artemiskit-cli skill by code-sensei · Pull Request #668 · vercel-labs/skills

code-sensei · 2026-03-16T22:51:25Z

Summary

Adds the artemiskit-cli skill for LLM evaluation and security testing.

ArtemisKit is an open-source CLI toolkit that helps developers:

- Test LLM outputs with scenario-based evaluation (YAML-driven quality testing)
- Secure LLMs via red teaming (prompt injection, jailbreaks, data extraction, PII disclosure)
- Stress test LLM endpoints (latency p50/p95/p99, throughput, token usage, cost estimation)
- Compare evaluation runs for regression detection
- Generate interactive HTML reports and JSON manifests
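For readers unfamiliar with the p50/p95/p99 latency figures mentioned above, here is a minimal sketch of how such percentiles can be derived from raw request timings. It uses the nearest-rank method, which is an assumption: the PR does not say how ArtemisKit computes them. The function names and sample data are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Summarize request latencies (milliseconds) as p50/p95/p99."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}


# Hypothetical timings; one slow outlier dominates the tail percentiles.
latencies_ms = [120, 95, 310, 101, 99, 150, 2200, 130, 110, 105]
summary = latency_summary(latencies_ms)
print(summary)  # {'p50': 110, 'p95': 2200, 'p99': 2200}
```

With only ten samples, p95 and p99 both land on the single slowest request, which is why tail percentiles are usually read from larger runs.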

Commands

Command	Purpose
`akit run`	Execute scenario-based evaluations
`akit redteam`	Security red team testing
`akit stress`	Load and stress testing
`akit report`	Generate/regenerate reports
`akit history`	View run history
`akit compare`	Compare two evaluation runs
`akit baseline`	Manage baselines for regression testing
`akit validate`	Validate scenario files
`akit init`	Initialize configuration

Provider Support

- OpenAI (GPT-4, GPT-4o, etc.)
- Anthropic (Claude 3.5, Claude 4, etc.)
- Azure OpenAI
- Vercel AI SDK
- OpenAI-compatible APIs (Ollama, vLLM, LM Studio)

Skill Structure

skills/artemiskit-cli/
├── SKILL.md              # Main skill file
└── references/
    ├── commands.md       # CLI command reference
    ├── providers.md      # Provider configuration
    └── scenarios.md      # Scenario format documentation
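The layout above can be checked mechanically before review. The following is a sketch only: the file list is taken from the tree in this PR, while the helper name `missing_skill_files` and the temporary demo directory are hypothetical and not part of the skill or of any `skills` tooling.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the "Skill Structure" tree in this PR.
EXPECTED_FILES = [
    "SKILL.md",
    "references/commands.md",
    "references/providers.md",
    "references/scenarios.md",
]


def missing_skill_files(skill_dir):
    """Return the expected files that are absent under skill_dir."""
    root = Path(skill_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Demo against a throwaway directory that is deliberately incomplete.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "skills" / "artemiskit-cli"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("---\nname: artemiskit-cli\n---\n")
    (skill / "references" / "commands.md").write_text("# commands\n")
    demo_missing = missing_skill_files(skill)

print(demo_missing)  # ['references/providers.md', 'references/scenarios.md']
```

Pointing `missing_skill_files` at a real checkout of `skills/artemiskit-cli/` should return an empty list if the directory matches the tree shown here.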

Links

Repository: https://github.com/code-sensei/artemiskit
npm package: @artemiskit/cli
Documentation: https://artemiskit.vercel.app

Test Plan

- Skill follows `SKILL.md` format with YAML frontmatter
- References use progressive disclosure pattern
- All commands verified against CLI implementation (`--help` output)
- Installed and tested via `npx skills install code-sensei/artemiskit-cli-skill`
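The first item in the test plan can be spot-checked with a few lines of code. This sketch assumes that the frontmatter is a block delimited by `---` lines at the top of `SKILL.md`, and that `name` and `description` are the fields worth checking; the PR states neither, so treat both as assumptions. The parser handles only flat `key: value` lines and is no substitute for a real YAML library. The sample text is hypothetical, not the actual file from this PR.

```python
REQUIRED_FIELDS = ("name", "description")  # assumed, not confirmed by the PR


def parse_frontmatter(text):
    """Return top-level key/value pairs from a leading '---' block,
    or None if the block is missing or never closed."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Only flat "key: value" lines; skip nested or commented lines.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return None  # reached end of file without a closing delimiter


def missing_fields(fields):
    """List required fields that are absent or empty."""
    return [key for key in REQUIRED_FIELDS if not fields.get(key)]


# Hypothetical SKILL.md content for illustration.
sample = (
    "---\n"
    "name: artemiskit-cli\n"
    "description: LLM evaluation and security testing with the akit CLI.\n"
    "---\n"
    "\n"
    "# ArtemisKit CLI\n"
)
parsed = parse_frontmatter(sample)
print(parsed["name"], missing_fields(parsed))  # artemiskit-cli []
```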

ArtemisKit is an open-source LLM evaluation toolkit that provides:

- Quality testing with scenario-based evaluation (YAML-driven)
- Security red teaming for prompt injection, jailbreaks, data extraction
- Stress testing with latency metrics (p50/p95/p99), throughput, costs
- Multi-provider support (OpenAI, Anthropic, Azure, Vercel AI SDK)

Commands: run, redteam, stress, report, history, compare, baseline, validate, init

Repository: https://github.com/code-sensei/artemiskit
npm: @artemiskit/cli

code-sensei wants to merge 1 commit into vercel-labs:main from code-sensei:add-artemiskit-cli-skill