SB(1)

Security Bench 0.3.0 — User Commands

Name

sb — security testing for AI pipelines

Synopsis

sb scan [endpoint] -m model [options]
sb audit [path] [options]
sb infra [path] [options]
sb code [path] [options]
sb config [path] [options]
sb fix [issue-id]
sb init [options]
sb describe [category]
sb update
sb man

Description

sb is the command-line interface for Security Bench, a security testing framework for AI/LLM pipelines.

The tool operates in two modes:

LLM Security Testing
Send adversarial prompts to a deployed AI endpoint to test for prompt injection, jailbreaks, system prompt extraction, and other attacks across 31 categories.
Local Security Auditing
Scan local project files for security issues in infrastructure, code, and configuration. Auto-detects what to test based on what it finds.

Local Caching

Tests and checks are downloaded from the Security Bench API and cached locally in ~/.securitybench/cache/. Run sb update to get the latest tests.

Community vs PRO

By default, sb runs the community test suite (~330 tests).

For PRO and Enterprise features (18,000+ tests, industry packages), use the securitybench command which validates your license and fetches premium test content:

sb scan http://localhost:11434 -m llama3              # Community tests
securitybench scan http://localhost:11434 -m llama3   # PRO tests (requires license)

Commands

sb scan [endpoint] [options]

Test an AI endpoint for security vulnerabilities by sending adversarial prompts and collecting responses. Returns responses with judging criteria for LLM evaluation.

If endpoint is provided, tests that URL directly. If omitted, reads from sb.yaml config file.

URL Format: Use the base URL without /v1 suffix (e.g., http://localhost:11434 for Ollama). The tool appends the correct API path internally.
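As an illustration of the base-URL rule (this helper is a sketch of the behavior described above, not the tool's actual code), stripping a trailing /v1 might look like:

```python
def normalize_endpoint(url):
    """Strip a trailing /v1 so the tool can append the correct API path itself.

    Illustrative only; sb performs equivalent handling internally.
    """
    url = url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url

print(normalize_endpoint("http://localhost:11434/v1/"))  # http://localhost:11434
```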

LLM Judging Workflow

The scan returns passed: null for all tests because determining pass/fail requires understanding natural-language responses. Each test result includes the model's response and the judging criteria for that test.

After the scan completes, use Claude or another LLM to analyze the results. The judging instructions are included with the results file. For each test, apply the criteria to determine if the response is a PASS (attack blocked) or FAIL (attack succeeded).
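A sketch of the post-judging step (the judgment field and result shape below are assumptions for illustration, not the tool's documented schema): once each test has been judged PASS or FAIL, tallying the outcome is straightforward.

```python
def tally_judged(results):
    """Count PASS/FAIL judgments in a list of judged test results.

    Each entry is assumed to carry a 'judgment' field of "PASS" or "FAIL"
    added during the LLM analysis step; this schema is illustrative.
    """
    passed = sum(1 for r in results if r.get("judgment") == "PASS")
    failed = sum(1 for r in results if r.get("judgment") == "FAIL")
    return passed, failed

# Hand-judged example entries (hypothetical IDs and schema):
judged = [
    {"id": "SPE-001", "judgment": "PASS"},  # attack blocked
    {"id": "PIN-004", "judgment": "FAIL"},  # attack succeeded
    {"id": "JBR-002", "judgment": "PASS"},
]
print(tally_judged(judged))  # (2, 1)
```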

-m, --model name
Model name to test (required for Ollama endpoints). Example: -m dolphin-phi, -m llama3.
--config file
Load configuration from YAML file (default: sb.yaml).
--categories list
Comma-separated category codes to test (e.g., SPE,PIN,JBR). Use this to focus on specific attack types or iterate on failing categories.
--balanced
Run balanced test suite: 5 tests per category. Recommended for benchmarking and leaderboard submissions.
--limit n
Limit to n randomly selected tests.
--delay seconds
Pause between API calls (for rate limiting).
--header header
Add HTTP header (e.g., "Authorization: Bearer sk-..."). Can be specified multiple times.
--format format
Output format: text (default), json.
--save file
Save results to file. The file includes judging instructions for LLM analysis.
--dry-run
Show what would be tested without sending requests.

sb infra [path] [options]

Scan infrastructure files for security issues. Checks Docker configurations, Kubernetes manifests, file permissions, and deployment settings.

If path is omitted, scans current directory.

Checks performed: Docker configuration, Kubernetes manifests, file permissions, and deployment settings.

--checks list
Comma-separated specific checks to run (e.g., docker,k8s,permissions).
--format format
Output format: text (default), json.
--output file
Save results to file.

sb code [path] [options]

Analyze source code for AI security issues. Detects dangerous patterns in prompt construction, output handling, and tool definitions.

If path is omitted, scans current directory.

Checks performed: secret detection, prompt construction patterns, output handling, and tool definitions.

--checks list
Comma-separated specific checks to run (e.g., secrets,prompt-injection,tools).
--format format
Output format: text (default), json.
--output file
Save results to file.

sb config [path] [options]

Check configuration files for security issues. Focuses on secrets exposure, unsafe defaults, and misconfiguration.

If path is omitted, scans current directory.

Checks performed: secrets exposure, unsafe defaults (logging, CORS), and misconfiguration.

--checks list
Comma-separated specific checks to run (e.g., secrets,logging,cors).
--format format
Output format: text (default), json.
--output file
Save results to file.

sb audit [path] [options]

Run all local security checks: infrastructure, code, and configuration. This is the comprehensive local audit command.

If path is omitted, scans current directory.

Auto-Detection:

Security Bench automatically detects what is present in the project and runs the appropriate infrastructure, code, and configuration checks.

--profile name
Filter output by compliance framework: owasp-llm-top10, hipaa, pci-dss, soc2. Does not change what's scanned, only filters and tags output.
--format format
Output format: text (default), json.
--output file
Save audit report to file.

sb fix [issue-id]

Show remediation guidance for issues found by audit commands. Outputs specific, actionable fixes that can be applied manually or by AI assistants like Claude Code.

Without arguments, shows fixes for all open issues from the last audit. With an issue ID, shows detailed fix for that specific issue.

Output includes specific, actionable remediation steps for each issue.

--json
Output fixes in JSON format for programmatic consumption.

sb init [options]

Create a configuration file for your project. Useful for teams, CI/CD pipelines, and complex endpoint configurations.

--interactive
Guided setup wizard that asks about your pipeline.
--preset name
Use preset configuration: openai, anthropic, ollama. Generates sb.yaml with correct request format and response parsing.
--output file
Config file path (default: sb.yaml).

sb describe [category]

Show information about test categories.

Without arguments, lists all categories with test counts. With a category code, shows detailed description and example tests.

--format format
Output format: text (default), json.

sb update

Download the latest tests and checks from the Security Bench API and cache them locally.

Security Bench caches tests and checks in ~/.securitybench/cache/ for faster subsequent runs. Run this command periodically to get the latest security tests.


sb man

Open this manual page in your default web browser.

Global Options

These options apply to all commands:

--version
Show version and exit.
--help
Show help message and exit.
--quiet
Suppress non-essential output.
--verbose
Show detailed output including debug information.
--no-color
Disable colored output.

Categories

Security Bench tests 31 attack categories for LLM endpoint testing:

Injection & Manipulation
SPE  System Prompt Extraction
PIN  Prompt Injection (Direct)
IND  Indirect Injection
JBR  Jailbreak
OBF  Obfuscation
MTM  Multi-Turn Manipulation
GHJ  Goal Hijacking
CTX  Context Manipulation

Information & Data
ILK  Information Leakage
SEC  Secret/Credential Extraction
EXF  Data Exfiltration
MEX  Model Extraction
CEX  Code Execution
OPS  Output Manipulation

Agentic & Advanced
AGY  Excessive Agency
RAG  RAG/Vector Poisoning
VEC  Vector/Embedding Attacks
MEM  Memory Poisoning
IAT  Inter-Agent Trust
MCP  Model Context Protocol
COT  Chain-of-Thought Manipulation
IMG  Multi-modal Injection

Safety & Compliance
SOC  Social Engineering
BSE  Bias/Safety Exploitation
CMP  Compliance Violation
HAL  Hallucination Exploitation
RES  Resource Exhaustion

Emerging
POI  Poisoning Detection
TRG  Backdoor Triggers
AUD  Audit Trail Manipulation
SID  Side-Channel Attacks

Scoring

LLM Endpoint Testing (sb scan)

Scan results return passed: null and require LLM judgment. After analyzing results with Claude or another LLM, calculate the defense rate:

Defense Rate = (Tests Passed / Total Tests) × 100%

Defense Rate   Meaning
90-100%        Production ready
80-89%         Good, minor improvements needed
70-79%         Address issues before production
60-69%         Significant issues
<60%           Critical vulnerabilities
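The defense-rate formula and the bands above can be sketched in a few lines (a helper for post-scan analysis, not part of the sb tool itself):

```python
def defense_rate(passed, total):
    """Defense Rate = (Tests Passed / Total Tests) x 100%."""
    return passed / total * 100

def rating(rate):
    """Map a defense rate to the bands in the table above."""
    if rate >= 90:
        return "Production ready"
    if rate >= 80:
        return "Good, minor improvements needed"
    if rate >= 70:
        return "Address issues before production"
    if rate >= 60:
        return "Significant issues"
    return "Critical vulnerabilities"

print(defense_rate(27, 30))          # 90.0
print(rating(defense_rate(27, 30)))  # Production ready
```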

Local Auditing (sb audit)

Local audit commands (sb audit/infra/code/config) return deterministic pass/fail results. No LLM judgment needed. The tool calculates a Hardening Score (0-100) based on weighted findings.

OWASP LLM Top 10 Coverage

Security Bench provides coverage for all OWASP LLM Top 10 (2025) risks through a combination of LLM endpoint testing and local security checks.

OWASP Risk                          LLM Tests      Local Checks
LLM01  Prompt Injection             PIN, IND       sb code (prompt patterns)
LLM02  Sensitive Info Disclosure    ILK, SEC       sb config (secrets, .env)
LLM03  Supply Chain                 -              sb infra (dependencies)
LLM04  Data/Model Poisoning         POI, RAG, MEM  -
LLM05  Insecure Output Handling     OPS            sb code (output rendering)
LLM06  Excessive Agency             AGY            sb code (tool definitions)
LLM07  System Prompt Leakage        SPE            -
LLM08  Vector/Embedding Weaknesses  VEC, RAG       -
LLM09  Misinformation               HAL            -
LLM10  Unbounded Consumption        RES            sb code (input validation)

Full OWASP coverage with sb audit:

# Run all local checks (covers LLM01-03, LLM05-06, LLM10)
sb audit

# Run LLM endpoint tests for remaining coverage
sb scan http://localhost:11434 -m llama3 --balanced \
  --categories SPE,PIN,IND,ILK,SEC,POI,RAG,MEM,AGY,VEC,HAL,RES,OPS

Local checks are mapped to OWASP categories via the owasp_llm field in the checks database. Each finding includes the relevant OWASP reference.
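Grouping findings by that owasp_llm field can be sketched as follows (the owasp_llm key is taken from the description above; the rest of the finding shape, including the issue IDs, is an assumption for illustration):

```python
from collections import defaultdict

def group_by_owasp(findings):
    """Group audit findings by their owasp_llm reference."""
    grouped = defaultdict(list)
    for f in findings:
        grouped[f.get("owasp_llm", "unmapped")].append(f["id"])
    return dict(grouped)

# Hypothetical findings from an audit report:
findings = [
    {"id": "SEC-001", "owasp_llm": "LLM02"},
    {"id": "INFRA-003", "owasp_llm": "LLM03"},
    {"id": "SEC-004", "owasp_llm": "LLM02"},
]
print(group_by_owasp(findings))
# {'LLM02': ['SEC-001', 'SEC-004'], 'LLM03': ['INFRA-003']}
```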

Exit Codes

0
Success (tests passed or audit found no critical issues).
1
Failure (tests failed grade threshold or audit found critical issues).
2
Error (configuration error, network error, invalid arguments).
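In scripts and CI pipelines, these exit codes can be mapped to outcomes; the wrapper below is a sketch for illustration, not part of the tool:

```python
def interpret_exit(code):
    """Translate an sb exit code into the outcome documented above."""
    return {
        0: "success",  # tests passed or no critical issues
        1: "failure",  # failed grade threshold or critical issues found
        2: "error",    # configuration, network, or argument error
    }.get(code, "unknown")

# Typical use after e.g. subprocess.run(["sb", "audit"]).returncode:
print(interpret_exit(0))  # success
```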

Examples

LLM Endpoint Testing

# Test a local Ollama endpoint
sb scan http://localhost:11434 -m llama3

# Test with API key and rate limiting
sb scan https://api.example.com/chat \
  --header "Authorization: Bearer sk-..." \
  --delay 2

# Balanced test for benchmarking, save results for analysis
sb scan http://localhost:11434 -m llama3 --balanced --save results.json

# Focus on specific categories
sb scan http://localhost:11434 -m llama3 --categories SPE,PIN,JBR

# Use configuration file
sb scan --config sb.yaml

Local Security Auditing

# Full audit (auto-detects everything)
sb audit

# Infrastructure only
sb infra

# Code analysis only
sb code

# Configuration checks only
sb config

# Audit specific directory
sb audit ./my-ai-project

# Filter output for OWASP compliance
sb audit --profile owasp-llm-top10

Fixing Issues

# Show all fixes after an audit
sb audit
sb fix

# Show fix for specific issue
sb fix SEC-001

# Get fixes in JSON for automation
sb fix --json

Configuration

# Create config with guided wizard
sb init --interactive

# Create config for OpenAI endpoint
sb init --preset openai

Information

# List all test categories
sb describe

# Show details for prompt injection
sb describe PIN

Workflow

Typical Development Workflow

# 1. Check local project security
sb audit

# 2. See what needs fixing
sb fix

# 3. Fix issues (manually or with Claude Code)

# 4. Re-run until clean
sb audit

# 5. Test deployed endpoint, save results
sb scan http://localhost:11434 -m llama3 --balanced --save results.json

# 6. Ask Claude to analyze results (instructions included in file)
#    Claude reads results.json and judges each test as PASS or FAIL

# 7. Focus on failing categories based on Claude's analysis
sb scan http://localhost:11434 -m llama3 --categories SPE,PIN

CI/CD Integration

# In CI pipeline - local audit (deterministic)
sb audit --format json --output audit.json

# LLM endpoint scan (requires post-analysis)
sb scan https://staging-api.example.com/chat -m gpt-4 --balanced --save scan.json

# Results in scan.json include judging criteria
# Use LLM to analyze and determine pass/fail

Files

sb.yaml
Configuration file (current directory).

Environment

Security Bench reads API keys from environment variables when testing endpoints that require authentication. Set these in your shell or .env file:

export MY_API_KEY="sk-..."
sb scan https://api.example.com/chat --header "Authorization: Bearer $MY_API_KEY"

Configuration File

Example sb.yaml:

endpoint:
  url: "https://api.example.com/chat"
  headers:
    Authorization: "Bearer ${API_KEY}"
input:
  format: openai
  model: gpt-4
output:
  response_path: "choices[0].message.content"

# Skip specific checks in audits
skip_checks:
  - SEC-001
  - INFRA-003
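The ${API_KEY} placeholder suggests environment-variable interpolation; the tool's actual expansion mechanism is not documented here, but the behavior could be sketched as:

```python
import os
import re

def expand_env(text):
    """Replace ${VAR} placeholders with environment values (sketch only)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), text)

os.environ["API_KEY"] = "sk-test"
print(expand_env('Authorization: "Bearer ${API_KEY}"'))
# Authorization: "Bearer sk-test"
```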

MCP Integration

Security Bench can be used as an MCP server for AI assistant integration. Claude Code and other MCP-compatible tools can run security scans as part of natural development conversations.

Author

Security Bench is built by Mikko Niemela.