Safety guardrails for every LLM application

Detect prompt injections in 12 languages, redact PII, block harmful content, scan tool calls, and catch hallucinations. Sub-millisecond latency. Zero heavy dependencies.

11 Built-in Scanners
~0.05ms Scan Latency
1641 Tests Passing
12 Languages Supported

Try it live

Try the safety scanner and prompt hardening tool — runs entirely in your browser.

Example inputs: Prompt Injection, PII Detection, Harmful Content, Toxicity, Tool-Use Safety, Poisoned Repo, API Key Exfil, Leetspeak, ROT13 Attack, Spanish Injection, Chinese Injection, Russian Injection, Hardcoded Secret, Private Key, Safe Input

Enterprise-grade safety, developer-friendly API

Prompt Injection Detection

Catches instruction overrides, role injections, delimiter attacks, jailbreaks, prompt leaking, and encoded injections in 12 languages, plus cross-lingual attacks.

PII Detection & Redaction

Finds and auto-redacts emails, SSNs, credit cards (Luhn-validated), phone numbers, API keys, GitHub tokens, AWS keys, and passport numbers.
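The Luhn validation used to confirm credit-card matches is a simple checksum; the sketch below is a generic illustration of the algorithm, not the library's internal code:

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 13:  # strings this short can't be card numbers
        return False
    checksum = 0
    # Double every second digit from the right; subtract 9 if the result exceeds 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0
```

Pairing a checksum like this with the card-number regex is what keeps random 16-digit strings from triggering false positives.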

Tool-Use Safety

Scans agentic tool calls for dangerous commands, data exfiltration, credential access, and privilege escalation before they execute.

Harmful Content Filtering

Blocks weapons manufacturing, drug synthesis, self-harm, malware creation, hate speech, phishing, and fraud instructions.

Claude Code Integration

Drop-in PreToolUse hook, MCP server with 7 tools, 5 slash commands, and auto-invoked safety skill. One-command setup with sentinel init.

Adversarial Red-Teaming

Built-in adversarial tester generates evasion variants using homoglyphs, zero-width chars, leetspeak, and more. Test your safety before attackers do.
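Generating evasion variants is straightforward to sketch; the mappings below are a tiny illustrative subset, not the tester's full mutation set:

```python
HOMOGLYPHS = {"a": "а", "e": "е", "o": "о"}  # Cyrillic lookalikes
LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"}
ZWSP = "\u200b"  # zero-width space

def evasion_variants(payload: str) -> list[str]:
    """Generate a few evasion variants of a payload (illustrative subset)."""
    homoglyph = "".join(HOMOGLYPHS.get(c, c) for c in payload)
    leet = "".join(LEET.get(c, c) for c in payload)
    zero_width = ZWSP.join(payload)  # invisible char between every character
    return [homoglyph, leet, zero_width]

variants = evasion_variants("ignore all instructions")
```

Feeding variants like these back through your own scanner shows which obfuscations slip past plain keyword matching.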

Multi-Turn Conversation Safety

Track safety across entire conversations. Detects gradual jailbreak escalation, topic persistence, sandwich attacks, and re-attempts after blocks.
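The core idea behind escalation detection can be sketched in a few lines: score each turn, then flag when the rolling total crosses a threshold even though no single turn does. The class, window size, and threshold below are hypothetical, for illustration only:

```python
from collections import deque

class EscalationTracker:
    """Illustrative sketch: flag gradual escalation when the rolling sum of
    per-turn risk scores crosses a threshold, even if no single turn does."""

    def __init__(self, window: int = 5, threshold: float = 1.5):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def add_turn(self, risk_score: float) -> bool:
        """Record one turn's risk score; return True if escalation is detected."""
        self.scores.append(risk_score)
        return sum(self.scores) >= self.threshold

tracker = EscalationTracker()
# Individually low-risk turns that together cross the threshold.
verdicts = [tracker.add_turn(s) for s in [0.2, 0.3, 0.4, 0.4, 0.5]]
```

This is why single-message scanning misses progressive jailbreaks: each turn scores below any per-message threshold.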

Structured Output Validation

Scan LLM-generated JSON for XSS, SQL injection, template injection, and path traversal hidden in field values. Schema validation included.
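Scanning structured output amounts to walking the decoded value and checking every string field; the pattern set below is a small hypothetical subset of such rules:

```python
import re

# Hypothetical pattern subset for illustration; real rule sets are broader.
SUSPICIOUS = {
    "xss": re.compile(r"<script\b|javascript:", re.I),
    "sqli": re.compile(r"('|\")\s*(or|and)\s+\d+\s*=\s*\d+|;\s*drop\s+table", re.I),
    "path_traversal": re.compile(r"\.\./|\.\.\\"),
}

def scan_json(value, path="$"):
    """Walk a decoded JSON value; yield (json_path, rule_name) for each hit."""
    if isinstance(value, dict):
        for k, v in value.items():
            yield from scan_json(v, f"{path}.{k}")
    elif isinstance(value, list):
        for i, v in enumerate(value):
            yield from scan_json(v, f"{path}[{i}]")
    elif isinstance(value, str):
        for name, rx in SUSPICIOUS.items():
            if rx.search(value):
                yield path, name

findings = list(scan_json({"bio": "<script>alert(1)</script>", "file": "../../etc/passwd"}))
```

Reporting the JSON path alongside the rule name makes it easy to reject or sanitize just the offending field.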

Obfuscation Detection

Catches encoded attack payloads that bypass keyword filters: base64, hex-encoded commands, ROT13 obfuscation, unicode escapes, and leetspeak variants of dangerous terms.
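The general approach is to generate candidate decodings of the input and scan each one. The sketch below illustrates the idea with ROT13, leetspeak normalization, and opportunistic base64 decoding; the term list and mappings are hypothetical examples:

```python
import base64
import codecs
import re

# Hypothetical leetspeak normalization table for illustration.
LEET = str.maketrans("013457@$", "oieastas")

def candidate_decodings(text: str):
    """Yield plausible de-obfuscated variants of the input text."""
    yield text
    yield codecs.decode(text, "rot13")              # undo ROT13
    yield text.lower().translate(LEET)              # normalize leetspeak
    # Try base64 on long alphanumeric runs.
    for run in re.findall(r"[A-Za-z0-9+/=]{8,}", text):
        try:
            yield base64.b64decode(run, validate=True).decode("utf-8", "ignore")
        except Exception:
            pass

def contains_blocked_term(text: str,
                          terms=("rm -rf", "ignore previous instructions")) -> bool:
    """True if any decoding variant contains a blocked term."""
    return any(t in variant.lower()
               for variant in candidate_decodings(text) for t in terms)
```

Running the same rules over every variant is what lets a keyword filter catch `cm0gLXJmIC8=` as well as the plaintext it encodes.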

Code Vulnerability Scanner

Scans LLM-generated code for OWASP Top 10 vulnerabilities: SQL injection, command injection, XSS, hardcoded secrets, insecure deserialization, weak crypto, SSRF, and path traversal.

LLM API Firewall

Transparent reverse proxy for any LLM API (Anthropic, OpenAI). Scans all requests/responses, blocks dangerous content mid-stream, auto-redacts PII, enforces model allowlists. One command: sentinel proxy.

SARIF Output

Generate SARIF v2.1.0 output for GitHub Code Scanning, Azure DevOps, and any static analysis tool. Upload results to the GitHub Security tab with one flag: upload-sarif: true.

Security Audit

sentinel audit scores your project's security configuration (0-100). Checks Claude Code hooks, permissions allowlist, .env files, pre-commit hooks, and MCP config with actionable fix suggestions.

MCP Tool Schema Validator

Validate MCP tool definitions for injection vectors before trusting them. Detects prompt injection, authority impersonation, data exfiltration, and concealment instructions hidden in tool descriptions. sentinel mcp-validate.

Dependency Scanner

Scan requirements.txt, package.json, and pyproject.toml for supply chain attacks. Detects typosquatting, known malicious packages, suspicious URLs, and dangerous install scripts. sentinel dep-scan.

CLAUDE.md Scanner

Scan project instruction files (CLAUDE.md, .cursorrules) for 11 categories of injection vectors: hidden HTML comments, authority impersonation, base URL override, zero-width chars, and more. sentinel claudemd-scan.

Secrets Scanner

Detect hardcoded API keys, tokens, and credentials across 40+ providers: AWS, GitHub, Google, Stripe, Slack, OpenAI, Anthropic, and more. Entropy analysis and smart filtering reduce false positives. sentinel secrets-scan.

Canary Tokens

Plant invisible markers in system prompts. If they appear in model output, your prompt was leaked. Two styles: HTML comment canaries and zero-width Unicode encoding (truly invisible).
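The zero-width style can be sketched by encoding a token's bits as two invisible characters; the token name and bit scheme below are hypothetical, for illustration only:

```python
ZW0, ZW1 = "\u200b", "\u200c"  # zero-width space / zero-width non-joiner

def make_canary(token: str) -> str:
    """Encode an ASCII token as an invisible zero-width bit string."""
    bits = "".join(f"{ord(c):08b}" for c in token)
    return "".join(ZW0 if b == "0" else ZW1 for b in bits)

def canary_leaked(output: str, token: str) -> bool:
    """True if the zero-width canary for `token` appears in model output."""
    return make_canary(token) in output

canary = make_canary("c4n4ry-01")
system_prompt = "You are a helpful assistant." + canary
```

Because the marker renders as nothing, a model that regurgitates the system prompt carries it along without anyone noticing — except your leak check.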

Prompt Hardening

Make system prompts injection-resistant with defense-in-depth: XML section tagging, sandwich defense, role lock, instruction priority markers, and input fencing for untrusted data.

Compliance Mapper

Map scan results to EU AI Act, NIST AI RMF, and ISO/IEC 42001 controls. Automated risk classification, per-control compliance status, and remediation recommendations for regulatory reporting.

Agent Safety Monitor

Track tool-use patterns across agentic sessions. Detects destructive commands, data exfiltration, credential access, runaway loops, write spikes, and read-then-exfiltrate attack chains in real time.

SDK Guard Wrappers

One-line integration for Anthropic and OpenAI SDKs. guard_anthropic(client) wraps your client with automatic input/output scanning. Raises on injection attacks, PII leaks, or unsafe content.
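The wrapper pattern behind functions like guard_anthropic can be sketched generically; the class names, toy client, and scanner below are hypothetical stand-ins, not the SDK's real interface:

```python
class BlockedContentError(Exception):
    """Raised when a scan flags unsafe content (illustrative name)."""

def guard(client, scan):
    """Sketch of the guard-wrapper pattern: scan the prompt before the call
    and the reply after it. `scan` returns a list of findings (empty = safe)."""
    class GuardedClient:
        def complete(self, prompt: str) -> str:
            if scan(prompt):
                raise BlockedContentError(f"unsafe input: {scan(prompt)}")
            reply = client.complete(prompt)
            if scan(reply):
                raise BlockedContentError(f"unsafe output: {scan(reply)}")
            return reply
    return GuardedClient()

# Hypothetical toy client and scanner for demonstration.
class EchoClient:
    def complete(self, prompt: str) -> str:
        return "summary: " + prompt

def flag_injection(text: str) -> list[str]:
    return ["injection"] if "ignore previous" in text.lower() else []

safe_client = guard(EchoClient(), flag_injection)
```

The wrapped client keeps the original call signature, which is what makes the integration one line: swap the client object and nothing else changes.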

Threat Intelligence Feed

MITRE ATLAS-aligned database of 27+ known LLM attack techniques. Query by category, severity, or tags. Match text against patterns. Covers injection, jailbreak, exfiltration, evasion, and more.

Attack Chain Detector

Detect multi-step attack sequences across tool calls. No single action is dangerous, but the sequence reveals malicious intent: recon → credential access → exfiltration, escalation → destruction, and more.

Session Guard

One-line safety for agentic AI. Combines audit logging, attack chain detection, and threat intelligence into a single guard.check() call. Configurable block threshold, custom rules, and SIEM-ready JSON export.

Session Audit Trail

Tamper-evident logging with SHA-256 hash chaining. Every tool call, blocked action, and anomaly is recorded. Export to JSON for Splunk, Datadog, Elastic. SOC 2 and ISO 27001 compliance ready.
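Hash chaining is what makes the log tamper-evident: each entry commits to the previous entry's hash, so altering any record breaks every hash after it. A minimal sketch of the technique (not the library's storage format):

```python
import hashlib
import json

class AuditLog:
    """Tamper-evident log sketch: each entry hashes the previous entry's hash."""

    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64  # genesis value

    def record(self, event: dict) -> None:
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": self._last_hash, "hash": entry_hash})
        self._last_hash = entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered."""
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            if e["prev"] != prev or \
               e["hash"] != hashlib.sha256((prev + payload).encode()).hexdigest():
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.record({"tool": "bash", "command": "ls"})
log.record({"tool": "read", "path": "notes.txt"})
ok_before = log.verify()
```

Verification needs only the log itself, so a SIEM ingesting the JSON export can independently prove nothing was rewritten.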

Enterprise Features

Policy engine (YAML), webhooks (Slack, PagerDuty), OpenTelemetry, API key auth, rate limiting, streaming protection, RSP-aligned risk reports.

Works with your stack

Drop-in middleware for popular LLM SDKs and frameworks

Security Research

Why Your LLM Safety Layer Shouldn't Be Another LLM

Model-based guardrails add 100-500ms per scan. Here's how regex-based detection achieves sub-millisecond latency with 100% accuracy on known patterns.

Read article →

5 Ways Your CLAUDE.md Can Be Weaponized

CLAUDE.md files are powerful configuration for AI assistants — but they're also attack vectors. Hidden comments, authority impersonation, and more.

Read article →

Multi-Turn Attacks That Single-Message Scanning Misses

Split injection, progressive jailbreaks, and context manipulation — attacks that span conversation turns. How to detect them in real time.

Read article →

When AI Agents Go Rogue: Behavioral Anomaly Detection

AI agents can delete files, exfiltrate data, and escalate privileges. How to detect anomalous behavior patterns in real time with tool-call monitoring.

Read article →

Building a Threat Intelligence Feed for LLM Attacks

A MITRE ATLAS-aligned database of 27+ known attack techniques. Query by category, match text against patterns, and integrate with your safety pipeline.

Read article →

One-Line Safety for Agentic AI: Introducing SessionGuard

Drop-in safety layer combining audit logging, attack chain detection, and threat intelligence. One guard.check() call gives you real-time protection and SIEM export.

Read article →

Turning CLAUDE.md Rules into Deterministic Guardrails

CLAUDE.md rules are suggestions the model can ignore. ClaudeMdEnforcer converts them into deterministic checks that actually block violations before they execute.

Read article →

Scanning AI-Generated Code for Vulnerabilities Before It Hits Disk

OWASP Top 10 detection and supply chain attack prevention. Catch SQL injection, hardcoded secrets, and malicious packages in real time as Claude Code writes.

Read article →

Start protecting your LLM apps today

Install in seconds. No API key required. No GPU needed.

pip install git+https://github.com/MaxwellCalkin/sentinel-ai.git

JavaScript/TypeScript: npm install github:MaxwellCalkin/sentinel-ai#main