Sentinel AI - Safety Guardrails for LLM Applications

Prompt Injection Detection

Catches instruction overrides, role injections, delimiter attacks, jailbreaks, prompt leaking, and encoded injection in 12 languages plus cross-lingual attacks.

PII Detection & Redaction

Finds and auto-redacts emails, SSNs, credit cards (Luhn-validated), phone numbers, API keys, GitHub tokens, AWS keys, and passport numbers.

Tool-Use Safety

Scans agentic tool calls for dangerous commands, data exfiltration, credential access, and privilege escalation before they execute.

Harmful Content Filtering

Blocks weapons manufacturing, drug synthesis, self-harm, malware creation, hate speech, phishing, and fraud instructions.

Claude Code Integration

Drop-in PreToolUse hook, MCP server with 7 tools, 5 slash commands, and auto-invoked safety skill. One-command setup with sentinel init.

Adversarial Red-Teaming

Built-in adversarial tester generates evasion variants using homoglyphs, zero-width chars, leetspeak, and more. Test your safety before attackers do.

Multi-Turn Conversation Safety

Track safety across entire conversations. Detects gradual jailbreak escalation, topic persistence, sandwich attacks, and re-attempts after blocks.

Structured Output Validation

Scan LLM-generated JSON for XSS, SQL injection, template injection, and path traversal hidden in field values. Schema validation included.

Obfuscation Detection

Catches encoded attack payloads that bypass keyword filters: base64, hex-encoded commands, ROT13 obfuscation, unicode escapes, and leetspeak variants of dangerous terms.

Code Vulnerability Scanner

Scans LLM-generated code for OWASP Top 10 vulnerabilities: SQL injection, command injection, XSS, hardcoded secrets, insecure deserialization, weak crypto, SSRF, and path traversal.

LLM API Firewall

Transparent reverse proxy for any LLM API (Anthropic, OpenAI). Scans all requests/responses, blocks dangerous content mid-stream, auto-redacts PII, enforces model allowlists. One command: sentinel proxy.

SARIF Output

Generate SARIF v2.1.0 output for GitHub Code Scanning, Azure DevOps, and any static analysis tool. Upload results to the GitHub Security tab with one flag: upload-sarif: true.

Security Audit

sentinel audit scores your project's security configuration (0-100). Checks Claude Code hooks, permissions allowlist, .env files, pre-commit hooks, and MCP config with actionable fix suggestions.

MCP Tool Schema Validator

Validate MCP tool definitions for injection vectors before trusting them. Detects prompt injection, authority impersonation, data exfiltration, and concealment instructions hidden in tool descriptions. sentinel mcp-validate.

Dependency Scanner

Scan requirements.txt, package.json, and pyproject.toml for supply chain attacks. Detects typosquatting, known malicious packages, suspicious URLs, and dangerous install scripts. sentinel dep-scan.

CLAUDE.md Scanner

Scan project instruction files (CLAUDE.md, .cursorrules) for 11 categories of injection vectors: hidden HTML comments, authority impersonation, base URL override, zero-width chars, and more. sentinel claudemd-scan.

Secrets Scanner

Detect hardcoded API keys, tokens, and credentials across 40+ providers: AWS, GitHub, Google, Stripe, Slack, OpenAI, Anthropic, and more. Entropy analysis and smart filtering reduce false positives. sentinel secrets-scan.

Canary Tokens

Plant invisible markers in system prompts. If they appear in model output, your prompt was leaked. Two styles: HTML comment canaries and zero-width Unicode encoding (truly invisible).

Prompt Hardening

Make system prompts injection-resistant with defense-in-depth: XML section tagging, sandwich defense, role lock, instruction priority markers, and input fencing for untrusted data.

Compliance Mapper

Map scan results to EU AI Act, NIST AI RMF, and ISO/IEC 42001 controls. Automated risk classification, per-control compliance status, and remediation recommendations for regulatory reporting.

Agent Safety Monitor

Track tool-use patterns across agentic sessions. Detects destructive commands, data exfiltration, credential access, runaway loops, write spikes, and read-then-exfiltrate attack chains in real-time.

SDK Guard Wrappers

One-line integration for Anthropic and OpenAI SDKs. guard_anthropic(client) wraps your client with automatic input/output scanning. Raises on injection attacks, PII leaks, or unsafe content.

Threat Intelligence Feed

MITRE ATLAS-aligned database of 27+ known LLM attack techniques. Query by category, severity, or tags. Match text against patterns. Covers injection, jailbreak, exfiltration, evasion, and more.

Attack Chain Detector

Detect multi-step attack sequences across tool calls. No single action is dangerous, but the sequence reveals malicious intent: recon → credential access → exfiltration, escalation → destruction, and more.

Session Guard

One-line safety for agentic AI. Combines audit logging, attack chain detection, and threat intelligence into a single guard.check() call. Configurable block threshold, custom rules, and SIEM-ready JSON export.

Session Audit Trail

Tamper-evident logging with SHA-256 hash chaining. Every tool call, blocked action, and anomaly is recorded. Export to JSON for Splunk, Datadog, Elastic. SOC 2 and ISO 27001 compliance ready.

Enterprise Features

Policy engine (YAML), webhooks (Slack, PagerDuty), OpenTelemetry, API key auth, rate limiting, streaming protection, RSP-aligned risk reports.

Safety guardrails forevery LLM application

Try it live

Enterprise-grade safety, developer-friendly API

Prompt Injection Detection

PII Detection & Redaction

Tool-Use Safety

Harmful Content Filtering

Claude Code Integration

Adversarial Red-Teaming

Multi-Turn Conversation Safety

Structured Output Validation

Obfuscation Detection

Code Vulnerability Scanner

LLM API Firewall

SARIF Output

Security Audit

MCP Tool Schema Validator

Dependency Scanner

CLAUDE.md Scanner

Secrets Scanner

Canary Tokens

Prompt Hardening

Compliance Mapper

Agent Safety Monitor

SDK Guard Wrappers

Threat Intelligence Feed

Attack Chain Detector

Session Guard

Session Audit Trail

Enterprise Features

Works with your stack

Security Research

Why Your LLM Safety Layer Shouldn't Be Another LLM

5 Ways Your CLAUDE.md Can Be Weaponized

Multi-Turn Attacks That Single-Message Scanning Misses

When AI Agents Go Rogue: Behavioral Anomaly Detection

Building a Threat Intelligence Feed for LLM Attacks

One-Line Safety for Agentic AI: Introducing SessionGuard

Turning CLAUDE.md Rules into Deterministic Guardrails

Scanning AI-Generated Code for Vulnerabilities Before It Hits Disk

Start protecting your LLM apps today

Safety guardrails for
every LLM application