Detect prompt injections in 12 languages, redact PII, block harmful content, scan tool calls, and catch hallucinations. Sub-millisecond latency. Zero heavy dependencies.
Try the safety scanner and prompt hardening tool — runs entirely in your browser.
Catches instruction overrides, role injections, delimiter attacks, jailbreaks, prompt leaking, and encoded injection in 12 languages plus cross-lingual attacks.
Finds and auto-redacts emails, SSNs, credit cards (Luhn-validated), phone numbers, API keys, GitHub tokens, AWS keys, and passport numbers.
Scans agentic tool calls for dangerous commands, data exfiltration, credential access, and privilege escalation before they execute.
Blocks weapons manufacturing, drug synthesis, self-harm, malware creation, hate speech, phishing, and fraud instructions.
Drop-in PreToolUse hook, MCP server with 7 tools, 5 slash commands, and auto-invoked safety skill. One-command setup with sentinel init.
Built-in adversarial tester generates evasion variants using homoglyphs, zero-width chars, leetspeak, and more. Test your safety before attackers do.
Track safety across entire conversations. Detects gradual jailbreak escalation, topic persistence, sandwich attacks, and re-attempts after blocks.
Scan LLM-generated JSON for XSS, SQL injection, template injection, and path traversal hidden in field values. Schema validation included.
Catches encoded attack payloads that bypass keyword filters: base64, hex-encoded commands, ROT13 obfuscation, unicode escapes, and leetspeak variants of dangerous terms.
Scans LLM-generated code for OWASP Top 10 vulnerabilities: SQL injection, command injection, XSS, hardcoded secrets, insecure deserialization, weak crypto, SSRF, and path traversal.
Transparent reverse proxy for any LLM API (Anthropic, OpenAI). Scans all requests/responses, blocks dangerous content mid-stream, auto-redacts PII, enforces model allowlists. One command: sentinel proxy.
Generate SARIF v2.1.0 output for GitHub Code Scanning, Azure DevOps, and any static analysis tool. Upload results to the GitHub Security tab with one flag: upload-sarif: true.
sentinel audit scores your project's security configuration (0-100). Checks Claude Code hooks, permissions allowlist, .env files, pre-commit hooks, and MCP config with actionable fix suggestions.
Validate MCP tool definitions for injection vectors before trusting them. Detects prompt injection, authority impersonation, data exfiltration, and concealment instructions hidden in tool descriptions. sentinel mcp-validate.
Scan requirements.txt, package.json, and pyproject.toml for supply chain attacks. Detects typosquatting, known malicious packages, suspicious URLs, and dangerous install scripts. sentinel dep-scan.
Scan project instruction files (CLAUDE.md, .cursorrules) for 11 categories of injection vectors: hidden HTML comments, authority impersonation, base URL override, zero-width chars, and more. sentinel claudemd-scan.
Detect hardcoded API keys, tokens, and credentials across 40+ providers: AWS, GitHub, Google, Stripe, Slack, OpenAI, Anthropic, and more. Entropy analysis and smart filtering reduce false positives. sentinel secrets-scan.
Plant invisible markers in system prompts. If they appear in model output, your prompt was leaked. Two styles: HTML comment canaries and zero-width Unicode encoding (truly invisible).
Make system prompts injection-resistant with defense-in-depth: XML section tagging, sandwich defense, role lock, instruction priority markers, and input fencing for untrusted data.
Map scan results to EU AI Act, NIST AI RMF, and ISO/IEC 42001 controls. Automated risk classification, per-control compliance status, and remediation recommendations for regulatory reporting.
Track tool-use patterns across agentic sessions. Detects destructive commands, data exfiltration, credential access, runaway loops, write spikes, and read-then-exfiltrate attack chains in real-time.
One-line integration for Anthropic and OpenAI SDKs. guard_anthropic(client) wraps your client with automatic input/output scanning. Raises on injection attacks, PII leaks, or unsafe content.
MITRE ATLAS-aligned database of 27+ known LLM attack techniques. Query by category, severity, or tags. Match text against patterns. Covers injection, jailbreak, exfiltration, evasion, and more.
Detect multi-step attack sequences across tool calls. No single action is dangerous, but the sequence reveals malicious intent: recon → credential access → exfiltration, escalation → destruction, and more.
One-line safety for agentic AI. Combines audit logging, attack chain detection, and threat intelligence into a single guard.check() call. Configurable block threshold, custom rules, and SIEM-ready JSON export.
Tamper-evident logging with SHA-256 hash chaining. Every tool call, blocked action, and anomaly is recorded. Export to JSON for Splunk, Datadog, Elastic. SOC 2 and ISO 27001 compliance ready.
Policy engine (YAML), webhooks (Slack, PagerDuty), OpenTelemetry, API key auth, rate limiting, streaming protection, RSP-aligned risk reports.
Drop-in middleware for popular LLM SDKs and frameworks
Model-based guardrails add 100-500ms per scan. Here's how regex-based detection achieves sub-millisecond latency with 100% accuracy on known patterns.
Read article →
CLAUDE.md files are powerful configuration for AI assistants — but they're also attack vectors. Hidden comments, authority impersonation, and more.
Read article →
Split injection, progressive jailbreaks, and context manipulation — attacks that span conversation turns. How to detect them in real-time.
Read article →
AI agents can delete files, exfiltrate data, and escalate privileges. How to detect anomalous behavior patterns in real-time with tool-call monitoring.
Read article →
A MITRE ATLAS-aligned database of 27+ known attack techniques. Query by category, match text against patterns, and integrate with your safety pipeline.
Read article →
Drop-in safety layer combining audit logging, attack chain detection, and threat intelligence. One guard.check() call gives you real-time protection and SIEM export.
Read article →
CLAUDE.md rules are suggestions the model can ignore. ClaudeMdEnforcer converts them into deterministic checks that actually block violations before they execute.
Read article →
OWASP Top 10 detection and supply chain attack prevention. Catch SQL injection, hardcoded secrets, and malicious packages in real-time as Claude Code writes.
Read article →
Install in seconds. No API key required. No GPU needed.
JavaScript/TypeScript: npm install github:MaxwellCalkin/sentinel-ai#main