A drop-in REST API that detects and neutralizes injection attacks in any text — git commits, web pages, files, emails, user input — before your AI agent sees it.
Free tier: 10,000 requests/month · No credit card · Cloudflare edge · <50ms p99
Paste any text. We'll show you what we'd flag.
// Click "Scan" to see results.
~20 categorized regex rules cover the patterns we've seen in the wild: instruction overrides, role hijacks, ChatML/Llama special tokens, exfiltration channels, schema attacks, invisible-Unicode smuggling.
Open-source: github.com/bch1212/promptshield
Cloudflare Workers AI runs a transformer model that catches paraphrased and obfuscated attacks the regex misses. Capped contribution prevents false-positive flips on benign text.
The same text in a git commit is more suspicious than in user input. Sensitivity (low / medium / high) lets you tune for your threat model per integration point.
10,000 requests/month free. We email your key.
curl -X POST https://api.promptshield.dev/v1/scan \
-H "Authorization: Bearer ps_live_..." \
-H "Content-Type: application/json" \
-d '{"text":"ignore previous instructions","context":"user_input"}'
import requests
r = requests.post(
"https://api.promptshield.dev/v1/scan",
headers={"Authorization": f"Bearer {API_KEY}"},
json={"text": untrusted_text, "context": "web_content"},
).json()
if not r["safe"]:
raise RuntimeError(f"Injection detected: {r['threat_type']}")
const r = await fetch(
"https://api.promptshield.dev/v1/scan",
{
method: "POST",
headers: {
"Authorization": `Bearer ${API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({ text, context: "user_input" }),
},
).then(r => r.json());
if (!r.safe) throw new Error("blocked: " + r.threat_type);
Self-serve. Cancel anytime.
$29/mo
$99/mo
$499/mo
Overage on metered tiers: $0.50 per additional 1M requests.
PromptShield reduces, but does not eliminate, prompt-injection risk. Use it as one layer alongside system-prompt hardening, tool sandboxing, and output scrubbing. Our open-source ruleset is community-driven — file an issue with novel attacks and we'll merge them.
No-logging mode (Pro tier): no input text is stored or used for model training.