Understanding the THSP Protocol: A Deep Dive
A technical exploration of the Truth-Harm-Scope-Purpose protocol that powers Sentinel's decision validation. Learn how each gate works and why they must all pass.
The THSP (Truth-Harm-Scope-Purpose) Protocol is the core decision validation framework powering Sentinel. In this post, we'll explore how each gate works and why the four-gate design is essential for robust AI safety.
Why Four Gates?
Traditional AI safety approaches often focus on a single dimension (usually harm prevention). But this creates blind spots:
The THSP Protocol addresses this by requiring ALL four gates to pass.
Gate 1: TRUTH
The Truth Gate validates factual accuracy. It asks: "Is this factually correct?"
This gate prevents:
Implementation uses a combination of:
Gate 2: HARM
The Harm Gate assesses potential for damage. It asks: "Could this cause damage?"
This gate evaluates:
Pattern matching covers 2000+ harmful patterns across categories.
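As a rough illustration of category-based pattern matching, the sketch below checks a piece of text against a tiny, made-up pattern table. The two patterns, the `checkHarm` function, and the result shape are all assumptions for illustration; they are not Sentinel's actual pattern set or API.

```javascript
// Hypothetical pattern table: the real set is described as 2000+ entries.
const HARM_PATTERNS = [
  { category: 'financial', pattern: /transfer\s+all\s+funds/i },
  { category: 'physical', pattern: /disable\s+safety\s+interlock/i },
];

// Checks text against the patterns of the enabled categories only.
// The gate passes when no enabled pattern matches.
function checkHarm(text, enabledCategories) {
  const matches = HARM_PATTERNS
    .filter((p) => enabledCategories.includes(p.category))
    .filter((p) => p.pattern.test(text))
    .map((p) => p.category);
  return { gate: 'HARM', passed: matches.length === 0, matches };
}
```

The patterns deliberately omit the `g` flag, so `test` keeps no state between calls and the same table can be reused safely.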
Gate 3: SCOPE
The Scope Gate enforces boundaries. It asks: "Is this within authorized limits?"
This gate ensures agents don't:
Scope is configurable per-agent, allowing precise access control.
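A per-agent scope check might look something like the sketch below. The `allowedActions` and `maxCost` fields are borrowed from the configuration example in this post; the `checkScope` function and the `{ type, cost }` action shape are hypothetical.

```javascript
// Hypothetical scope for one agent.
const agentScope = { allowedActions: ['read', 'write'], maxCost: 100 };

// Collects every boundary the action would cross, so the caller
// can report all of them rather than only the first.
function checkScope(action, scope) {
  const reasons = [];
  if (!scope.allowedActions.includes(action.type)) {
    reasons.push(`action "${action.type}" is not authorized`);
  }
  if (action.cost > scope.maxCost) {
    reasons.push(`cost ${action.cost} exceeds limit ${scope.maxCost}`);
  }
  return { gate: 'SCOPE', passed: reasons.length === 0, reasons };
}
```

Because the scope is a plain object passed in as an argument, each agent can carry its own limits without any change to the check itself.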
Gate 4: PURPOSE
The Purpose Gate is unique to THSP v2. It asks: "Does this serve genuine benefit?"
This is the key insight: **the absence of harm is not sufficient**.
An action that:
...should still be blocked. This prevents:
Gate Interaction
Gates are evaluated in parallel for performance, but all must pass:
    Decision → [TRUTH] ──┐
                         │
             → [HARM] ───┼──→ ALL PASS? → ALLOW
                         │
             → [SCOPE] ──┤
                         │
             → [PURPOSE]─┘
If any gate fails, the action is blocked with an explanation of which gate failed and why.
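One way to picture the evaluation above is a function that starts all gates concurrently and allows the action only when none of them fails. This is a minimal sketch, not Sentinel's implementation: the stub gates, the decision fields they read, and the `validate` function are placeholders invented for illustration.

```javascript
// Stub gates: each is async and returns { gate, passed, reason }.
// The decision fields (claimsVerified, harmful, ...) are hypothetical.
const stubGates = [
  async (d) => ({ gate: 'TRUTH', passed: d.claimsVerified === true, reason: 'unverified claim' }),
  async (d) => ({ gate: 'HARM', passed: d.harmful !== true, reason: 'matched a harmful pattern' }),
  async (d) => ({ gate: 'SCOPE', passed: d.inScope === true, reason: 'outside authorized limits' }),
  async (d) => ({ gate: 'PURPOSE', passed: d.hasBenefit === true, reason: 'no genuine benefit stated' }),
];

async function validate(decision, gateFns) {
  // map() starts every gate before anything is awaited,
  // so the gates run concurrently rather than one after another.
  const results = await Promise.all(gateFns.map((fn) => fn(decision)));
  const failures = results
    .filter((r) => !r.passed)
    .map((r) => ({ gate: r.gate, reason: r.reason }));
  // Allowed only if every gate passed; otherwise report which failed and why.
  return { allowed: failures.length === 0, failures };
}
```

Note that `Promise.all` rejects as soon as any gate throws, so a production version would likely need to decide whether a crashed gate counts as a failure.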
Configuring THSP
Each gate can be configured with different sensitivity levels:
    // One settings object per gate
    const config = {
      truth: { threshold: 0.85, requireSources: true },
      harm: { categories: ['physical', 'financial'], threshold: 0.90 },
      scope: { allowedActions: ['read', 'write'], maxCost: 100 },
      purpose: { requireExplicit: true, minBenefit: 0.7 }
    };
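To make the settings more concrete, here is a sketch of how the `purpose` entry might be applied. The reading of the two fields is an assumption: `requireExplicit` is taken to mean the action must state a purpose, and `minBenefit` to be a lower bound on a benefit score. The `checkPurpose` function, the `statedPurpose` field, and the `benefitScore` field (assumed to be computed elsewhere) are hypothetical.

```javascript
// Same values as the purpose entry in the config above.
const purposeConfig = { requireExplicit: true, minBenefit: 0.7 };

// Assumed semantics: block when no purpose is stated (if required),
// or when the benefit score falls below the configured minimum.
function checkPurpose(action, cfg) {
  if (cfg.requireExplicit && !action.statedPurpose) {
    return { gate: 'PURPOSE', passed: false, reason: 'no explicit purpose stated' };
  }
  if (action.benefitScore < cfg.minBenefit) {
    return {
      gate: 'PURPOSE',
      passed: false,
      reason: `benefit score ${action.benefitScore} is below ${cfg.minBenefit}`,
    };
  }
  return { gate: 'PURPOSE', passed: true, reason: '' };
}
```

Under these assumptions, an action with no stated purpose is blocked even when nothing about it is harmful, which matches the point that the absence of harm is not sufficient.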
Conclusion
The THSP Protocol provides comprehensive decision validation by requiring four independent checks. This defense-in-depth approach catches threats that single-dimension systems miss.
For implementation details, see our [documentation](/docs/thsp-protocol).
The Sentinel Team