Content Gates
Content Gates let you configure custom content moderation rules that run on all chat inputs and outputs. They are an extra layer on top of the platform's built-in safety system — your rules cannot weaken system safety, only add to it.
How It Works
Every chat message passes through the following stages, in this order:
- System safety (always active, not configurable) — blocks CSAM, violence, hate speech
- Input validation (built-in) — detects prompt injection, PII
- Your input gates — blacklist and regex rules you configure
- AI generates response
- Output validation (built-in) — redacts PII, detects prompt leakage
- Your output gates — blacklist and regex rules you configure
Within your gates, rules are evaluated cheapest-first: blacklist rules run before regex rules, so if a blacklist rule catches something, the regex rules don't need to run.
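The cheapest-first evaluation described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `Rule` class, the `COST` ranking, and the `first_hit` helper are hypothetical names, and blacklist matching is approximated with a case-insensitive substring check.

```python
import re
from dataclasses import dataclass

# Hypothetical cost ranking: lower values run first. Only the
# blacklist-before-regex ordering comes from the text above.
COST = {"blacklist": 0, "regex": 1}


@dataclass
class Rule:
    name: str
    kind: str     # "blacklist" or "regex"
    pattern: str  # a phrase for blacklist rules, a regex for regex rules

    def matches(self, text: str) -> bool:
        if self.kind == "blacklist":
            # Assumption: case-insensitive substring match stands in for
            # the platform's word/phrase matching.
            return self.pattern.lower() in text.lower()
        return re.search(self.pattern, text) is not None


def first_hit(rules, text):
    """Return the first matching rule, trying cheaper rule kinds first."""
    for rule in sorted(rules, key=lambda r: COST[r.kind]):
        if rule.matches(text):
            return rule  # stop here: more expensive rules never run
    return None


rules = [
    Rule("card-number", "regex", r"\b\d{4}[\s.-]?\d{4}[\s.-]?\d{4}[\s.-]?\d{4}\b"),
    Rule("competitors", "blacklist", "CompetitorA"),
]

hit = first_hit(rules, "How do you compare to competitora?")
print(hit.name if hit else None)
```

Even though the regex rule is listed first, the blacklist rule is tried first and short-circuits the loop.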
Rule Types
Blacklist
Exact word or phrase matching. Case-insensitive by default. Enter one word/phrase per line.
Example: Block competitor mentions
CompetitorA
CompetitorB
their product name
Regex
Regular expression pattern matching. Use standard regex syntax.
Example: Block credit card numbers
\b\d{4}[\s.-]?\d{4}[\s.-]?\d{4}[\s.-]?\d{4}\b
Actions
| Action | Behavior |
|---|---|
| Block | Message is rejected entirely. User sees a generic error. |
| Warn | Message passes through, but the hit is logged in the audit dashboard. |
| Redact | Matched text is replaced with [REDACTED]. |
| Replace | Matched text is replaced with your custom replacement text. |
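To make the four actions concrete, here is a rough sketch of how each one could treat a message that matches the credit card pattern from the Regex example. The `apply_action` function and its return shape are assumptions made for illustration; the platform's internal behavior may differ.

```python
import re
from typing import Optional, Tuple

# Credit card pattern from the Regex example.
CARD = re.compile(r"\b\d{4}[\s.-]?\d{4}[\s.-]?\d{4}[\s.-]?\d{4}\b")


def apply_action(action: str, pattern: re.Pattern, text: str,
                 replacement: str = "") -> Tuple[Optional[str], bool]:
    """Return (text_to_deliver, rule_matched).

    text_to_deliver is None when the message is blocked.
    """
    if not pattern.search(text):
        return text, False            # no match: message is untouched
    if action == "block":
        return None, True             # reject the whole message
    if action == "warn":
        return text, True             # pass through unchanged, but record the hit
    if action == "redact":
        return pattern.sub("[REDACTED]", text), True
    if action == "replace":
        # A function replacement avoids backslash escapes being
        # interpreted in the custom replacement text.
        return pattern.sub(lambda _m: replacement, text), True
    raise ValueError(f"unknown action: {action}")


msg = "My card is 4111-1111-1111-1111, thanks"
for action in ("block", "warn", "redact", "replace"):
    print(action, apply_action(action, CARD, msg, "[card removed]"))
```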
Direction
Each rule can apply to:
- Input — only checks user messages
- Output — only checks AI responses
- Both — checks both directions
Scope
- Org-wide (default) — rule applies to all widgets in your organization
- Widget-specific — rule only applies to a specific widget (set via API)
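Direction and scope together decide whether a rule is considered for a given message. The check below is a sketch under assumed names: the `Rule` fields, the `widget_id` convention (`None` meaning org-wide), and the `applies` helper are hypothetical and not taken from the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    name: str
    direction: str                   # "input", "output", or "both"
    widget_id: Optional[str] = None  # None = org-wide (hypothetical convention)


def applies(rule: Rule, direction: str, widget_id: str) -> bool:
    """True if the rule should be checked for this message."""
    direction_ok = rule.direction in (direction, "both")
    scope_ok = rule.widget_id is None or rule.widget_id == widget_id
    return direction_ok and scope_ok


rules = [
    Rule("profanity", "both"),                            # org-wide, both directions
    Rule("competitors", "output"),                        # org-wide, AI responses only
    Rule("support-pii", "input", widget_id="support"),    # one widget, user messages only
]

# Rules considered for a user message sent to the "sales" widget.
print([r.name for r in rules if applies(r, "input", "sales")])
```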
Templates
Pre-built rule templates are available for common use cases:
- PII Protection — email, phone, credit card, SSN, IBAN detection
- Competitor Blocker — block competitor name mentions (customize the list)
- Profanity Filter — basic profanity blocking
- Prompt Injection — extra prompt injection patterns
Install templates from the Content Gates page and customize them to your needs.
Audit Dashboard
The audit dashboard shows:
- 30-day summary — total blocks, warnings, and redactions
- Daily trend — gate trigger volume over the last 14 days
- Top rules — which rules are triggering most often
Use the audit dashboard to fine-tune your rules and understand what content your gates are catching.
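If you pull gate trigger data for your own reporting, the three dashboard views can be approximated with simple counting. The sketch below assumes a hypothetical event shape of `(day, rule_name, action)` tuples and assumes the top-rules view covers the same 30-day window as the summary; neither detail is confirmed by the API.

```python
from collections import Counter
from datetime import date, timedelta


def summarize(events, today):
    """events: iterable of (day, rule_name, action) tuples (assumed shape)."""
    last_30 = [e for e in events
               if timedelta(0) <= today - e[0] < timedelta(days=30)]
    last_14 = [e for e in last_30 if today - e[0] < timedelta(days=14)]
    return {
        # 30-day summary: counts per action (block, warn, redact, ...)
        "summary": Counter(action for _, _, action in last_30),
        # Daily trend: trigger volume per day over the last 14 days
        "daily_trend": Counter(day for day, _, _ in last_14),
        # Top rules: most frequently triggered rules (window is an assumption)
        "top_rules": Counter(rule for _, rule, _ in last_30).most_common(3),
    }


today = date(2024, 6, 30)
events = [
    (date(2024, 6, 30), "card-number", "redact"),
    (date(2024, 6, 29), "competitors", "block"),
    (date(2024, 6, 10), "competitors", "block"),  # 20 days ago: in summary only
    (date(2024, 5, 1), "profanity", "warn"),      # 60 days ago: excluded
]
report = summarize(events, today)
print(report["summary"])
print(report["top_rules"])
```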
Testing Rules
Before enabling a rule, use the built-in test panel in the create/edit modal. Enter sample content and see if your pattern matches — no need to deploy first.
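It can also help to keep a small set of sample messages and check a pattern against them locally before pasting it into the test panel. The helper below is a sketch using Python's `re` module; `check_pattern` is a hypothetical name, and the platform's regex engine may not behave identically to Python's, so treat it as a first pass rather than a guarantee.

```python
import re


def check_pattern(pattern, should_match, should_pass):
    """Return a list of failures; an empty list means every sample behaved as expected."""
    compiled = re.compile(pattern)
    failures = []
    for sample in should_match:
        if not compiled.search(sample):
            failures.append(f"missed: {sample!r}")
    for sample in should_pass:
        if compiled.search(sample):
            failures.append(f"false positive: {sample!r}")
    return failures


card = r"\b\d{4}[\s.-]?\d{4}[\s.-]?\d{4}[\s.-]?\d{4}\b"
failures = check_pattern(
    card,
    should_match=["4111 1111 1111 1111", "4111-1111-1111-1111", "4111111111111111"],
    should_pass=["order 12345678", "call 555-0100"],
)
print(failures or "all samples behaved as expected")
```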
API
GET /api/content-gates # List all rules
POST /api/content-gates # Create a rule
GET /api/content-gates/{id} # Get a rule
PUT /api/content-gates/{id} # Update a rule
DELETE /api/content-gates/{id} # Delete a rule
POST /api/content-gates/{id}/toggle # Enable/disable a rule
POST /api/content-gates/test # Test a pattern against content
GET /api/content-gates/templates # List available templates
POST /api/content-gates/templates/{slug} # Install a template
GET /api/content-gates/audit # Audit dashboard data
Limitations (v1)
- No LLM judge — v1 supports blacklist and regex only. An AI-powered judge layer (e.g., "reject if the response contains medical advice") is planned for v2.
- Chat only — gates currently run on widget chat. Content pipeline and playbook AI steps are not covered yet.
- No per-conversation overrides — rules apply at org or widget level, not per conversation.
- No bulk import/export — rules must be created individually (templates help for common patterns).
- No regex timeout protection — complex regex patterns could theoretically cause slowdowns. Keep patterns simple.
- No rule versioning — changes take effect immediately with no rollback. Test thoroughly before enabling.
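Because there is no regex timeout protection, it is worth timing a pattern locally against "near-miss" input before enabling it. The sketch below uses Python's `re` module as a stand-in (the platform's engine is an assumption and may behave differently) and compares a nested-quantifier pattern with a simpler one that accepts the same strings. Input sizes are kept small on purpose so the script finishes quickly.

```python
import re
import time


def time_search(pattern: str, text: str) -> float:
    """Return the seconds taken by a single re.search call."""
    start = time.perf_counter()
    re.search(pattern, text)
    return time.perf_counter() - start


# Near-miss input: a long run that almost matches, followed by one
# character that forces the overall match to fail.
for n in (10, 14, 18):
    text = "a" * n + "!"
    nested = time_search(r"^(a+)+$", text)  # nested quantifier: prone to backtracking
    simple = time_search(r"^a+$", text)     # same language, single quantifier
    print(f"n={n:2d}  nested: {nested:.4f}s  simple: {simple:.6f}s")
```

In a backtracking engine, the nested pattern's time typically grows much faster with `n` than the simple one. If a pattern shows that kind of growth locally, rewrite it before enabling the rule.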
Future Roadmap
- LLM Judge rules (AI-powered content classification)
- Content pipeline and playbook integration
- Rule versioning and rollback
- Webhook notifications on gate triggers
- Widget settings integration (show per-widget overrides inline)
- Bulk import/export of rules
- Regex DoS protection (RE2 mode)