The developer-first platform for content moderation
Content filtering, user reporting, user blocking, appeals, policy versioning, and immutable audit trails - one API, production-ready in hours, not quarters.
Free tier: 15,000 decisions/mo - enough to moderate a 5,000-user app. No credit card required.
More than a filter API
An OpenAI moderation call gives you scores. Vettly gives you the full trust-and-safety stack around those scores.
| Feature | Build In-House | Generic Moderation API | Vettly |
|---|---|---|---|
| Content filtering | | | Yes, multi-provider |
| User reporting | | | Yes, built-in |
| User blocking | | | Yes, built-in |
| Audit trail | | | Yes, immutable + exportable |
| Policy templates | | | Yes, pre-built |
| Appeals workflow | | | Yes, built-in |
| Policy version pinning | | | Yes, every decision |
| Decision replay | | | Yes, test new policies on old data |
Every decision is traceable, contestable, and replayable
A regex gives you a yes/no. An LLM call gives you a score. Vettly gives you a decision record that tracks policy version, appeal outcome, reviewer action, and retention status - so when trust, legal, or App Review asks, you have the answer.
1. Intake: content and context
2. Policy evaluation under a specific version
3. Action taken and recorded
4. History stored with retention rules
5. Appeal or review if contested
6. Search, export, replay whenever you need
Three integration steps. Six workflows behind the scenes.
You integrate three endpoints. Vettly handles filtering, reporting, blocking, appeals, audit trails, and policy versioning behind them.
1. Add one moderation call
Call /v1/check before content is published.
Policy evaluation, graduated actions, category scoring, and full decision logging.
2. Turn on report + block
Enable user reporting and blocking endpoints.
Report routing, resolution tracking, mutual-block enforcement, and appeals workflow.
3. Ship with audit evidence
Every decision logged with the policy version that produced it.
Decision replay, compliance exports, retention rules, and searchable history.
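As a rough illustration of step 1, the sketch below builds a POST request to /v1/check without sending it. Only the /v1/check path comes from the text above. The host, the Bearer auth header, and the body field names (`content`, `contentType`, `policy`) are assumptions made for illustration, so check the API reference for the real schema.

```python
import json
import urllib.request

# Placeholder host: the source names only the /v1/check path.
BASE_URL = "https://api.example.com"


def build_check_request(api_key: str, text: str, policy_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/check."""
    # Body field names are hypothetical, chosen for illustration only.
    payload = {"content": text, "contentType": "text", "policy": policy_name}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/check",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_check_request("YOUR_API_KEY", "hello world", "community-safe")
print(request.get_method(), request.full_url)
```

In an application, the request would be sent with `urllib.request.urlopen(request)` before the content is published, and the content held back until a decision comes back.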
Built for teams shipping user-generated content.
Define your content policy in plain English. Vettly enforces it across text, images, and video with decisions your team can trace, explain, and defend.
Social & Community Apps
Chat, comments, and user profiles. Filter text and images before they appear, add report buttons, and let users block each other.
- Chat messages & comments
- User profiles & bios
- Photo & video sharing
See it in action
Edit the text, click Analyze, and see the real API response.
This demo uses a rate-limited API key. For production use, sign up for a free account.
Get free API key
name: community-safe
version: "1.0"
rules:
  - category: hate_speech
    threshold: 0.6
    action: block
  - category: harassment
    threshold: 0.6
    action: flag
  - category: violence
    threshold: 0.7
    action: warn
  - category: spam
    threshold: 0.5
    action: flag
{
  "decisionId": "dec_xxxxxxxx",
  "action": "allow",
  "safe": true,
  "categories": [],
  "latency_ms": "---"
}
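One way an application might act on a response like the one above is to branch on its `action` field. The field names come from the sample response. The mapping from each action to an application-side step is an assumption, as is the choice to reject on an unrecognized action.

```python
import json

# Action names appear in the page's policies; what the app does with each
# one is an assumption made for this sketch.
NEXT_STEP = {
    "allow": "publish",
    "warn": "publish_with_notice",
    "flag": "hold_for_review",
    "block": "reject",
}


def next_step(response_body: str) -> str:
    """Map a decision response to an app-side step; unknown actions are rejected."""
    decision = json.loads(response_body)
    return NEXT_STEP.get(decision.get("action"), "reject")


sample = '{"decisionId": "dec_xxxxxxxx", "action": "allow", "safe": true, "categories": []}'
print(next_step(sample))
```

Defaulting to "reject" keeps the handler conservative if a new action value ever shows up in a response.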
What happens next in the backend
Consistent API
Same JSON structure for every request. Text, images, video, one predictable format.
Built for Speed
Optimized for low-latency responses. See real performance in the playground above.
Your Data, Your Control
We don't train on your content. Configure retention policies to match your requirements.
Production Infrastructure
Deployed on enterprise-grade cloud infrastructure with monitoring and alerting.
Everything your moderation stack needs, nothing extra
Filtering, reporting, blocking, and auditable decision history in one API key.
Content Filtering
Screen text, images, and video against your policy before content goes live. Blocks hate speech, sexual content, violence, and spam.
User Reporting
Let users report offensive content with one API call. Reports are tracked, assigned, and resolved with a full audit trail.
User Blocking
Add users to a blocklist so they can't contact or interact with the reporter. Built into the API.
Audit Trail
Every decision is pinned to the policy version that produced it. Replay past decisions against new policies, export for legal holds, and trace appeals from dispute to resolution.
Policy Templates
Pre-built starter policies for common moderation use cases. Start in minutes, customize when you need to.
Appeals Workflow
Handle disputes and overturn decisions. Users can contest blocks, and moderators can review with full context.
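The audit-trail idea of pinning each decision to a policy version and replaying it against a new policy can be sketched as follows. This is a minimal local model, not Vettly's implementation: the record fields, the single-threshold policy shape, and the assumption that category scores are stored alongside each decision are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a stored record cannot be mutated after the fact
class DecisionRecord:
    decision_id: str
    policy_version: str
    scores: tuple  # ((category, score), ...) captured at decision time
    action: str


def apply_policy(thresholds: dict, scores: tuple) -> str:
    """Return "block" if any score exceeds its threshold, else "allow"."""
    return "block" if any(s > thresholds.get(c, 1.0) for c, s in scores) else "allow"


def replay(records: list, new_thresholds: dict) -> list:
    """List (decision_id, old_action, new_action) where the outcome would differ."""
    changes = []
    for record in records:
        new_action = apply_policy(new_thresholds, record.scores)
        if new_action != record.action:
            changes.append((record.decision_id, record.action, new_action))
    return changes


history = [
    DecisionRecord("dec_1", "1.0", (("spam", 0.55),), "block"),
    DecisionRecord("dec_2", "1.0", (("spam", 0.30),), "allow"),
]
# Hypothetical version 1.1 raises the spam threshold from 0.5 to 0.6.
changes = replay(history, {"spam": 0.6})
print(changes)
```

The replay reports what would change under the new policy while the original records stay untouched, which is the property that lets old decisions remain defensible.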
Rules you can actually read
Policies are explicit. They are written in versioned configuration, not buried in model behavior. Old decisions are never reinterpreted. Policy changes apply forward, not retroactively. This is how automated systems earn trust.
name: community-safe
description: UGC safety baseline
rules:
  - category: sexual
    threshold: 0.6
    action: reject
  - category: hate
    threshold: 0.5
    action: reject
  - category: suggestive
    threshold: 0.8
    action: flag
Block explicit content
Reject sexual content scoring above 60%
Block hate speech
Block content with hate speech above 50% confidence
Flag borderline content
Flag suggestive content for human review above 80%
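To make the threshold semantics concrete, here is a small sketch that applies the three rules above to a set of category scores. The rule values are taken from the policy. The strict "above" comparison follows the descriptions, while the severity ordering used when several rules match is an assumption, not documented behavior.

```python
# Rules copied from the community-safe policy above.
RULES = [
    {"category": "sexual", "threshold": 0.6, "action": "reject"},
    {"category": "hate", "threshold": 0.5, "action": "reject"},
    {"category": "suggestive", "threshold": 0.8, "action": "flag"},
]

# Assumed severity order for choosing one action when several rules match.
SEVERITY = {"allow": 0, "flag": 1, "reject": 2}


def evaluate(scores: dict[str, float]) -> str:
    """Return the most severe action among rules whose threshold is exceeded."""
    action = "allow"
    for rule in RULES:
        score = scores.get(rule["category"], 0.0)
        if score > rule["threshold"] and SEVERITY[rule["action"]] > SEVERITY[action]:
            action = rule["action"]
    return action


print(evaluate({"sexual": 0.7}))
print(evaluate({"suggestive": 0.85}))
print(evaluate({"hate": 0.4, "suggestive": 0.5}))
```

Because the rules live in data rather than in code, a new policy version only changes the `RULES` list, which is what makes the policy readable and diffable.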
Custom Prompts
Ask AI anything about your images. Write rules in plain English like "Is this counterfeit?" or "Does this food look undercooked?"
customPrompt: "Does this show counterfeit luxury goods?"
One API call. Six workflows.
This Discord bot calls one endpoint. Behind that endpoint: policy evaluation, graduated actions, decision logging, appeal handling, compliance exports, and policy versioning. All automatic.
One API call. Every decision recorded.
This bot took an afternoon to build. The decision logic? Just one line of code. Try it on your server or check out the source to build your own.
- Full Discord bot in ~200 lines
- Text and image evaluation
- Policy versioning per server
- Open source on GitHub
🦞 Stop risky agent actions before they execute
Use Vettly as the guardrail control plane for OpenClaw. Enforce fail-closed runtime decisions, approval thresholds, blocked command patterns, and policy rollback in one workflow.
In-path authorization for shell, file, network, and env access.
Fail-closed defaults so outages do not silently bypass safety.
Presets, policy history, restore, and operating metrics in dashboard.
How it works
1. Vet skills before install.
2. Authorize runtime actions in-path.
3. Enforce allow, warn, flag, or block.
4. Track metrics and rollback policy versions.
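The fail-closed behavior described above can be sketched as a thin wrapper around whatever function performs the authorization call. The wrapper and its names are hypothetical. The point is only that an outage or an unrecognized answer resolves to "block" rather than silently allowing the action.

```python
from typing import Callable

ALLOWED_ACTIONS = {"allow", "warn", "flag", "block"}  # the four outcomes listed above


def authorize_fail_closed(authorize: Callable[[dict], str], action_request: dict) -> str:
    """Return the guardrail decision, or "block" if no valid decision is obtained."""
    try:
        decision = authorize(action_request)
    except Exception:
        # Network errors, timeouts, and bugs all end up here: deny by default.
        return "block"
    return decision if decision in ALLOWED_ACTIONS else "block"


def unreachable_service(_request: dict) -> str:
    """Stand-in for an authorization call made during an outage."""
    raise ConnectionError("guardrail service unavailable")


result = authorize_fail_closed(unreachable_service, {"type": "shell", "command": "ls /tmp"})
print(result)
```

In a real integration, `authorize` would wrap the HTTP call to the action-authorize endpoint; the request shape shown here is illustrative only.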
Core endpoints
POST /v1/openclaw/guardrails/skill-vetting
POST /v1/openclaw/guardrails/action-authorize
GET /v1/openclaw/guardrails/metrics?days=30
Simple, predictable pricing
Start free. Upgrade when you need more volume, longer history, or advanced workflows.
Every plan includes decision IDs, policy versioning, and searchable history.
Free
Enough for a 5,000-user app
- 15,000 decisions / month
- Text, images & video
- Starter policy templates
- Decision IDs & audit trail
- 24-hour decision history
- Policy versioning
Starter
Ship your MVP without hitting limits
- 100,000 decisions / month
- Everything in Free
- 1-year decision history
- Custom policy rules
- Decision webhooks
Growth
Ship publicly with stronger protection
- 300,000 decisions / month
- Everything in Starter
- Spam & scam detection
- Advanced risk controls
- 3-year moderation logs
Pro
Custom rules, priority processing, compliance
- 1,500,000 decisions / month
- Everything in Growth
- Custom moderation rules
- Priority processing
- Compliance & audit support
- Priority support
Enterprise
Custom infrastructure at scale
- Everything in Pro
- Unlimited decisions
- Multi-provider routing
- Batch moderation API
- SLA guarantees
- Dedicated support
All plans include: decision IDs, policy versioning, searchable history, and a unified API for text, image, and video.
Overage pricing (all paid plans): text $0.0001/unit · image $0.001/unit · video $0.01/unit, prorated and billed monthly.
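As a worked example of the overage rates, 50,000 extra text units, 2,000 extra image units, and 100 extra video units would come to $5.00 + $2.00 + $1.00 = $8.00. The sketch below does the same arithmetic, assuming a "unit" means one decision beyond the plan's monthly allowance and ignoring proration, neither of which the page spells out.

```python
from decimal import Decimal

# Per-unit overage rates in USD, from the pricing line above.
RATES = {
    "text": Decimal("0.0001"),
    "image": Decimal("0.001"),
    "video": Decimal("0.01"),
}


def overage_cost(extra_units: dict) -> Decimal:
    """Sum rate * count over the content types that went past the plan limit."""
    return sum(
        (RATES[kind] * count for kind, count in extra_units.items()),
        Decimal("0"),
    )


total = overage_cost({"text": 50_000, "image": 2_000, "video": 100})
print(f"${total:.2f}")
```

`Decimal` is used instead of floats so that small per-unit rates add up without rounding drift.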