⚡ MCP & REST · built for agents

Your coding agent orders a real human test — and gets clean data back.

SeenByHuman is built first for AI agents (Cursor, Claude Code, Lovable, Bolt). After your agent ships an app, it calls one tool: a real human who is native to the target market runs the app and returns a structured verdict your agent can act on. Humans can use the catalog too, but the agent path is the headline.

Why an agent needs this

Automated checks tell you the button fires. They can't tell you the German paywall copy reads like a scam, the onboarding feels unfinished, or the flow loses a real person. That human judgment — especially across languages and markets — is exactly what an agent can't self-supply. So it outsources it, cheaply, in one call.

Fixed price up front

The agent knows the cost before ordering. No quotes, no negotiation.

Structured back

Verdict, issues by severity, and repro steps, each anchored to recording timestamps (an anti-hallucination check).

No report, no pay

Money sits in escrow; released only when a real report (recording) is delivered.

Set a budget — we allocate the humans

You don't pick people. You set a budget, a task type and requirements; the network allocates the right number of humans for it.

Budget	Example allocation
$100	20 testers × $5 · or 10 × $10 · or 4 verified × $25
$100	100 people × $1 — one opinion / survey answer each
$5	1 quick run (5–15 min) or a handful of micro-tasks
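
The allocations in the table are plain integer division: headcount is the budget divided by the per-person price. A minimal sketch of that arithmetic, assuming a hypothetical `headcount` helper (it is not part of the SeenByHuman contract) and working in whole cents so that prices like $0.50 divide exactly:

```python
def headcount(budget_usd: float, price_per_person_usd: float) -> int:
    """How many people a budget covers at a fixed per-person price."""
    if budget_usd < 0 or price_per_person_usd <= 0:
        raise ValueError("budget must be >= 0 and price must be > 0")
    # Convert to whole cents to avoid floating-point rounding surprises.
    budget_cents = round(budget_usd * 100)
    price_cents = round(price_per_person_usd * 100)
    return budget_cents // price_cents


# The $100 rows from the table above.
for price in (5, 10, 25, 1):
    print(f"$100 at ${price} each -> {headcount(100, price)} people")
```

The real allocator presumably also weighs task type and requirements; this only reproduces the headline numbers.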

⚡ Starts in minutes

Matched humans get an instant push notification, accept, and begin within minutes, not days.

🤖 Agent-rated

Your agent rates each report's quality (completeness, accuracy, evidence). That score drives the tester ranking, with no human in the loop.
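
One way an agent might fold the three criteria into a single score is sketched below. The `ReportRating` class, the 0.0 to 1.0 scale, and the unweighted mean are all assumptions for illustration; this page does not specify the actual rating formula.

```python
from dataclasses import dataclass


@dataclass
class ReportRating:
    # Hypothetical 0.0-1.0 sub-scores; the real scale is not specified here.
    completeness: float
    accuracy: float
    evidence: float

    def score(self) -> float:
        """Unweighted mean of the three criteria (an assumed formula)."""
        parts = (self.completeness, self.accuracy, self.evidence)
        if not all(0.0 <= p <= 1.0 for p in parts):
            raise ValueError("sub-scores must be within 0.0-1.0")
        return sum(parts) / len(parts)


rating = ReportRating(completeness=1.0, accuracy=0.5, evidence=0.75)
print(rating.score())  # 0.75
```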

🛑 Auto-exit on crash

If the app won't launch, the tester submits the crash screen and the job auto-closes. The tester gets a small partial fee, and the rest of your budget is returned.
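
The money flow described on this page (escrow released on a delivered report, a partial fee on a crash) can be sketched as a small settlement function. Everything named here is hypothetical: the `Outcome` enum, the `settle` function, and the crash-fee amount are illustrative, and the full refund when no report arrives is an assumption read into "No report, no pay".

```python
from enum import Enum


class Outcome(Enum):
    REPORT_DELIVERED = "report_delivered"
    NO_REPORT = "no_report"
    APP_CRASHED = "app_crashed"


def settle(budget_cents: int, outcome: Outcome, crash_fee_cents: int = 0) -> tuple[int, int]:
    """Return (tester_payout_cents, buyer_refund_cents) for one escrowed job."""
    if budget_cents < 0 or not 0 <= crash_fee_cents <= budget_cents:
        raise ValueError("amounts must satisfy 0 <= crash fee <= budget")
    if outcome is Outcome.REPORT_DELIVERED:
        payout = budget_cents      # escrow released in full
    elif outcome is Outcome.APP_CRASHED:
        payout = crash_fee_cents   # small partial fee (amount is assumed)
    else:
        payout = 0                 # no report, no pay
    return payout, budget_cents - payout


# Hypothetical numbers: a $12.00 job with an assumed $1.00 crash fee.
print(settle(1200, Outcome.APP_CRASHED, crash_fee_cents=100))  # (100, 1100)
```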

It's not only app testing. Any task that needs a human eye qualifies: UX A/B (which screen/onboarding/icon wins), opinions & surveys (name, color, "would you pay?"), content checks (does this read naturally / not like AI), local checks (does it make sense in PL/DE/IT), accessibility, game feel.

The one tool

Conceptual MCP tool shape (contract preview — see links below):

# one step inside a vibe-coding session
tool seenbyhuman.order_test(
  url="https://my-app.lovable.app",
  check="German paywall — does the copy sound trustworthy?",
  market="DE",            # native human from that market
  budget=12,              # USD, fixed
  deliverable="recording+json"
)

# → structured verdict comes back
{
  "verdict": 7,
  "issues": [
    { "severity": "critical",
      "what": "paywall copy reads as scam (DE)",
      "evidence": "recording@0:41" }
  ],
  "recording_url": "https://…",
  "tester": { "market":"DE", "rating":4.8 }
}
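
On the agent side, acting on the verdict could look like the sketch below. It parses the response preview above, picks out critical issues, and turns the `recording@M:SS` evidence string into seconds so the agent can jump to the right moment. The helper names `evidence_seconds` and `blocking_issues` are hypothetical, and the evidence format is assumed from the single example shown; the published contract may differ.

```python
import json


def evidence_seconds(evidence: str) -> int:
    """Convert 'recording@M:SS' into seconds from the start of the recording."""
    _, _, stamp = evidence.partition("@")
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def blocking_issues(report: dict) -> list[dict]:
    """Issues the agent should fix before shipping (here: critical only)."""
    return [issue for issue in report["issues"] if issue["severity"] == "critical"]


# The response preview from above, verbatim (the URL is elided in the source).
raw = """
{
  "verdict": 7,
  "issues": [
    { "severity": "critical",
      "what": "paywall copy reads as scam (DE)",
      "evidence": "recording@0:41" }
  ],
  "recording_url": "https://…",
  "tester": { "market": "DE", "rating": 4.8 }
}
"""

report = json.loads(raw)
for issue in blocking_issues(report):
    print(f'{issue["what"]} at {evidence_seconds(issue["evidence"])}s')
```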

Test types an agent can request

Type	What the agent gets	Price from
Sanity check	One critical path, recording + 1–10 verdict	$8
Full run	Whole app, edge cases, UX opinion	$18
Localization sweep	N native humans, one per market — copy & trust	$39 (2 markets) / $99 (5 markets)
Update loop	4×45 min sessions, a human re-checks after each change	$20
Micro-task	Rate screenshots / 60-sec first impression / survey	$0.50–$1
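
Because prices are fixed up front, an agent can decide what it can afford before ordering anything. A minimal sketch, assuming the starting prices are hard-coded from the table above rather than fetched from any endpoint; the `STARTING_PRICE_USD` mapping and `affordable` helper are illustrative names, not part of the contract:

```python
# Starting prices in USD, copied from the table above (not fetched from any API).
STARTING_PRICE_USD = {
    "Sanity check": 8.00,
    "Full run": 18.00,
    "Localization sweep": 39.00,
    "Update loop": 20.00,
    "Micro-task": 0.50,
}


def affordable(budget_usd: float) -> list[str]:
    """Test types whose starting price fits the budget, cheapest first."""
    fits = [(price, name) for name, price in STARTING_PRICE_USD.items()
            if price <= budget_usd]
    return [name for _, name in sorted(fits)]


print(affordable(20))  # ['Micro-task', 'Sanity check', 'Full run', 'Update loop']
```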

Contract (early access)

Machine-readable definitions you can point an agent or client at:

→ openapi.json (REST shape) · mcp_manifest.json (MCP tool definitions)

⚠️ Early access: the public API/MCP endpoints are still being set up, so this page documents the contract and flow. To become a launch partner (one of the first agents or dev shops wired in), leave your email. Tester profiles shown in the catalog are illustrative while we onboard the first real founding testers.

Become a launch partner →