Reviewer Blueprints

Aligning your reviewers is the single biggest lever for review throughput and accuracy. This page collects practical recipes for shaping spans and model_output so that the Tuor review workspace renders them intuitively, even for non-technical reviewers.

Spans are provider-neutral, self-describing review units. Keep them in trajectory order and the rendering is automatic.

Reviewer context mapping

Reviewers need to see what your model saw and did alongside what your model produced. Each trace carries its own ordered span trajectory.

Messages and text

[
  {
    "type": "message",
    "role": "instruction",
    "content": [
      {"type": "text", "value": "You are a financial assistant. Always return JSON."}
    ]
  },
  {
    "type": "message",
    "role": "user",
    "content": [
      {"type": "text", "value": "Extract the net income from the attached statement."}
    ]
  }
]

Use instruction, user, and assistant roles. Do not omit an assistant message merely because the same answer also appears in model_output.
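The same spans can be assembled in code. Below is a minimal Python sketch; the helper name text_message is hypothetical (it is not part of any Tuor SDK), the assistant text is an illustrative value, and only the span shape is taken from the example above.

```python
import json

ALLOWED_ROLES = {"instruction", "user", "assistant"}


def text_message(role, text):
    """Build one message span holding a single text block (hypothetical helper)."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return {
        "type": "message",
        "role": role,
        "content": [{"type": "text", "value": text}],
    }


spans = [
    text_message("instruction", "You are a financial assistant. Always return JSON."),
    text_message("user", "Extract the net income from the attached statement."),
    # Keep the assistant turn even if model_output repeats the same answer.
    # The answer text here is an illustrative value.
    text_message("assistant", '{"net_income": 1250.45}'),
]

print(json.dumps(spans, indent=2))
```

Appending to the list in call order keeps the spans in trajectory order.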

Documents and images

Media is a content block with a base64 source inside a message; filenames are optional display metadata.

[
  {
    "type": "message",
    "role": "user",
    "content": [
      {"type": "text", "value": "Extract the invoice fields."},
      {
        "type": "document",
        "source": {
          "type": "base64",
          "media_type": "application/pdf",
          "filename": "acme-invoice-2026-05-15.pdf",
          "value": "JVBERi0xLjQKJ..."
        }
      }
    ]
  }
]

Images use the same shape with type: "image" and a supported image MIME type. Note the 5 MB per-field cap; downsample or paginate large media before ingest.
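A sketch of producing such a block from raw bytes in Python. The helper name media_block is hypothetical. It also assumes the 5 MB cap is 5 × 1024 × 1024 bytes and applies to the encoded value; check this against your account limits before relying on it.

```python
import base64

# Assumption: the per-field cap is measured on the base64-encoded value.
MAX_FIELD_BYTES = 5 * 1024 * 1024


def media_block(data, media_type, filename=None):
    """Wrap raw bytes as a document or image content block (hypothetical helper)."""
    block_type = "image" if media_type.startswith("image/") else "document"
    encoded = base64.b64encode(data).decode("ascii")
    if len(encoded) > MAX_FIELD_BYTES:
        raise ValueError("encoded media exceeds the per-field cap; downsample or paginate first")
    source = {"type": "base64", "media_type": media_type}
    if filename is not None:
        source["filename"] = filename  # optional display metadata
    source["value"] = encoded
    return {"type": block_type, "source": source}


# A nine-byte PDF header stands in for a real file read with open(path, "rb").
block = media_block(b"%PDF-1.4\n", "application/pdf", "acme-invoice-2026-05-15.pdf")
print(block["type"], block["source"]["value"])  # document JVBERi0xLjQK
```

Checking the size before ingest turns an oversized upload into a local error that you can handle by downsampling or paginating.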

Reasoning, tools, and unknown data

Use { "type": "reasoning", "content": "..." } for reviewable reasoning text.
Use tool_call spans with kind: "request" or kind: "response"; connect them with an optional caller-supplied tool_call_id.
Use unknown for arbitrary JSON that does not fit a known type. Tuor also applies this wrapper automatically to unsupported or malformed items at the smallest valid boundary.
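Because tool_call_id is caller-supplied, it can help to check the pairing before ingest. The Python sketch below reports response spans whose tool_call_id has no earlier request. The function name is hypothetical, and the name, arguments, and result keys on the sample spans are illustrative assumptions; only type, kind, tool_call_id, and the reasoning shape come from the text above.

```python
def unmatched_tool_responses(spans):
    """Return tool_call_ids of responses with no earlier request carrying the same id."""
    seen_requests = set()
    unmatched = []
    for span in spans:
        if span.get("type") != "tool_call":
            continue
        call_id = span.get("tool_call_id")
        if call_id is None:
            continue  # the id is optional, so unlinked spans are skipped
        if span.get("kind") == "request":
            seen_requests.add(call_id)
        elif span.get("kind") == "response" and call_id not in seen_requests:
            unmatched.append(call_id)
    return unmatched


spans = [
    {"type": "reasoning", "content": "The statement lists net income on page 2."},
    # "name", "arguments", and "result" are hypothetical keys used for illustration.
    {"type": "tool_call", "kind": "request", "tool_call_id": "call_1",
     "name": "lookup_rate", "arguments": {"currency": "EUR"}},
    {"type": "tool_call", "kind": "response", "tool_call_id": "call_1",
     "result": {"rate": 1.08}},
    {"type": "tool_call", "kind": "response", "tool_call_id": "call_2",
     "result": {}},
]

print(unmatched_tool_responses(spans))  # ['call_2']
```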

Task-specific payload blueprints

Every output is structured JSON rendered as an editable path form. Nested object leaves appear as dot-path rows, arrays remain editable JSON values, and each row's control is inferred from the value's type. Describe a path in the trace's output_schema to pin its control: enum renders a label picker, boolean a true/false toggle, and json keeps the value as one editable JSON block instead of expanding it into child rows. The recipes below are all shapes of that one model.

Structured extraction

Best for contracts, invoices, receipts, forms — anything where the model produces structured JSON. The reviewer sees editable field paths where they can correct nested values, add missing paths, or remove spurious ones.

// POST /v1/traces/ — model_output
{
  "invoice": {
    "vendor_name": "Acme Corp",
    "invoice_date": "2026-05-15",
    "total_amount": 1250.45,
    "tax_id": "12-345678"
  },
  "line_items": [
    {"description": "Consulting", "amount": 1250.45}
  ]
}

Use output_schema for enum paths inside the structure, such as invoice.status or review.sentiment. Dot paths refer to nested object keys.
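To preview which rows a payload will produce, the dot-path expansion can be approximated locally. This Python sketch follows the description on this page (nested objects expand, arrays stay whole); it is not Tuor's renderer, and flatten_paths is a hypothetical name.

```python
def flatten_paths(value, prefix=""):
    """Expand nested objects into dot-path rows; arrays and scalars stay single values."""
    if isinstance(value, dict) and value:
        rows = {}
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            rows.update(flatten_paths(child, path))
        return rows
    return {prefix: value}


model_output = {
    "invoice": {
        "vendor_name": "Acme Corp",
        "invoice_date": "2026-05-15",
        "total_amount": 1250.45,
        "tax_id": "12-345678",
    },
    "line_items": [{"description": "Consulting", "amount": 1250.45}],
}

for path, value in flatten_paths(model_output).items():
    print(path, "=", value)
```

The keys this prints (invoice.vendor_name, invoice.total_amount, and so on) are the paths to use as output_schema keys.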

Classification / labeled fields

Best for routing, triage, and labeling tasks. The predicted label is just a field in model_output; describe it in output_schema as an enum and the reviewer gets a label picker, pre-selected to the model's choice, with the valid options one click away.

// POST /v1/traces/
{
  "model_output": {
    "review": {
      "label": "billing_dispute",
      "confidence": 0.97
    }
  },
  "output_schema": {
    "review.label": {
      "type": "enum",
      "options": ["billing_dispute", "cancellation_request", "technical_support", "account_settings"]
    }
  }
}

The label value stays a normal field, so anything else the model emits (here, confidence) rides alongside it. Two flags tune the picker:

Flag	Default	Effect
`multi`	`false`	`true` lets the reviewer pick several labels; the value becomes a list.
`open`	`true`	`false` restricts the field to `options` (no write-in values).

With the default open: true, reviewers can always type a value beyond options — leave options empty for a pure freeform label.
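The open flag can be mirrored in a local pre-ingest check. The Python sketch below lists labels that fall outside options for a restricted (open: false) field. It reflects the flag semantics described above, not Tuor's own validation, and labels_outside_options is a hypothetical name.

```python
def labels_outside_options(value, field):
    """List labels not allowed by an enum field description; open fields allow anything."""
    if field.get("open", True):  # documented default: write-in values are allowed
        return []
    options = field.get("options", [])
    # A multi field holds a list; a single-label field holds one value.
    labels = value if isinstance(value, list) else [value]
    return [label for label in labels if label not in options]


closed_field = {
    "type": "enum",
    "open": False,
    "options": ["billing_dispute", "cancellation_request", "technical_support", "account_settings"],
}

print(labels_outside_options("billing_dispute", closed_field))  # []
print(labels_outside_options("refund", closed_field))           # ['refund']
```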

Boolean checks and raw JSON fields

For yes/no verdicts and nested payloads you don't want flattened, pin the control with a boolean or json field type. A boolean path renders as a true/false toggle (and rejects non-boolean values on ingest and correction); a json path keeps its whole value as one editable JSON block rather than exploding into child dot-path rows.

// POST /v1/traces/
{
  "model_output": {
    "passed_policy": true,
    "extracted_entities": {"organizations": ["Acme"], "amounts": [1250.45]}
  },
  "output_schema": {
    "passed_policy": {"type": "boolean"},
    "extracted_entities": {"type": "json"}
  }
}

Without a schema entry, controls are still inferred from each value's type — a boolean leaf gets a toggle, an array of primitives gets editable chips — so reach for explicit boolean/json types only when you want to lock that choice in.
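Since boolean paths reject non-boolean values on ingest, a local check can catch them before the request is sent. This is a minimal Python sketch with hypothetical helper names; it treats schema keys as dot paths into model_output, as described earlier on this page.

```python
def get_path(obj, path):
    """Follow a dot path through nested objects."""
    for key in path.split("."):
        obj = obj[key]
    return obj


def non_boolean_paths(model_output, output_schema):
    """Return boolean-typed schema paths whose current value is not true/false."""
    bad = []
    for path, field in output_schema.items():
        if field.get("type") != "boolean":
            continue
        try:
            value = get_path(model_output, path)
        except (KeyError, TypeError):
            continue  # missing paths are outside the scope of this sketch
        if not isinstance(value, bool):
            bad.append(path)
    return bad


output_schema = {"passed_policy": {"type": "boolean"}, "extracted_entities": {"type": "json"}}

print(non_boolean_paths({"passed_policy": True}, output_schema))   # []
print(non_boolean_paths({"passed_policy": "yes"}, output_schema))  # ['passed_policy']
```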

Generative copy / summarization

Best for long-form text outputs: summaries, copywriting, translations, rewrites. A long string value renders in a wide, resizable text area; when a reviewer edits it, the original value stays visible for reference.

// POST /v1/traces/ — model_output
{
  "content": "The customer requested a refund due to a defective software module. The refund was processed successfully under policy guideline A-12."
}

Reviewer ergonomics

A few patterns that keep reviewers fast:

Use message text blocks for text-heavy tasks instead of burying the prompt in arbitrary JSON.
Stamp model and prompt_version_id in trace_config. They're displayed inline in the reviewer's trace metadata strip and included in exports for traceback.
Keep model_output shapes stable per project. Reviewers build muscle memory for where each field is. Shape drift across traces in the same project slows them down significantly.
Use tags to specialize trace lists. If you have heterogeneous traffic (e.g. PDFs for finance reviewers, contracts for legal), tag accordingly and have each reviewer filter to their domain.
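Shape drift is easy to detect on your side before it reaches reviewers. The Python sketch below compares the leaf paths of a new model_output against a reference payload for the project. The function names are hypothetical, and arrays are treated as single leaves.

```python
def shape(value, prefix=""):
    """Collect the set of dot paths to leaves; arrays count as leaves."""
    if isinstance(value, dict) and value:
        paths = set()
        for key, child in value.items():
            paths |= shape(child, f"{prefix}.{key}" if prefix else key)
        return paths
    return {prefix}


def shape_drift(reference, candidate):
    """Report paths that disappeared from, or were added to, the reference shape."""
    ref, cand = shape(reference), shape(candidate)
    return {"missing": sorted(ref - cand), "extra": sorted(cand - ref)}


reference = {"invoice": {"vendor_name": "Acme Corp", "total_amount": 1250.45}, "line_items": []}
candidate = {"invoice": {"vendor": "Acme Corp", "total_amount": 1250.45}, "line_items": []}

print(shape_drift(reference, candidate))
# {'missing': ['invoice.vendor_name'], 'extra': ['invoice.vendor']}
```

Running a check like this in the ingest path, and logging or rejecting drifted payloads, is one way to keep a project's shape stable.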

Putting it all together: a worked example

Suppose you're building an invoice extraction pipeline. Each model call ingests a PDF, runs gpt-4o, and emits structured JSON with invoice-level fields and line items.

Project config

Configure the project's webhook URL and secret. Document content is declared by the trace itself, not by the project.

Ingest call

curl -X POST "https://api.tuor.dev/v1/traces/" \
  -H "X-API-Key: $TUOR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "proj_invoices",
    "spans": [{
      "type": "message",
      "role": "user",
      "content": [{
        "type": "document",
        "source": {
          "type": "base64",
          "media_type": "application/pdf",
          "filename": "acme-invoice-2026-05-15.pdf",
          "value": "JVBERi0xLjQK..."
        }
      }]
    }],
    "model_output": {
      "invoice": {
        "vendor_name": "Acme Corp",
        "invoice_date": "2026-05-15",
        "total_amount": 1250.45,
        "tax_id": "12-345678"
      },
      "line_items": [
        {"description": "Consulting", "amount": 1250.45}
      ]
    },
    "trace_config": {
      "model": "gpt-4o",
      "prompt_version_id": "prm_abc123",
      "internal_run_id": "run_01j2abcde"
    }
  }'
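The same request can be prepared from Python with only the standard library. This is a sketch rather than an official client: build_ingest_request is a hypothetical name, the payload is trimmed to a few fields, and the send step is left commented out so the snippet runs offline.

```python
import json
import os
import urllib.request


def build_ingest_request(payload, api_key):
    """Prepare (but do not send) the POST shown in the curl call."""
    return urllib.request.Request(
        "https://api.tuor.dev/v1/traces/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "project_id": "proj_invoices",
    "spans": [],  # fill with the message span carrying the PDF document block
    "model_output": {"invoice": {"vendor_name": "Acme Corp", "total_amount": 1250.45}},
    "trace_config": {"model": "gpt-4o", "prompt_version_id": "prm_abc123"},
}

request = build_ingest_request(payload, os.environ.get("TUOR_API_KEY", ""))
print(request.get_method(), request.full_url)

# To send it:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

Store whatever trace identifier the response returns next to your own row, since the correction webhook refers back to it.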

Reviewer experience

Left pane: the original PDF, scrollable.
Right pane: editable paths such as invoice.vendor_name, invoice.total_amount, and line_items.
Metadata: gpt-4o, prompt version prm_abc123, run ID run_01j2abcde.
Actions: approve, reject, or edit the output and submit the correction.

Webhook on correction

Your /tuor/invoices endpoint receives a signed review.corrected event. Look up the corresponding row by the event's trace_id (which you stored at ingest time) and update it with the corrected fields from final_output_after.
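A framework-free Python sketch of that handler follows. Two things in it are assumptions, not documented behavior: the signature is taken to be a hex HMAC-SHA256 of the raw request body, and the event name is read from a "type" key. Confirm both against the webhook reference. The in-memory rows dict stands in for your database.

```python
import hashlib
import hmac
import json


def verify_signature(secret, body, signature_hex):
    """Assumed scheme: hex HMAC-SHA256 over the raw request body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(rows, secret, body, signature_hex):
    """Return an HTTP status code after applying a review.corrected event to rows."""
    if not verify_signature(secret, body, signature_hex):
        return 401
    event = json.loads(body)
    if event.get("type") != "review.corrected":  # the "type" key is an assumption
        return 204
    trace_id = event["trace_id"]
    if trace_id not in rows:
        return 404
    rows[trace_id] = event["final_output_after"]
    return 200


# Local demonstration with a made-up secret and trace id.
secret = b"demo-secret"
rows = {"trace_123": {"invoice": {"total_amount": 1250.45}}}
body = json.dumps({
    "type": "review.corrected",
    "trace_id": "trace_123",
    "final_output_after": {"invoice": {"total_amount": 1520.45}},
}).encode("utf-8")
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

print(handle_webhook(rows, secret, body, signature))  # 200
```

The signature is verified against the raw bytes before the JSON is parsed, so a tampered body is rejected without being interpreted.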

That's the full loop.