v0.0.72 — Team Reviews + reviewer queue
Released: 2026-05-15. Five commits.
This release layers a team-review surface on top of the Agent Review subsystem from v0.0.70 — teams can now disagree with the LLM judge without overwriting its verdict, and a queue inbox surfaces what still needs attention.
Reviewer notes + per-trait thumbs (#19 first slice)
Schema migration add_agent_team_reviews.sql adds reviewer_notes / user_id / reviewed_at to agent_test_results plus a new agent_test_trait_votes table (UNIQUE per result/trait/user, flippable up/down).
Four endpoints under /api/agent-reviews:
PATCH .../results/{rid}— save reviewer notes, stamp the audit.POST/DELETE .../results/{rid}/traits/{name}/vote— flip a thumb.GET .../runs/{rid}/resultsextended with the votes aggregate + notes, all in one batched SQL (no N+1).
ResultDrawer gains a debounced-autosave notes textarea on the Details subtab and per-trait thumbs (optimistic update, rollback on error) on the Traits subtab.
Contested pill + vote counts
TraitsMatrix surfaces team disagreement at a glance:
- Contested pill renders when reviewer thumbs lean opposite the judge's verdict (
passwith majority 👎, orfailwith majority 👍). - When not contested, raw 👍 N / 👎 N counts show inline.
Drives "did v4 regress, or does the team agree it's fine?" scans without opening each result drawer.
Reviewer queue inbox
GET /api/agent-reviews/queue — org-scoped inbox of every result still needing attention. Needs review = reviewed_at IS NULL AND (overall_status in fail/error OR has a contested trait). Contested is computed inline via a CTE so the queue stays in one query.
New page at /agent-reviews/queue with All / Failures / Contested filter pills (live counts) and a Review queue button on the Reviews list page. Row click jumps to the review detail with run + result query params — AgentReviewPage reads ?run=X&result=Y on mount, selects that run (falls back to most-recent if the id isn't in the loaded list), jumps to the Results tab, and opens ResultDrawer on the matching result. Query params clear after consumption so re-renders don't re-trigger.
Closes the queue → drawer loop — reviewers click a queue row and land directly on the result they need to triage.