Version: v0.1.8

v0.0.76 — Self-healing dashboards + G11 + G22

Released: 2026-05-18. Twenty-five commits.

Two big tracks: a self-healing dashboard loop (rule-based Diagnose + LLM narrative + LLM suggester + cross-card aggregator + autofix telemetry) and two new card types that close the last gaps in the card vocabulary (G11 dot strip, G22 scorecard heatmap).

Dashboards — self-healing loop

A new Diagnose affordance lands in dashboards: a deterministic rule engine surfaces broken/misconfigured cards, LLM layers explain and suggest, and a cross-card aggregator collects findings across a whole dashboard.

Slice 1 — Rule-based card diagnose

services/card_diagnose.py ships four rules covering the bug classes the team just hit:

  • MOCK_STUCK_WITH_REAL_SQL: card_config.data_source='mock' + SQL has a FROM clause → fix = PATCH stripping mock fields.
  • UNBOUND_PARAM_WITH_TOOLBAR_MATCH: {{name}} not in config defaults + toolbar column matches the <col>_start/_end convention → fix = one-click wire-filter call with toolbar defaults.
  • UNBOUND_PARAM_NO_MATCH: {{name}} unbound, no toolbar match → surface the names + how-to hint, no auto-fix.
  • FILTER_SKIPPED_REWIRABLE: toolbar column not referenced in SQL and not parameterized → fix = wire-filter offer.
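
As a rough illustration of how rules like these can stay deterministic, the sketch below implements three of the four checks in plain Python. The function name diagnose_card, the parameters config key holding the defaults, and the severities are assumptions made for the example; they are not taken from services/card_diagnose.py.

```python
import re

_FROM_RE = re.compile(r"\bFROM\b", re.IGNORECASE)
_PARAM_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def _toolbar_column_for(param, toolbar_columns):
    """Return the toolbar column a <col>_start/_end parameter maps to, if any."""
    for suffix in ("_start", "_end"):
        if param.endswith(suffix) and param[: -len(suffix)] in toolbar_columns:
            return param[: -len(suffix)]
    return None


def diagnose_card(sql, card_config, toolbar_columns):
    """Hypothetical rule pass: pure function of its inputs, no I/O, no mutation."""
    findings = []

    # MOCK_STUCK_WITH_REAL_SQL: card still flagged as mock but the SQL reads a table.
    if card_config.get("data_source") == "mock" and _FROM_RE.search(sql):
        findings.append({"code": "MOCK_STUCK_WITH_REAL_SQL", "severity": "error"})

    # Unbound {{name}} tokens: split by whether a toolbar column matches.
    defaults = card_config.get("parameters", {})  # assumed config key
    unbound = sorted({p for p in _PARAM_RE.findall(sql) if p not in defaults})
    matched = [p for p in unbound if _toolbar_column_for(p, toolbar_columns)]
    unmatched = [p for p in unbound if p not in matched]
    if matched:
        findings.append(
            {"code": "UNBOUND_PARAM_WITH_TOOLBAR_MATCH", "severity": "warning", "params": matched}
        )
    if unmatched:
        findings.append(
            {"code": "UNBOUND_PARAM_NO_MATCH", "severity": "warning", "params": unmatched}
        )
    return findings
```

Because the function only inspects its arguments, the same card and toolbar state always produce the same findings.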

POST /api/dashboards/{did}/cards/{cid}/diagnose is read-only. Optional toolbar_state body. Returns {findings: [{code, severity, title, detail, fix?}]} sorted error → warning → info. Each fix payload describes a one-click POST / PATCH to an existing endpoint; the diagnose endpoint itself never mutates state.
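
The response ordering can be sketched with a stable sort over a severity rank. The field names follow the shape above; the contents of the fix object (method, path) and the severity assigned to each code are hypothetical placeholders, since the notes do not spell them out.

```python
SEVERITY_ORDER = {"error": 0, "warning": 1, "info": 2}


def sort_findings(findings):
    """Order findings error -> warning -> info; ties keep their original order."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)),
    )


example_response = {
    "findings": sort_findings(
        [
            {
                "code": "FILTER_SKIPPED_REWIRABLE",
                "severity": "info",  # assumed severity
                "title": "<title>",
                "detail": "<detail>",
            },
            {
                "code": "MOCK_STUCK_WITH_REAL_SQL",
                "severity": "error",  # assumed severity
                "title": "<title>",
                "detail": "<detail>",
                # Hypothetical fix descriptor: the client replays it against an
                # existing endpoint; the diagnose call itself changes nothing.
                "fix": {"method": "PATCH", "path": "/api/dashboards/{did}/cards/{cid}"},
            },
        ]
    )
}
```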

Frontend (DashboardDetailPage): the per-card menu gains a Diagnose item between "Ask AI to Modify" and the layout-editing items. The drawer renders severity-styled cards (error red / warning amber / info sky), each with an Apply fix button. While one fix is in flight, every Apply fix button in the panel is disabled; once the fix lands, the drawer re-diagnoses. Empty state shows a green "no issues detected" card.

Slice 2a — LLM narrative wrapper

services/card_explain.py builds a deterministic system + user prompt (truncating huge SQL / card_config to 1.2KB / 600B caps). POST /dashboards/{did}/cards/{cid}/diagnose/explain runs gpt-4o-mini at temperature 0.2 with the per-org BYO key. The default language is id (Bahasa Indonesia); en is supported explicitly.
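
A byte cap of this kind has to avoid cutting a multi-byte character in half. The helper below is a minimal sketch of one way to do that. The function name, the truncation marker, reading 1.2KB as 1200 bytes, and pairing the caps positionally (SQL 1.2KB, card_config 600B) are all assumptions.

```python
SQL_CAP_BYTES = 1200     # assumed reading of "1.2KB"
CONFIG_CAP_BYTES = 600   # assumed to apply to the serialized card_config
_MARKER = "...[truncated]"


def truncate_utf8(text, max_bytes, marker=_MARKER):
    """Cap text at max_bytes of UTF-8, never splitting a multi-byte character.

    Assumes max_bytes is larger than the marker itself.
    """
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    budget = max(max_bytes - len(marker.encode("utf-8")), 0)
    # errors="ignore" drops a trailing partial character left by the byte slice.
    return raw[:budget].decode("utf-8", errors="ignore") + marker
```

Truncating the same input always yields the same output, which keeps the prompt deterministic for a given card.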

The Diagnose drawer gains a sky/indigo Plain-language explanation panel above the findings list. Two language buttons (ID / EN); click either to fetch. LLM failures surface as HTTP 502 — the drawer stays usable even if the LLM is down.

Slice 2b — LLM novel-case suggester

When the deterministic rule engine returns clean but the card might still be broken (typo, opaque Postgres error, suspicious JOIN), an LLM is asked for short text suggestions. The operator reads them and fixes the card manually via the existing Edit Card drawer; slice 2b deliberately does not generate fix payloads. This preserves the invariant: rules + fixes are deterministic, the LLM is presentation only.

services/card_suggest.py has a tolerant JSON parser (strips ``` fences, extracts the first {...} block, drops items missing {message}, hard-caps at 8 items, truncates each message at 600 chars). POST /dashboards/{did}/cards/{cid}/diagnose/suggest accepts recent_error + toolbar_state + language. recent_error comes from the FE's cardResults[id].error, so the LLM can translate a raw error such as relation "marts.fact_x" does not exist into an operator-friendly hint.
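
The parsing steps listed above can be sketched as follows. This is not the code in services/card_suggest.py: the function name and the top-level suggestions key are assumptions, and only the caps (8 items, 600 characters) come from the notes.

```python
import json
import re

MAX_ITEMS = 8
MAX_MESSAGE_CHARS = 600
_FENCE = "`" * 3  # a Markdown code fence, built this way to keep the sketch readable
_FENCE_RE = re.compile(re.escape(_FENCE) + r"[a-zA-Z]*")


def parse_suggestions(raw):
    """Best-effort parse of an LLM reply; returns [] instead of raising."""
    text = _FENCE_RE.sub("", raw)   # strip fences and their language tags
    start = text.find("{")          # locate the first {...} block
    if start == -1:
        return []
    try:
        # raw_decode parses one JSON value and ignores any trailing chatter.
        obj, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return []
    items = obj.get("suggestions") if isinstance(obj, dict) else None  # assumed key
    if not isinstance(items, list):
        return []

    cleaned = []
    for item in items:
        message = item.get("message") if isinstance(item, dict) else None
        if not isinstance(message, str) or not message.strip():
            continue                # drop items missing a usable message
        cleaned.append({**item, "message": message[:MAX_MESSAGE_CHARS]})
        if len(cleaned) == MAX_ITEMS:
            break                   # hard cap
    return cleaned
```

Returning an empty list on any parse failure keeps a malformed reply from surfacing as an error in the drawer.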

Frontend: a purple Look harder with AI panel below the narrative panel. Empty-array reply ("nothing else caught the AI's eye") renders explicitly — distinguishes "AI ran and found nothing" from "AI never ran".

Slice 4 — Cross-card aggregator

POST /api/dashboards/{did}/diagnose runs the rule engine across every card on a dashboard and returns grouped results so the operator can tackle systemic issues in one pass instead of opening N drawers. Returns by_card, by_code (pivot the other way so the FE can show "MOCK_STUCK affects 3 cards"), and totals. Serial loop (rule engine is sub-millisecond per card). 200-card cap; single telemetry row per call, not N rows.
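
The two-way grouping can be sketched as below, taking per-card findings that have already been computed. The inner field names (card_ids, worst_severity), the shape of totals, and rejecting dashboards over the cap rather than truncating them are assumptions for the example.

```python
MAX_CARDS = 200
_SEVERITY_RANK = {"error": 0, "warning": 1, "info": 2}


def aggregate(findings_by_card):
    """Group per-card findings by card and by finding code, with severity totals."""
    if len(findings_by_card) > MAX_CARDS:
        # Assumed behaviour; the real endpoint may truncate instead.
        raise ValueError(f"dashboard exceeds the {MAX_CARDS}-card cap")

    by_code = {}
    totals = {"error": 0, "warning": 0, "info": 0}
    for card_id, findings in findings_by_card.items():  # serial loop
        for finding in findings:
            entry = by_code.setdefault(
                finding["code"], {"card_ids": [], "worst_severity": finding["severity"]}
            )
            if card_id not in entry["card_ids"]:
                entry["card_ids"].append(card_id)
            if _SEVERITY_RANK[finding["severity"]] < _SEVERITY_RANK[entry["worst_severity"]]:
                entry["worst_severity"] = finding["severity"]
            totals[finding["severity"]] += 1
    return {"by_card": findings_by_card, "by_code": by_code, "totals": totals}
```

With this shape, len(by_code[code]["card_ids"]) gives the "N cards affected" count directly.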

Frontend slice 4b adds a Diagnose dashboard toolbar button between the active-filter pill and History. The drawer renders one collapsible row per code with worst-severity badge + "N cards affected" pill; expanding reveals the affected cards list with click-through into the per-card drawer.

Slice 5 — Telemetry

Slice 5a writes one row to honeyframe.agent_tool_executions per /diagnose / /explain / /suggest call. PII guard: telemetry rows never contain card SQL, card_config payload, recent_error content, or narrative / suggestion text. Failure isolation: telemetry write failures never block the operator.

Slice 5b extends telemetry to autofix — every fix the operator actually applied via the Diagnose drawer's Apply button logs a card_autofix_patch (PATCH path) or card_autofix_wire (wire-filter path) row, source-tagged diagnose_drawer:<finding_code>. Regular editor saves stay quiet (critical for audit log size); future channels (agent_orchestrator etc.) don't pollute the diagnose telemetry bucket.
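
The three telemetry properties described in this section (source tagging, the PII guard, failure isolation) can be sketched in a few lines. The column names, the deny-list keys, and the write_row callback are hypothetical; only the row kinds and the diagnose_drawer:<finding_code> tag format come from the notes.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed deny-list mirroring the PII guard: these keys never reach a telemetry row.
_FORBIDDEN_KEYS = {"sql", "card_config", "recent_error", "narrative", "suggestions"}
_ROW_KIND = {"patch": "card_autofix_patch", "wire": "card_autofix_wire"}


def build_autofix_row(path_kind, finding_code, card_id):
    """Build a telemetry row for a fix applied from the Diagnose drawer."""
    return {
        "tool_name": _ROW_KIND[path_kind],            # assumed column name
        "source": f"diagnose_drawer:{finding_code}",
        "card_id": card_id,
    }


def record_safely(write_row, row):
    """Strip forbidden keys, then write; a failed write is logged, never raised."""
    safe_row = {k: v for k, v in row.items() if k not in _FORBIDDEN_KEYS}
    try:
        write_row(safe_row)
    except Exception:
        logger.warning("telemetry write failed", exc_info=True)
```

Swallowing the exception here is deliberate: telemetry is an observer of the fix, so it should not be able to fail it.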

Together, 5a + 5b answer: "of the N findings surfaced this week, K were applied; of those K, M were MOCK_STUCK fixes; 4 of those reverted within an hour." Required before any silent auto-apply path ships.

Card types — G11 dot strip + G22 scorecard heatmap

First new card types since v0.0.69. Closes the last two gaps in the card-type vocabulary.

G11 — dot_strip

Categorical-Y dot strip with per-row thresholds. New dispatch branch after scatter, renders via ScatterChart with:

  • Categorical Y axis (one tick per category).
  • Deterministic per-entity jitter (FNV-1a hash → stable y-offset so dots don't move between renders).
  • Per-row threshold markers via ReferenceLine segment.
  • Three-state color (ok / breach / unavailable) with config override.
  • Threshold precedence: per-row column > per-category map > global.
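
Two of these behaviours are easy to show in isolation: hashing an entity key to a stable offset, and resolving a threshold by precedence. The card itself renders in the frontend, so the Python below is only an illustration; the function names, the jitter amplitude, and the mapping from hash to offset are assumptions. The FNV-1a constants are the standard 32-bit ones.

```python
FNV_OFFSET_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data):
    """Standard 32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def jitter_offset(entity_key, amplitude=0.3):
    """Map an entity key to a stable offset in [-amplitude, +amplitude]."""
    unit = fnv1a_32(entity_key.encode("utf-8")) / 0xFFFFFFFF  # 0.0 .. 1.0
    return (unit * 2.0 - 1.0) * amplitude


def resolve_threshold(row, category, threshold_column=None, thresholds=None,
                      global_threshold=None):
    """Precedence: per-row column > per-category map > global."""
    if threshold_column is not None and row.get(threshold_column) is not None:
        return row[threshold_column]
    if thresholds and category in thresholds:
        return thresholds[category]
    return global_threshold
```

Since the offset depends only on the entity key, a dot keeps its vertical position across re-renders and data refreshes.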

Config schema: dot_strip.{category_column, x_column, entity_key_column, entity_label_column, threshold_column, availability_column, status_column, direction}, plus thresholds, global_threshold, density.{jitter, alpha}, colors.{ok, breach, unavailable}.

G22 — scorecard_heatmap

Wide HTML table with per-cell RAG bg/fg colors derived from per-column threshold rules. Threshold DSL supports basis=value|attainment|basis_column + direction=higher_is_better|lower_is_better + green/amber/red bounds. status_column override for non-monotonic rules. header_count shows the red-status branch count per column header. Sticky entity column + subtitle row + alternating row bg + configurable status_palette. value_format DSL: decimal_2 / percent_already_0 / etc.
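
A per-column rule of this kind might be evaluated as sketched below for basis=value. The exact bound semantics are not given in these notes, so the sketch assumes two cut points (green and amber, with red as the remainder) and inclusive comparisons; the function names are hypothetical.

```python
def rag_status(value, rule):
    """Classify a value as green / amber / red from a per-column threshold rule."""
    green, amber = rule["green"], rule["amber"]
    if rule.get("direction", "higher_is_better") == "higher_is_better":
        if value >= green:
            return "green"
        return "amber" if value >= amber else "red"
    # lower_is_better: the comparisons flip.
    if value <= green:
        return "green"
    return "amber" if value <= amber else "red"


def header_count(rows, column, rule):
    """Number of rows whose cell in this column evaluates to red."""
    return sum(1 for row in rows if rag_status(row[column], rule) == "red")
```

A status_column override would bypass rag_status entirely for that cell, which is what makes non-monotonic rules expressible.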

Both types are added to VALID_CARD_TYPES + VALID_AI_CARD_TYPES and AI_DASHBOARD_PROMPT so the AI dashboard generator can pick them when appropriate.

Wire-filter — bug 4 (double-wrap)

The team reported a fourth variant of the wire-filter cluster after v0.0.75: after wiring a card with {{appointment_date_start/end}}, the toolbar's date filter still wrapped the SQL as SELECT * FROM (... WHERE appointment_date BETWEEN :p_..._start AND :p_..._end) filtered_q WHERE appointment_date BETWEEN $1 AND $2. The inner COUNT(*) doesn't expose appointment_date, so Postgres raised UndefinedColumnError.

Root cause: sql_binds_column_via_param regex only matched raw {{name}} tokens, but _apply_parameters rewrites them to :p_<name> before _filter_column_in_sql runs. Fix: extend _PARAM_BIND_CONTEXT to match both {{name}} and :p_<name> forms. A PARAM_BIND_PREFIX constant is the new single source of truth across wire_filter.py and _apply_parameters, pinned by TestParamBindPrefixContract.
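
The idea behind the fix can be sketched with a pattern that accepts both spellings of a bound parameter. The function signature and the restriction to _start/_end suffixes are assumptions for the example, and the pattern here is not the actual _PARAM_BIND_CONTEXT expression.

```python
import re

PARAM_BIND_PREFIX = ":p_"  # the rewritten form shown in the wrapped SQL above


def sql_binds_column_via_param(sql, column):
    """True if the SQL already binds <column>_start/_end in either form."""
    name = re.escape(column) + r"_(?:start|end)"
    raw_form = r"\{\{\s*" + name + r"\s*\}\}"                 # {{col_start}}
    rewritten_form = re.escape(PARAM_BIND_PREFIX) + name + r"\b"  # :p_col_start
    return re.search(raw_form + "|" + rewritten_form, sql) is not None


wired = "SELECT COUNT(*) FROM appts WHERE appointment_date BETWEEN :p_appointment_date_start AND :p_appointment_date_end"
assert sql_binds_column_via_param(wired, "appointment_date")
```

When the check returns True, the caller can skip the outer filtered_q wrap, which is what avoids the double filter.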

KPI hero chrome

Hero KPIs get richer chrome: delta chip + compact height + % suffix.