Scheduler
The Scheduler is Honeyframe's single surface for time-driven automation: dataset refreshes, recipe builds, AI-agent runs, and email reports. It replaces three separate v0.0.37 surfaces (pipeline_schedules, report_schedules, and the per-project cron table), which v0.0.38 consolidated into one.
Concepts
A schedule binds a trigger to a scenario. A scenario is a sequence of one or more steps.
| Term | Meaning |
|---|---|
| Trigger | When the scenario fires. Cron expression, interval, or "on upstream success". |
| Scenario | The unit of work the scheduler executes. One scenario can have many steps. |
| Step | A single action — run a recipe, sync a dataset, send a report, run a custom Python block. |
| Scenario template | A reusable scenario shape (steps + parameters) that any project in the org can instantiate. |
| Per-project enable matrix | Which scenarios are enabled for which project. A scenario can be defined org-wide and turned on for some projects but not others. |
Scenarios are configured under Project Settings → Schedules. Org-wide templates are configured under Admin → Scheduler templates.
Step types
| Step | What it does |
|---|---|
| `run_recipe` | Build one or more recipes via dbt. Equivalent to a Flow subgraph build. |
| `sync_dataset` | Trigger a connector → dataset sync. The same action as the dataset detail page Sync now button. |
| `run_python` | Execute a Python recipe. Standalone: it does not participate in the dbt build. |
| `run_agent` | Invoke an AI agent against a row stream. |
| `send_email_report` | Render a dashboard or set of dashboards as PDF/HTML and email to a recipient list. (v0.0.38; replaces the standalone `report_schedules` table.) |
Steps within a scenario run sequentially by default. Mark a step as `parallel: true` in the scenario config to fan out parallel branches; the scheduler waits for all parallel steps to complete before continuing to the next sequential step.
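The sequential/parallel ordering described above can be sketched in Python. This is an illustration only, not the scheduler's actual implementation: it assumes that a run of consecutive `parallel: true` steps forms one fan-out group, and `plan_stages`, `run_scenario`, and `run_step` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def plan_stages(steps):
    """Group steps into stages.

    Each sequential step becomes its own stage; each run of consecutive
    parallel steps becomes a single stage (assumed grouping rule).
    """
    stages = []
    for is_parallel, group in groupby(steps, key=lambda s: bool(s.get("parallel", False))):
        group = list(group)
        if is_parallel:
            stages.append(group)
        else:
            stages.extend([step] for step in group)
    return stages


def run_scenario(steps, run_step):
    """Run stages in order; a multi-step stage runs concurrently and is awaited."""
    results = []
    for stage in plan_stages(steps):
        if len(stage) == 1:
            results.append(run_step(stage[0]))
        else:
            # Leaving the `with` block waits for every branch to finish
            # before the next stage starts.
            with ThreadPoolExecutor(max_workers=len(stage)) as pool:
                results.extend(pool.map(run_step, stage))
    return results
```

Calling `plan_stages` on a sync step, two parallel steps, and a report step yields three stages, with the two parallel steps sharing the middle one.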
send_email_report (v0.0.38)
The reports surface used to live in its own table (report_schedules) with its own CRON parser, its own delivery worker, and its own audit log. v0.0.38 folded it into the scheduler:
- Definition. A `send_email_report` step in a scenario takes a `dashboard_ids` array, an optional `time_filter` override, a recipient list, and a delivery format (`pdf` or `html`).
- Rendering. The platform renders each dashboard via the same engine that powers the public-share path; the result is a PDF or inline HTML email body.
- Delivery. SMTP via the org's configured mail connector (see Connectors → Email). Failed delivery retries up to 3× with exponential backoff before logging an error.
- Filters. The `time_filter` override is applied before rendering, which is useful for "every Monday email last week's numbers" patterns. Without it, the dashboard's own default filter applies.
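The retry behavior in the Delivery item can be sketched as follows. The details are assumptions rather than documented behavior: one initial attempt followed by up to three retries, a 1-second base delay, and a doubling factor. `deliver_with_retry` and `send` are hypothetical names.

```python
import time


def deliver_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); on failure retry up to max_retries times, doubling the delay.

    Returns True on success, False once the retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        try:
            send()
            return True
        except Exception as exc:  # a real worker would catch SMTP errors only
            if attempt == max_retries:
                print(f"delivery failed after {max_retries} retries: {exc}")
                return False
            # Delays of base_delay * 1, 2, 4, ... between attempts.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function testable without real waiting.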
Existing report_schedules rows from pre-v0.0.38 installs are migrated automatically on first boot of the new binary. The migration creates one scenario per old row, each with a single send_email_report step.
Scenario templates
A template defines a scenario shape that can be instantiated per project. Use templates when:
- The same recipe build runs across many projects (e.g. nightly mart refresh).
- A reporting cadence is org-wide policy (e.g. weekly leadership digest).
- A new project should automatically inherit a baseline schedule.
Template parameters are referenced as {{ param.name }} in step configs. Instantiating the template prompts for parameter values.

    # Example template
    slug: nightly-mart-refresh
    display_name: Nightly mart refresh
    parameters:
      - name: mart_recipe
        type: recipe_ref
    trigger:
      cron: "0 2 * * *"   # 02:00 daily
    steps:
      - kind: run_recipe
        recipe: "{{ param.mart_recipe }}"
      - kind: send_email_report
        dashboard_ids: []   # populated per project
        recipients:
          - "{{ project.owner_email }}"

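To make the `{{ param.name }}` substitution concrete, here is a minimal Python sketch of how placeholders in step configs could be resolved at instantiation time. The platform's real template engine is not documented here; `render` and the shape of the `context` dict are assumptions for illustration.

```python
import re

# Matches {{ dotted.name }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(value, context):
    """Recursively substitute {{ dotted.name }} placeholders in strings, lists, and dicts."""
    if isinstance(value, str):
        def lookup(match):
            node = context
            for part in match.group(1).split("."):
                node = node[part]  # KeyError signals an undefined parameter
            return str(node)
        return PLACEHOLDER.sub(lookup, value)
    if isinstance(value, list):
        return [render(v, context) for v in value]
    if isinstance(value, dict):
        return {k: render(v, context) for k, v in value.items()}
    return value
```

With a context such as `{"param": {"mart_recipe": "sales_mart"}, "project": {"owner_email": "owner@example.com"}}`, the `recipe` and `recipients` fields of the example template resolve to concrete values, and a reference to an undefined parameter raises `KeyError` instead of passing through silently.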
Per-project enable matrix
Admin → Scheduler templates lists every template alongside a project-by-project on/off matrix. Toggling a cell enables or disables the template's scenario for that project. Two patterns:
- Default-on for new projects. Mark a template `default_enabled: true`; new projects inherit it on creation.
- Manual rollout. Leave `default_enabled: false` and flip the matrix per project as you roll out.
Disabling a template for a project pauses but does not delete the project's instantiated scenario; re-enabling resumes it without losing run history.
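Matrix cells can also be flipped from a script using the enable/disable endpoints listed in the API reference. The sketch below only builds the request; the base URL is a placeholder, authentication is not covered by this page and is omitted, and `toggle_request` is a hypothetical helper name.

```python
from urllib.parse import quote
from urllib.request import Request


def toggle_request(base_url, slug, project_id, enabled):
    """Build the POST request that enables or disables a template for one project."""
    action = "enable" if enabled else "disable"
    path = (
        f"/api/scheduler/templates/{quote(slug, safe='')}"
        f"/{action}/{quote(str(project_id), safe='')}"
    )
    return Request(base_url.rstrip("/") + path, method="POST")
```

Pass the returned object to `urllib.request.urlopen` (after adding whatever auth headers your deployment requires) to send it.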
Schedule of record
The previous pipeline_schedules table is gone (v0.0.38). Any code or external integration that still references it will fail. The platform itself aborts at startup with a schema-drift error if the dropped tables still exist: the startup check (fix(scheduler): schema drift check at startup) explicitly looks for them, since their presence indicates an incomplete upgrade.
If you have downstream tooling reading the old tables, point it at the new APIs:
| Old | New |
|---|---|
| `SELECT … FROM pipeline_schedules` | `GET /api/scheduler/scenarios` |
| `SELECT … FROM report_schedules` | `GET /api/scheduler/scenarios?step_kind=send_email_report` |
| `INSERT INTO pipeline_schedules …` | `POST /api/scheduler/scenarios` |
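As a rough starting point for porting tooling that used to read the old tables, the following Python sketch builds the replacement `GET` URLs, including the `step_kind` filter from the table above. It assumes the endpoint returns JSON and leaves out authentication, neither of which is specified here; the function names and the base URL are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def scenarios_url(base_url, step_kind=None):
    """Build the scenarios listing URL, optionally filtered by step kind."""
    url = base_url.rstrip("/") + "/api/scheduler/scenarios"
    if step_kind is not None:
        url += "?" + urlencode({"step_kind": step_kind})
    return url


def list_scenarios(base_url, step_kind=None):
    """Fetch scenarios; assumes a JSON response body (shape not documented here)."""
    with urlopen(scenarios_url(base_url, step_kind)) as resp:
        return json.load(resp)
```

A former `report_schedules` reader would call `list_scenarios(base_url, step_kind="send_email_report")`.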
Heartbeat and observability
The scheduler runs as a separate worker process (hub-platform-scheduler.service) that heartbeats to /api/pipeline/scheduler/health every 30 seconds. The heartbeat updates a scheduler_workers row with last_seen_at, worker_id, and current_scenario_id.
Health probes:
- `/api/pipeline/scheduler/health` returns `{"alive": true, "last_seen": "..."}` if the worker has heartbeated within 90s, else `503`.
- The Operations dashboard's Scheduler tile colors green / amber / red on heartbeat freshness.
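The 90-second freshness rule can be expressed as a small pure function. This Python sketch mirrors the documented response, but the inclusive boundary, the empty 503 body, and the name `health_response` are assumptions.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(seconds=90)


def health_response(last_seen_at, now=None):
    """Return (status_code, body) for a worker's last heartbeat timestamp.

    last_seen_at is a timezone-aware datetime, or None if no heartbeat exists.
    """
    now = now or datetime.now(timezone.utc)
    if last_seen_at is not None and now - last_seen_at <= STALE_AFTER:
        return 200, {"alive": True, "last_seen": last_seen_at.isoformat()}
    # The 503 body is not documented, so none is returned here.
    return 503, None
```

An external monitor can apply the same comparison to the `last_seen` value it receives to alert slightly before the endpoint itself flips to `503`.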
Common failure modes:
- Heartbeat stale, worker process running: Worker is alive but couldn't write to `scheduler_workers`. Usually a DB lock; check `pg_stat_activity` for long-running transactions.
- `trigger_config` 500: A scenario's trigger config failed validation at runtime (cron string parse error, undefined parameter). Fixed in v0.0.38; if you see it on v0.0.38+, the offending config is in the response body.
- Silent FE failure: Pre-v0.0.38, the frontend swallowed scheduler errors in a global try/catch. Removed in v0.0.38; the UI now surfaces errors in the scenario detail panel.
API reference
| Endpoint | Description |
|---|---|
| `GET /api/scheduler/scenarios` | List scenarios for the current project. |
| `POST /api/scheduler/scenarios` | Create a scenario. |
| `GET /api/scheduler/scenarios/{id}` | Scenario config + last 100 runs. |
| `PATCH /api/scheduler/scenarios/{id}` | Update trigger, steps, or recipients. |
| `DELETE /api/scheduler/scenarios/{id}` | Remove. Run history is preserved. |
| `POST /api/scheduler/scenarios/{id}/run` | Trigger an out-of-band run, ignoring the scenario's trigger config. |
| `GET /api/scheduler/scenarios/{id}/runs` | Paginated run history. |
| `GET /api/scheduler/templates` | List org-wide templates. |
| `POST /api/scheduler/templates` | Create a template (admin). |
| `POST /api/scheduler/templates/{slug}/enable/{project_id}` | Enable for a project. |
| `POST /api/scheduler/templates/{slug}/disable/{project_id}` | Disable for a project. |
| `GET /api/pipeline/scheduler/health` | Worker heartbeat. |
Gotchas
- Schema drift hard-fails startup. If the upgrade left `pipeline_schedules` or `report_schedules` behind, the platform refuses to boot. Drop them with the migration script (`paas/scripts/migrations/drop_pipeline_schedules.sql`) before retrying.
- Cron timezone is UTC by default. Override per scenario via `trigger.timezone: "Asia/Jakarta"` or set the org default in Project Settings → Defaults.
- `send_email_report` does not honor dashboard public-share permissions. It renders as the scenario's owner: if the owner can't see a card, the email won't include it. Use a service-user owner for reports that span dashboards a single human user couldn't access.
- Parallel steps run inside the same DB transaction by default. Long-running parallel steps can pile up locks; mark them `transactional: false` to commit per step.
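The UTC default is easy to misjudge, so here is a short Python check of what a `0 2 * * *` trigger means under each setting. It uses a fixed UTC+7 offset for Asia/Jakarta (which observes no daylight saving time) to stay dependency-free; `zoneinfo.ZoneInfo("Asia/Jakarta")` would be the general-purpose choice. The helper name `to_utc` is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Asia/Jakarta (UTC+7, no DST).
JAKARTA = timezone(timedelta(hours=7), "WIB")


def to_utc(year, month, day, hour, minute, tz):
    """Interpret a wall-clock time in tz and convert it to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=tz)
    return local.astimezone(timezone.utc)


# With trigger.timezone set to Asia/Jakarta, 02:00 local is 19:00 UTC the day before.
with_override = to_utc(2024, 3, 5, 2, 0, JAKARTA)

# With the UTC default, the same cron fires at 09:00 Jakarta time.
default_local = to_utc(2024, 3, 5, 2, 0, timezone.utc).astimezone(JAKARTA)
```

In other words, a "02:00 nightly" job left on the default runs mid-morning for a Jakarta-based team.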