Version: v0.0.39

Honeyframe Cloud

Honeyframe Cloud is the managed tier — we run the infrastructure for you. Customers sign in at app.honeyframe.io, create a Space in seconds, and land on <slug>.app.honeyframe.io with their data, branding, and users in place. No VMs to provision, no certificates to renew, no setup-customer.sh to run.

This page documents the operator-facing surface of the Cloud tier: what's running where, how to read provisioning state, and what's automated versus invite-driven today. For the customer-facing flow ("how do I sign up?") see Deployment Tiers.

The Cloud tier is invite-driven in v0.0.39 — the public Launchpad signup is gated until Phase 4 (billing + automated wildcard cert renewal). The control plane and shared-tenant provisioner are shippable today.

Architecture

Three pieces, all inside our managed account:

Component	Where	Purpose
Launchpad UI	`controlplane/frontend/`, served from `app.honeyframe.io` (Vite build)	Customer-facing: signup, list Spaces, create new Space, view live provisioning events.
Control plane API	`hub-control-plane.service` (port 8004), `control_plane.*` schema	Owns the Space lifecycle. Authenticates customers, dispatches provisioning, streams events.
PaaS install	The existing Honeyframe install (`paas/backend`, `paas/frontend`)	Hosts the actual tenant data. Shared by every Cloud Space.

The control plane shares the PaaS JWT secret and Postgres instance with the PaaS install but lives in its own systemd unit and its own schema. It does not import paas/ or saas/; it is a peer service, not a tenant. The only other shared resource is the license signing key, so per-tenant licenses are issued by the same authority.

                  app.honeyframe.io
                        │
          ┌─────────────┴───────────────┐
          │                             │
    Launchpad UI                  <slug>.app.honeyframe.io
    (Vite static)                  (PaaS frontend, branded per-org)
          │                             │
          └─── /api/v1/* ────────┐      └─── /api/* ───┐
                                 │                     │
                          hub-control-plane     PaaS backend
                          :8004                  :8000
                                 │                     │
                                 └────── Postgres ─────┘
                                       (control_plane.* + honeyframe.*)

Spaces

A Space is the unit of tenancy. One Space = one customer org.

CREATE TABLE control_plane.spaces (
  id              UUID PRIMARY KEY,
  slug            TEXT UNIQUE NOT NULL,         -- e.g. 'acme' → acme.app.honeyframe.io
  display_name    TEXT NOT NULL,
  owner_user_id   UUID NOT NULL,
  region          TEXT NOT NULL,                -- 'ap-southeast-5' for now
  tier            TEXT NOT NULL,                -- 'cloud' | 'enterprise' | 'self_hosted'
  status          TEXT NOT NULL,                -- see state machine below
  ecs_instance_id TEXT,                          -- Alibaba ECS i-xxx (enterprise only)
  ecs_public_ip   TEXT,
  installer_version TEXT,
  license_id      UUID,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  ready_at        TIMESTAMPTZ,
  suspended_at    TIMESTAMPTZ,
  deleted_at      TIMESTAMPTZ,
  last_error      TEXT
);

A Space is provisioned by one of two paths depending on tier:

Path	Used by	What happens	Wall-clock
Shared multi-tenant (`tier='cloud'`)	The Cloud tier	SQL inserts on the existing PaaS install: org → users → subscription → projects → nginx vhost → reload. No new VM.	~30s
Per-VM via SSH (`tier='enterprise'`)	The Enterprise tier (BYOC), and our internal pre-provisioning fleet	`asyncssh.connect` → SFTP-upload `install.conf` → exec `setup-customer.sh --json-events` → stream events live.	~30 min – 4 hr

Both paths emit one row per state transition into control_plane.provisioning_events so the Space Detail page can show a live event log (GET /api/v1/spaces/{id}/events?since=<id>).
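
The since cursor lends itself to a simple polling loop. The following is a minimal sketch, not the Launchpad's actual client: it assumes each event is a JSON object with a monotonically increasing integer id (the response shape is not specified on this page), and the HTTP call is injected as a hypothetical fetch callable so the cursor logic can be read on its own.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode


def events_url(base: str, space_id: str, since: int, limit: int = 100) -> str:
    """Build the events URL; `since` is the id of the last event already seen."""
    query = urlencode({"since": since, "limit": limit})
    return f"{base}/api/v1/spaces/{space_id}/events?{query}"


def drain_events(
    fetch: Callable[[str], list],  # hypothetical: GETs the URL, returns a list of event dicts
    base: str,
    space_id: str,
    since: int = 0,
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every event newer than `since`, page by page, until a short page arrives."""
    while True:
        page = fetch(events_url(base, space_id, since, limit))
        for event in page:
            since = max(since, event["id"])  # assumed integer `id` field
            yield event
        if len(page) < limit:
            return
```

A UI that polls every couple of seconds would call drain_events once per tick and carry the highest id it has seen into the next call.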

State machine

requested
   ↓ provisioner picks the row up (cloud or enterprise dispatch)
allocating
   ↓ ECS running + SSH reachable (enterprise only — cloud skips this)
provisioning
   ↓ installer succeeds, /api/version returns the expected version
seeding
   ↓ DNS / nginx / license activated
ready

Off the happy path, a Space ends in one of three states, grouped by how it gets there:

failed — last_error carries the truncated traceback. Operator clicks Retry to re-run the provisioner.
suspended / deleted — flipped by admin actions (Phase 4).
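
The state machine above can be written down as a small transition table. This is a sketch under stated assumptions rather than the control plane's code: it assumes any in-flight state may drop to failed, that Retry puts a failed Space back into requested (the page does not say which state a retry re-enters), and it leaves out suspended / deleted because those transitions are Phase 4.

```python
# Happy-path order, copied from the state diagram.
HAPPY_PATH = ("requested", "allocating", "provisioning", "seeding", "ready")


def allowed_next(status: str) -> set:
    """Return the statuses a Space may move to from `status`."""
    if status in HAPPY_PATH[:-1]:
        following = HAPPY_PATH[HAPPY_PATH.index(status) + 1]
        return {following, "failed"}  # assumption: any in-flight step can fail
    if status == "failed":
        return {"requested"}  # assumption: Retry re-queues the Space
    return set()  # ready, suspended, deleted: no transitions modelled here


def can_transition(current: str, new: str) -> bool:
    """True if moving from `current` to `new` is allowed by the table above."""
    return new in allowed_next(current)
```

Guarding every status update with a check like can_transition makes out-of-order writes (for example requested straight to ready) fail loudly instead of corrupting the event log.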

A stuck-job reaper runs in the control plane's lifespan: any space stuck more than 1 hour in a non-terminal state flips to failed with last_error="reaper: stuck > 1h" so the queue doesn't deadlock on a crashed provisioner.
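
The reaper's rule can be sketched as a pure function. Which timestamp the real reaper measures from is not stated here, so this sketch takes a hypothetical last_progress_at argument (for example, the time of the Space's most recent provisioning event) and treats the four in-flight statuses as the non-terminal set.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(hours=1)
NON_TERMINAL = {"requested", "allocating", "provisioning", "seeding"}


def reap_if_stuck(space: dict, last_progress_at: datetime, now: datetime) -> dict:
    """Return a failed copy of `space` if it has made no progress for over an hour.

    `last_progress_at` is a hypothetical input; the caller decides what counts
    as progress. Spaces that are not stuck are returned unchanged.
    """
    stuck = space["status"] in NON_TERMINAL and now - last_progress_at > STUCK_AFTER
    if not stuck:
        return space
    return {**space, "status": "failed", "last_error": "reaper: stuck > 1h"}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    space = {"slug": "acme", "status": "provisioning", "last_error": None}
    print(reap_if_stuck(space, now - timedelta(hours=2), now))
```

Measuring from the last event rather than from created_at would keep a long but healthy enterprise install, which can legitimately run for hours, from being reaped while it is still emitting output.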

Provisioning a shared-tenant Space

Eleven steps, each emitting one event row:

INSERT honeyframe.organizations (slug, display_name, billing_email)
INSERT honeyframe.users for the owner (must_reset_password=true)
INSERT honeyframe.user_orgs linking the owner to the org as admin
(Owner email confirmed — the bootstrap password is delivered by the Launchpad UI, not emailed in cleartext)
INSERT honeyframe.subscriptions (product='hub_platform', deployment_tier='shared') with license_tier='starter' (the constraint allows starter|professional|enterprise only — 'cloud' would fail it)
INSERT honeyframe.projects (org_id, name='default')
INSERT honeyframe.project_members for the owner as admin of the default project
Render /etc/nginx/conf.d/space-<slug>.conf from the slug template
Validate the rendered config (nginx -t)
systemctl reload nginx to pick up the new vhost
Issue HTTP-01 cert for <slug>.app.honeyframe.io (per-slug today; DNS-01 wildcard deferred to Phase 5)

On success the Space lands in ready and the customer can log in at <slug>.app.honeyframe.io.
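
The vhost rendering step is the one place where customer input (the slug) ends up in a root-owned config file, so it is worth validating before substitution. The sketch below is illustrative only: the template body, the {{slug}} placeholder, and the slug pattern (a single lowercase DNS label) are assumptions, not the provisioner's real template.

```python
import re
from pathlib import PurePosixPath

# Assumption: a slug is one lowercase DNS label (letters, digits, inner hyphens).
SLUG_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")

# Hypothetical template; the real slug template is not shown on this page.
VHOST_TEMPLATE = """\
server {
    listen 80;
    server_name {{slug}}.app.honeyframe.io;

    location /api/ {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
    }
}
"""


def _check_slug(slug: str) -> str:
    """Reject anything that is not a plain DNS label before it reaches nginx."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid slug: {slug!r}")
    return slug


def vhost_path(slug: str) -> PurePosixPath:
    """Path of the per-Space vhost file under /etc/nginx/conf.d."""
    return PurePosixPath("/etc/nginx/conf.d") / f"space-{_check_slug(slug)}.conf"


def render_vhost(slug: str) -> str:
    """Substitute a validated slug into the template."""
    return VHOST_TEMPLATE.replace("{{slug}}", _check_slug(slug))
```

After writing the rendered text to vhost_path(slug), the provisioner would run nginx -t and only reload when validation passes, as in the steps listed above.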

The PaaS frontend reads window.location.hostname on load. If it matches <slug>.app.honeyframe.io, it calls /api/branding?slug=<slug> so the public branding endpoint overlays that org's logo + primary color from honeyframe.organizations onto the global defaults (the branded login surface).
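
The hostname match is easy to get subtly wrong (the Launchpad host itself, nested subdomains, trailing dots). The real check runs in the PaaS frontend; the sketch below restates the rule in Python for consistency with the other examples on this page, and the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import quote

TENANT_SUFFIX = ".app.honeyframe.io"


def slug_from_hostname(hostname: str) -> Optional[str]:
    """Return the tenant slug for <slug>.app.honeyframe.io, else None."""
    host = hostname.lower().rstrip(".")
    if not host.endswith(TENANT_SUFFIX):
        return None  # localhost, IPs, custom domains, and app.honeyframe.io itself
    slug = host[: -len(TENANT_SUFFIX)]
    if not slug or "." in slug:
        return None  # reject empty labels and nested subdomains
    return slug


def branding_url(hostname: str) -> Optional[str]:
    """Branding endpoint to call for this host, or None to keep the global defaults."""
    slug = slug_from_hostname(hostname)
    return None if slug is None else f"/api/branding?slug={quote(slug)}"
```

Returning None for every non-matching host keeps the fall-through to global branding explicit instead of relying on a failed lookup.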

Provisioning a per-VM Space (Enterprise / SSH)

When tier='enterprise' the dispatcher picks a different code path: provisioner.py (asyncssh-based).

Allocate ECS — Alibaba SDK RunInstances against the configured warm-pool image, or pick from a pre-allocated warm pool if any.
Wait for SSH — poll until asyncssh.connect succeeds (typically under 60s for warm-pool, ~90s for fresh).
Render install.conf — render_install_conf(space) maps spaces.install_conf JSONB onto the YAML grammar that setup-customer.sh reads (customer / tiers / database / domains / admin / features / openai).
SFTP upload — to /tmp/honeyframe-install-{space_id}.conf, chmod 600 (the file contains the DB password).
Exec installer — sudo bash {installer_path} --config {tmp_path} --json-events.
Stream events — _stream_events reads stdout line-by-line, json.loads each, INSERTs one row into provisioning_events. Lines that fail to parse get preserved as raw_log/info so forensics never loses output.
Drain stderr — parallel task; non-JSON output (sudo banner, pip output, ssh chatter) lands in the feed as warn so the operator sees it.
Wall-clock cap — 4 hours. (--compile-backend can take 3 hr on slow ECS — Nuitka link.)
On success — write /etc/nginx/conf.d/space-<slug>.conf on the new VM, run nginx -t + systemctl reload nginx, mark space ready.
On failure — flip to failed, store the truncated traceback in last_error, leave the VM running for forensics (operator decides whether to retry or destroy).
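
The parse-or-preserve rule used for the installer's output can be sketched as two small functions. This is not the body of _stream_events; the kind, level, and message keys are hypothetical stand-ins for whatever columns provisioning_events actually has.

```python
import json


def parse_stdout_line(line: str) -> dict:
    """Turn one --json-events stdout line into an event dict.

    Lines that are not a JSON object are kept verbatim as raw_log/info
    so no installer output is dropped.
    """
    text = line.rstrip("\r\n")
    try:
        event = json.loads(text)
    except json.JSONDecodeError:
        event = None
    if isinstance(event, dict):
        return event
    return {"kind": "raw_log", "level": "info", "message": text}


def parse_stderr_line(line: str) -> dict:
    """stderr is never parsed as JSON; it lands in the feed as a warning."""
    return {"kind": "raw_log", "level": "warn", "message": line.rstrip("\r\n")}
```

Checking isinstance(event, dict) matters because a bare number or quoted string is valid JSON but not a usable event row.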

Host-pubkey pinning is a documented TODO (Phase 2.5 follow-up). Use a dedicated SSH key per provisioner identity and rotate it via the secrets surface.

Admin surface

/api/v1/admin/spaces is the fleet view (admin-only — control-plane has its own admin role flag). Filter by status to find stuck jobs:

curl -H "Authorization: Bearer $CP_ADMIN_TOKEN" \
  'https://app.honeyframe.io/api/v1/admin/spaces?status=failed'
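
For scripted checks, the same request can be built in Python with the standard library. A minimal sketch: the helper name is hypothetical, and the response body is left unparsed because its shape is not documented on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def admin_spaces_request(
    token: str,
    status: str,
    base: str = "https://app.honeyframe.io",
) -> Request:
    """Build (but do not send) the fleet-view request filtered by status."""
    url = f"{base}/api/v1/admin/spaces?{urlencode({'status': status})}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the result to urllib.request.urlopen sends it; reading the token from the CP_ADMIN_TOKEN environment variable mirrors the curl example.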

POST /api/v1/admin/spaces/{id}/exec runs a one-off command on a Space's VM (enterprise tier only). Every call is audited into control_plane.audit_log with the full payload and the operator's user id.

For the cloud tier, the equivalent is direct SQL on the shared PaaS Postgres. A Space's data is identifiable by org_id — there is no separate VM to SSH into.

Tenant URLs

Pattern	What it resolves to
`app.honeyframe.io`	The Launchpad — signup, Spaces list, Create Space wizard
`<slug>.app.honeyframe.io`	A specific Cloud Space's PaaS frontend with the org's branding
`controlplane.honeyframe.io`	Internal admin console for the fleet (admin-only)
`*.honeyframe.io` (any other slug)	Reserved for future per-tenant Enterprise domains

The *.honeyframe.io wildcard A record was set up by the team on 2026-05-01. Cert posture today: per-slug HTTP-01 via certbot for each new tenant. The DNS-01 wildcard cutover is Phase 5; once it lands, a single cert covers every tenant slug.

API reference

The control plane lives on its own port (8004), so its surface is not under /api/dashboards/* or /api/connectors/*; those are the PaaS surfaces. Apart from the health and version probes, the control-plane endpoints are all under /api/v1/:

Endpoint	Description
`POST /api/v1/auth/signup`	Create a control-plane user. Independent of the PaaS user surface.
`POST /api/v1/auth/login`	Returns a JWT scoped to the control-plane (signed with the same secret as the PaaS JWT, but the `aud` claim differs).
`GET /api/v1/auth/me`	Current control-plane user.
`POST /api/v1/spaces`	Create a Space (kicks off provisioner).
`GET /api/v1/spaces`	List spaces owned by the caller.
`GET /api/v1/spaces/{id}`	One space (owner-scoped).
`POST /api/v1/spaces/{id}/retry`	Re-run the provisioner on a `failed` space.
`POST /api/v1/spaces/{id}/suspend`	Flip status (Phase 4 stub).
`POST /api/v1/spaces/{id}/resume`	Flip status (Phase 4 stub).
`DELETE /api/v1/spaces/{id}`	Flip status (Phase 4 stub).
`GET /api/v1/spaces/{id}/events?since=<id>&limit=N`	Paginated provisioning event feed.
`GET /api/v1/admin/spaces`	Fleet view (admin-only).
`POST /api/v1/admin/spaces/{id}/exec`	One-off audited remote `exec` (enterprise tier).
`GET /api/health`	Liveness probe.
`GET /api/version`	Build + git SHA.
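
Because the control-plane JWT is signed with the same secret as the PaaS JWT, the aud claim is what keeps a token for one service from being accepted by the other. The sketch below shows that check with the standard library only. It assumes HS256 and the audience strings "control-plane" and "paas", none of which are confirmed on this page, and it omits expiry and header checks that a maintained JWT library would perform.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()


def sign(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (assumed algorithm) carrying `claims`."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = _signature(f"{header}.{payload}", secret)
    return f"{header}.{payload}.{_b64url(sig)}"


def verify(token: str, secret: bytes, audience: str) -> dict:
    """Check the signature first, then require the expected audience."""
    header, payload, sig = token.split(".")
    expected = _signature(f"{header}.{payload}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims.get("aud") != audience:  # simplified: treats aud as a single string
        raise ValueError("wrong audience")
    return claims
```

With a shared secret, a PaaS-issued token passes the signature check on the control plane, so rejecting it depends entirely on the audience comparison.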

Observability

Two log streams worth watching:

Control-plane logs — systemd journal for hub-control-plane.service on the control-plane host. JSON-formatted; provisioning state transitions are logged at INFO, retries at WARN, traceback dumps at ERROR.
control_plane.provisioning_events — one row per state transition per space. Operator-facing; surfaced in the Launchpad UI's Space Detail page (2s polling). Joinable to control_plane.spaces.id.

For SSH-driven provisioning, the events table also captures every parsed line of the installer's --json-events output, plus stderr drain. A failed install leaves a complete forensic trail without needing to log into the target VM.

Gotchas

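The Cloud tier has no wildcard cert yet: certs are issued per slug via HTTP-01. Let's Encrypt rate-limits new certificates per registered domain (50 per week at the time of writing), so a fleet growing past ~50 tenants risks hitting the limit. Move to DNS-01 wildcard before opening public signup.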
Branded login depends on a hostname match. Non-*.app.honeyframe.io hosts (e.g., localhost, IP, platform.example.com, acme.example.com) fall through to global branding — this is by design.
license_tier='cloud' will fail the constraint. The constraint allows starter|professional|enterprise. The cloud-tier provisioner sets license_tier='starter' for new Spaces; upgrading the tier goes through the Phase 4 billing flow.
The control plane shares Postgres with the PaaS. Long-running PaaS migrations or vacuums can starve the provisioner. Schedule heavy maintenance during off-peak.
Admins can open any Space in the admin UI (was 404 before v0.0.39). Read-only by default; mutations go through audited /admin/spaces/{id}/exec only.

Roadmap

The Cloud tier is invite-driven in v0.0.39. Public self-serve signup needs Phase 4:

Phase	Status	Scope
1 — Strategy lock	Done (v0.0.39)	Three-tier model documented; Alibaba reselling decoupled from the Honeyframe brand.
2 — Control-plane API + SSH provisioner	Done (v0.0.39)	FastAPI service, asyncssh provisioner, admin fleet, e2e Playwright suite.
2.5 — Shared multi-tenant provisioner	Done (v0.0.39)	Cloud-tier dispatch path; `<slug>.app.honeyframe.io` branded login; nginx vhost-per-slug.
3 — Launchpad UI scaffold	Done (v0.0.39)	React + Vite + Tailwind v4, customer-facing signup + Spaces list + Create wizard + Space Detail.
4 — Billing, suspension, automated cert renewal	v0.0.40+	Stripe integration, true suspend/resume/delete, DNS-01 wildcard, public signup gate removed.
5 — DNS-01 wildcard cert	v0.0.40+	Single `*.app.honeyframe.io` cert via DNS-01 per renewal cycle; per-slug HTTP-01 retired.
