Honeyframe Cloud
Honeyframe Cloud is the managed tier — we run the infrastructure for you. Customers sign in at app.honeyframe.io, create a Space in seconds, and land on <slug>.app.honeyframe.io with their data, branding, and users in place. No VMs to provision, no certificates to renew, no setup-customer.sh to run.
This page documents the operator-facing surface of the Cloud tier: what's running where, how to read provisioning state, and what's automated versus invite-driven today. For the customer-facing flow ("how do I sign up?") see Deployment Tiers.
The Cloud tier is invite-driven in v0.0.39 — the public Launchpad signup is gated until Phase 4 (billing + automated wildcard cert renewal). The control plane and shared-tenant provisioner are shippable today.
Architecture
Three pieces, all inside our managed account:
| Component | Where | Purpose |
|---|---|---|
| Launchpad UI | controlplane/frontend/, served from app.honeyframe.io (Vite build) | Customer-facing: signup, list Spaces, create new Space, view live provisioning events. |
| Control plane API | hub-control-plane.service (port 8004), control_plane.* schema | Owns the Space lifecycle. Authenticates customers, dispatches provisioning, streams events. |
| PaaS install | The existing Honeyframe install (paas/backend, paas/frontend) | Hosts the actual tenant data. Shared by every Cloud Space. |
The control plane shares the PaaS JWT secret and Postgres instance with the PaaS install but lives in its own systemd unit and its own schema. It does not import paas/ or saas/: it's a peer service, not a tenant. Beyond the JWT secret and the Postgres instance, the only other shared resource is the license signing key (so per-tenant licenses are issued by the same authority).
                       app.honeyframe.io
                               │
                ┌──────────────┴──────────────┐
                │                             │
          Launchpad UI            <slug>.app.honeyframe.io
          (Vite static)       (PaaS frontend, branded per-org)
                │                             │
                └─── /api/v1/* ────┐          └─── /api/* ───┐
                                   │                         │
                           hub-control-plane           PaaS backend
                                 :8004                     :8000
                                   │                         │
                                   └──────── Postgres ───────┘
                                (control_plane.* + honeyframe.*)
Spaces
A Space is the unit of tenancy. One Space = one customer org.
    CREATE TABLE control_plane.spaces (
      id UUID PRIMARY KEY,
      slug TEXT UNIQUE NOT NULL, -- e.g. 'acme' → acme.app.honeyframe.io
      display_name TEXT NOT NULL,
      owner_user_id UUID NOT NULL,
      region TEXT NOT NULL, -- 'ap-southeast-5' for now
      tier TEXT NOT NULL, -- 'cloud' | 'enterprise' | 'self_hosted'
      status TEXT NOT NULL, -- see state machine below
      ecs_instance_id TEXT, -- Alibaba ECS i-xxx (enterprise only)
      ecs_public_ip TEXT,
      installer_version TEXT,
      license_id UUID,
      created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
      ready_at TIMESTAMPTZ,
      suspended_at TIMESTAMPTZ,
      deleted_at TIMESTAMPTZ,
      last_error TEXT
    );
A Space is provisioned by one of two paths depending on tier:
| Path | Used by | What happens | Wall-clock |
|---|---|---|---|
| Shared multi-tenant (tier='cloud') | The Cloud tier | SQL inserts on the existing PaaS install: org → users → subscription → projects → nginx vhost → reload. No new VM. | ~30s |
| Per-VM via SSH (tier='enterprise') | The Enterprise tier (BYOC), and our internal pre-provisioning fleet | asyncssh.connect → SFTP-upload install.conf → exec setup-customer.sh --json-events → stream events live. | ~30 min – 4 hr |
Both paths emit one row per state transition into control_plane.provisioning_events so the Space Detail page can show a live event log (GET /api/v1/spaces/{id}/events?since=<id>).
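As a rough illustration of how a client could follow that feed, the sketch below advances a `since` cursor until a page comes back empty. It assumes the endpoint returns a JSON list of objects that each carry an integer `id`; `fetch_page` is a hypothetical callable standing in for the HTTP request, and `drain_events` / `events_url` are illustrative names, not part of the control plane.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode


def events_url(base: str, space_id: str, since: int, limit: int = 100) -> str:
    """Build the event-feed URL for one poll (path and params as documented)."""
    query = urlencode({"since": since, "limit": limit})
    return f"{base}/api/v1/spaces/{space_id}/events?{query}"


def drain_events(
    fetch_page: Callable[[str], list[dict]],
    base: str,
    space_id: str,
    since: int = 0,
) -> Iterator[dict]:
    """Yield events in order, advancing the cursor until a page is empty."""
    cursor = since
    while True:
        page = fetch_page(events_url(base, space_id, cursor))
        if not page:
            return
        for event in page:
            yield event
            # Assumption: ids only grow, so the largest id seen is the next cursor.
            cursor = max(cursor, event["id"])
```

A UI that polls every couple of seconds would call this repeatedly, passing the last id it rendered as `since`.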
State machine
    requested
      ↓ provisioner picks the row up (cloud or enterprise dispatch)
    allocating
      ↓ ECS running + SSH reachable (enterprise only; cloud skips this)
    provisioning
      ↓ installer succeeds, /api/version returns the expected version
    seeding
      ↓ DNS / nginx / license activated
    ready
Two terminal states:
- failed: last_error carries the truncated traceback. Operator clicks Retry to re-run the provisioner.
- suspended / deleted: flipped by admin actions (Phase 4).
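One way to picture the forward path is as a lookup table plus a guard. This is a minimal sketch, not the control plane's code: `NEXT_STATE` and `can_transition` are hypothetical names, the cloud-tier shortcut past the ECS check is deliberately not modeled, and the admin-driven suspended / deleted flips are left out.

```python
# Forward transitions copied from the state machine above.
NEXT_STATE = {
    "requested": "allocating",
    "allocating": "provisioning",
    "provisioning": "seeding",
    "seeding": "ready",
}
IN_FLIGHT = set(NEXT_STATE)


def can_transition(current: str, target: str) -> bool:
    """Allow the next forward step, or a drop to 'failed' from any in-flight state."""
    if current not in IN_FLIGHT:
        return False  # ready / failed / suspended / deleted are not handled here
    if target == "failed":
        return True
    return NEXT_STATE[current] == target
```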
A stuck-job reaper runs in the control plane's lifespan: any space stuck more than 1 hour in a non-terminal state flips to failed with last_error="reaper: stuck > 1h" so the queue doesn't deadlock on a crashed provisioner.
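The reaper rule can be sketched as a pure function over rows. This is an illustration under assumptions: the `updated_at` key is hypothetical (the schema above does not show which timestamp the real reaper compares against), and `reap_stuck` is an illustrative name.

```python
from datetime import datetime, timedelta

IN_FLIGHT = {"requested", "allocating", "provisioning", "seeding"}
STUCK_AFTER = timedelta(hours=1)


def reap_stuck(spaces: list[dict], now: datetime) -> list[dict]:
    """Flip in-flight spaces older than the threshold to failed; return those changed."""
    reaped = []
    for space in spaces:
        # Assumption: 'updated_at' records the last state transition.
        if space["status"] in IN_FLIGHT and now - space["updated_at"] > STUCK_AFTER:
            space["status"] = "failed"
            space["last_error"] = "reaper: stuck > 1h"
            reaped.append(space)
    return reaped
```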
Provisioning a shared-tenant Space
Eleven steps, each emitting one event row:
- INSERT honeyframe.organizations (slug, display_name, billing_email)
- INSERT honeyframe.users for the owner (must_reset_password=true)
- INSERT honeyframe.user_orgs linking the owner to the org as admin
- (Owner email confirmed: the bootstrap password is delivered by the Launchpad UI, not emailed in cleartext)
- INSERT honeyframe.subscriptions (product='hub_platform', deployment_tier='shared') with license_tier='starter' (the constraint allows starter|professional|enterprise only; 'cloud' would fail it)
- INSERT honeyframe.projects (org_id, name='default')
- INSERT honeyframe.project_members for the owner as admin of the default project
- Render /etc/nginx/conf.d/space-<slug>.conf from the slug template
- Validate the rendered config (nginx -t)
- systemctl reload nginx to pick up the new vhost
- Issue HTTP-01 cert for <slug>.app.honeyframe.io (per-slug today; DNS-01 wildcard deferred to Phase 5)
- Seed the starter pack: generic e-commerce sample data (customers / products / orders) plus a 7-card Welcome dashboard (3 KPIs, revenue trend line, top categories, top customers, recent orders). New tenants land on "wow this works" instead of an empty Datasets page. Self-contained, idempotent. Failure is non-fatal: a blank Space is still a working Space. Opt out with CLOUD_STARTER_SEED=false at the operator level.
On success the Space lands in ready and the customer can log in at <slug>.app.honeyframe.io.
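The shape of that flow (ordered steps, one event per step, a non-fatal seed at the end) might look roughly like the runner below. It is a sketch under assumptions: `run_steps`, the `(name, action, fatal)` tuple, and the "error" level are hypothetical; only the info and warn levels and the ready / failed statuses come from this page.

```python
from typing import Callable

# (step name, action to run, whether a failure aborts provisioning)
Step = tuple[str, Callable[[], None], bool]


def run_steps(steps: list[Step], emit: Callable[[str, str, str], None]) -> str:
    """Run steps in order, emitting one event per step; return the final status."""
    for name, action, fatal in steps:
        try:
            action()
        except Exception as exc:
            if fatal:
                emit(name, "error", str(exc))
                return "failed"
            # Non-fatal step (e.g. the starter-pack seed): record it and carry on.
            emit(name, "warn", str(exc))
            continue
        emit(name, "info", "ok")
    return "ready"
```

Marking only the seed step as non-fatal keeps the documented behavior: a Space whose sample data failed to load still ends up ready.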
Cloud-tier license recognition
Cloud signups bill via honeyframe.subscriptions, not the file-based org_licenses table that Enterprise / Self-Hosted tarball customers use. As of v0.0.44, when org_licenses is empty for an org AND there's an active subscriptions row with deployment_tier='shared' / product='hub_platform', the license-gate middleware synthesizes a permissive license payload from the subscription. Subscription state still controls revocation: past_due excludes the row from the fallback so the user hits /license — exactly the right escalation path.
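The fallback rule reads roughly as follows. This is a sketch, not the middleware itself: `effective_license` is a hypothetical name, the `status` key and the shape of the synthesized payload are assumptions, and taking the first file-based license when several exist is an arbitrary choice made for the example.

```python
from typing import Optional


def effective_license(org_licenses: list[dict], subscriptions: list[dict]) -> Optional[dict]:
    """Prefer a file-based license; otherwise synthesize one from an active shared subscription."""
    if org_licenses:
        return org_licenses[0]
    for sub in subscriptions:
        if (
            sub.get("status") == "active"  # past_due rows are excluded by this check
            and sub.get("deployment_tier") == "shared"
            and sub.get("product") == "hub_platform"
        ):
            # Hypothetical payload shape, for illustration only.
            return {"source": "subscription", "tier": sub.get("license_tier")}
    return None  # caller sends the user to /license
```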
The PaaS frontend reads window.location.hostname on load. If it matches <slug>.app.honeyframe.io, it calls /api/branding?slug=<slug> so the public branding endpoint overlays that org's logo + primary color from honeyframe.organizations onto the global defaults (the branded login surface).
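The hostname rule itself is small. The real check runs in the browser; the Python sketch below only illustrates the matching logic, and the allowed slug characters (lowercase letters, digits, hyphens) are an assumption rather than the product's actual slug grammar.

```python
import re
from typing import Optional

# Exactly one label in front of app.honeyframe.io counts as a tenant slug.
_SLUG_HOST = re.compile(r"^(?P<slug>[a-z0-9][a-z0-9-]*)\.app\.honeyframe\.io$")


def slug_from_hostname(hostname: str) -> Optional[str]:
    """Return the tenant slug, or None when the host should get global branding."""
    match = _SLUG_HOST.match(hostname.lower())
    return match.group("slug") if match else None
```

Hosts such as localhost or acme.example.com return None here, which corresponds to falling through to the global defaults.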
Provisioning a per-VM Space (Enterprise / SSH)
When tier='enterprise' the dispatcher picks a different code path: provisioner.py (asyncssh-based).
- Allocate ECS: Alibaba SDK RunInstances against the configured warm-pool image, or pick from a pre-allocated warm pool if any.
- Wait for SSH: poll until asyncssh.connect succeeds (typically under 60s for warm-pool, ~90s for fresh).
- Render install.conf: render_install_conf(space) maps spaces.install_conf JSONB onto the YAML grammar that setup-customer.sh reads (customer / tiers / database / domains / admin / features / openai).
- SFTP upload: to /tmp/honeyframe-install-{space_id}.conf, chmod 600 (the file contains the DB password).
- Exec installer: sudo bash {installer_path} --config {tmp_path} --json-events.
- Stream events: _stream_events reads stdout line-by-line, json.loads each, INSERTs one row into provisioning_events. Lines that fail to parse get preserved as raw_log/info so forensics never loses output.
- Drain stderr: parallel task; non-JSON output (sudo banner, pip output, ssh chatter) lands in the feed as warn so the operator sees it.
- Wall-clock cap: 4 hours. (--compile-backend can take 3 hr on slow ECS due to the Nuitka link.)
- On success: write /etc/nginx/conf.d/space-<slug>.conf on the new VM, run nginx -t + systemctl reload nginx, mark space ready.
- On failure: flip to failed, store the truncated traceback in last_error, leave the VM running for forensics (operator decides whether to retry or destroy).
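The parse-or-preserve behavior of the event stream can be sketched per line. `parse_installer_line` and the `kind` / `level` / `message` keys are hypothetical; the page only states that unparseable lines are kept as raw_log at info level.

```python
import json


def parse_installer_line(line: str) -> dict:
    """Turn one stdout line into an event row; unparseable lines are kept verbatim."""
    text = line.rstrip("\n")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        payload = None
    if isinstance(payload, dict):
        return payload
    # Not a JSON object: keep the original text so no installer output is lost.
    return {"kind": "raw_log", "level": "info", "message": text}
```

Treating valid JSON that is not an object (a bare number or string) as raw output is a choice made for this sketch, so every row handed to the INSERT has the same dict shape.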
Host-pubkey pinning is a documented TODO (Phase 2.5 follow-up). Use a dedicated SSH key per provisioner identity and rotate it via the secrets surface.
Admin surface
/api/v1/admin/spaces is the fleet view (admin-only — control-plane has its own admin role flag). Filter by status to find stuck jobs:
curl -H "Authorization: Bearer $CP_ADMIN_TOKEN" \
'https://app.honeyframe.io/api/v1/admin/spaces?status=failed'
POST /api/v1/admin/spaces/{id}/exec runs a one-off command on a Space's VM (enterprise tier only). Every call is audited into control_plane.audit_log with the full payload and the operator's user id.
For the cloud tier, the equivalent is direct SQL on the shared PaaS Postgres. A Space's data is identifiable by org_id — there is no separate VM to SSH into.
Tenant URLs
| Pattern | What it resolves to |
|---|---|
| app.honeyframe.io | The Launchpad: signup, Spaces list, Create Space wizard |
| <slug>.app.honeyframe.io | A specific Cloud Space's PaaS frontend with the org's branding |
| controlplane.honeyframe.io | Internal admin console for the fleet (admin-only) |
| *.honeyframe.io (any other slug) | Reserved for future per-tenant Enterprise domains |
The *.honeyframe.io wildcard A record was set up on 2026-05-01. Cert posture today: per-slug HTTP-01 via certbot for each new tenant. The DNS-01 wildcard cutover is Phase 5; once it lands, a single cert covers every tenant slug.
API reference
The control plane lives on its own port (8004), so its surface is not under /api/dashboards/* or /api/connectors/* — those are the PaaS surfaces. The control-plane endpoints are all under /api/v1/:
| Endpoint | Description |
|---|---|
| POST /api/v1/auth/signup | Create a control-plane user. Independent of the PaaS user surface. |
| POST /api/v1/auth/login | Returns a JWT scoped to the control-plane (signed with the same secret as the PaaS JWT, but the aud claim differs). |
| GET /api/v1/auth/me | Current control-plane user. |
| POST /api/v1/spaces | Create a Space (kicks off provisioner). |
| GET /api/v1/spaces | List spaces owned by the caller. |
| GET /api/v1/spaces/{id} | One space (owner-scoped). |
| POST /api/v1/spaces/{id}/retry | Re-run the provisioner on a failed space. |
| POST /api/v1/spaces/{id}/suspend | Flip status (Phase 4 stub). |
| POST /api/v1/spaces/{id}/resume | Flip status (Phase 4 stub). |
| DELETE /api/v1/spaces/{id}?confirm_slug=<slug> | Real cascade deprovisioning (v0.0.44+). Six phases: honeyframe.* deletes in FK-safe order, DROP per-tenant t<pid>_* schemas, rm -rf {DATA_DIR}/orgs/<org_id>/, nginx vhost remove + reload, certbot revoke + delete. Each phase emits one provisioning_event. Failure marks status='failed_deletion' (not 'deleted') and keeps the row for inspection. The confirm_slug query param is a GitHub-style type-the-name guard against accidental curl/script deletes. |
| GET /api/v1/spaces/{id}/events?since=<id>&limit=N | Paginated provisioning event feed. |
| GET /api/v1/admin/spaces | Fleet view (admin-only). |
| POST /api/v1/admin/spaces/{id}/exec | One-off audited remote exec (enterprise tier). |
| GET /api/health | Liveness probe. |
| GET /api/version | Build + git SHA. |
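The DELETE row combines two ideas: a type-the-name guard and phase-by-phase teardown that stops at the first failure. A minimal sketch, assuming a hypothetical `deprovision` helper where `phases` is a caller-supplied list of (name, action) pairs and `emit` stands in for writing one provisioning event:

```python
from typing import Callable, Optional


def deprovision(
    space: dict,
    confirm_slug: Optional[str],
    phases: list[tuple[str, Callable[[], None]]],
    emit: Callable[[str, str], None],
) -> str:
    """Refuse without a matching slug, then run phases until one fails."""
    if confirm_slug != space["slug"]:
        raise ValueError("confirm_slug does not match the Space's slug")
    for name, action in phases:
        try:
            action()
        except Exception as exc:
            emit(name, f"failed: {exc}")
            # Keep the row for inspection instead of marking it deleted.
            space["status"] = "failed_deletion"
            return space["status"]
        emit(name, "done")
    space["status"] = "deleted"
    return space["status"]
```

Checking the guard before any phase runs means a mistyped or missing `confirm_slug` cannot leave a Space half-deleted.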
Observability
Two log streams worth watching:
- Control-plane logs: systemd journal for hub-control-plane.service on the control-plane host. JSON-formatted; provisioning state transitions are logged at INFO, retries at WARN, traceback dumps at ERROR.
- control_plane.provisioning_events: one row per state transition per space. Operator-facing; surfaced in the Launchpad UI's Space Detail page (2s polling). Joinable to control_plane.spaces.id.
For SSH-driven provisioning, the events table also captures every parsed line of the installer's --json-events output, plus stderr drain. A failed install leaves a complete forensic trail without needing to log into the target VM.
Gotchas
- Cloud tier wildcard cert: HTTP-01 per-slug today. Tenant fleet > ~50 will hit Let's Encrypt rate limits. Move to DNS-01 wildcard before opening public signup.
- Branded login depends on a hostname match. Non-*.app.honeyframe.io hosts (e.g., localhost, IP, platform.example.com, acme.example.com) fall through to global branding; this is by design.
- license_tier='cloud' will fail the constraint. The constraint allows starter|professional|enterprise. The cloud-tier provisioner sets license_tier='starter' for new Spaces; tier-up via Phase 4 billing flow.
- The control plane shares Postgres with the PaaS. Long-running PaaS migrations or vacuums can starve the provisioner. Schedule heavy maintenance during off-peak.
- Admins can open any Space in the admin UI (was 404 before v0.0.39). Read-only by default; mutations go through audited /admin/spaces/{id}/exec only.
Roadmap
The Cloud tier is invite-driven in v0.0.39. Public self-serve signup needs Phase 4:
| Phase | Status | Scope |
|---|---|---|
| 1 — Strategy lock | Done (v0.0.39) | Three-tier model documented; Alibaba reselling decoupled from the Honeyframe brand. |
| 2 — Control-plane API + SSH provisioner | Done (v0.0.39) | FastAPI service, asyncssh provisioner, admin fleet, e2e Playwright suite. |
| 2.5 — Shared multi-tenant provisioner | Done (v0.0.39) | Cloud-tier dispatch path; <slug>.app.honeyframe.io branded login; nginx vhost-per-slug. |
| 3 — Launchpad UI scaffold | Done (v0.0.39) | React + Vite + Tailwind v4, customer-facing signup + Spaces list + Create wizard + Space Detail. |
| 4 — Billing, suspension, automated cert renewal | v0.0.40+ | Stripe integration, true suspend/resume/delete, DNS-01 wildcard, public signup gate removed. |
| 5 — DNS-01 wildcard cert | v0.0.40+ | Single *.app.honeyframe.io cert via DNS-01 per renewal cycle; per-slug HTTP-01 retired. |