Version: v0.0.74

Connectors

A connector tells Honeyframe how to reach an external system — a database, an object store, a vector index, an LLM API, or a webhook target. Connectors are configured once at the project level and reused across datasets, recipes, and agent tools.

The connector implementations live under paas/backend/connectors/. Each connector is registered in a central registry; the catalog endpoint (GET /api/connectors/catalog) returns every type the running platform supports along with its config schema.

Queryable vs non-queryable

Connectors split into two groups based on the is_queryable flag.

Queryable — can be the source of a dataset or a SQL query:

  • PostgreSQL (postgresql) — also the platform's own metadata store.
  • MySQL (mysql) and MariaDB (mariadb).
  • Microsoft SQL Server (mssql).
  • Oracle (oracle) — schema/table names must be uppercase.
  • Snowflake (snowflake).
  • BigQuery (bigquery).
  • MongoDB (mongodb) — document store, queryable through the dataset surface.
  • Elasticsearch (elasticsearch) — full-text search and aggregations.
  • REST API (api_rest) — generic HTTP connector for sources without a first-class driver.
  • CSV upload (csv) — accepts user-uploaded CSV/Excel and persists rows as a managed dataset. Subject to the nginx client_max_body_size (default 200 MB).
  • Object storage (s3_storage, gcs_storage, oss_storage). Queryable through the Lakehouse layer (DuckDB over Delta/Parquet), not direct SQL.

Non-queryable — cannot back a dataset; used by other surfaces (agents, knowledge bases, automation):

  • LLM providers (openai_llm, anthropic_llm, ollama_llm). Used by Agent Builder, the SQL chat surface, Cobuild, the dashboard chat panel, and Knowledge Base retrieval. Per-org BYO keys live here as of v0.0.56 (category='llm'); the legacy /ai-keys page is retired and redirects to /connectors.
  • Vector stores (chroma, faiss). Persist embeddings for the Knowledge Base.
  • Orchestration (n8n_webhook). Fires events at an n8n workflow.
  • Messaging (twilio_messaging). Used by the send_whatsapp agent tool.

Legacy type aliases (e.g. rdspostgresql) are recognised by the registry for backward compatibility with older installs. New connectors should use the canonical type names listed above.
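As a rough illustration of the queryable split, the sketch below partitions catalog entries on the is_queryable flag. The entry shape (a dict with "type" and "is_queryable" keys) and the sample data are assumptions made for the example; the actual response body of GET /api/connectors/catalog is not documented on this page.

```python
from typing import Any

# Hypothetical catalog entries. The real catalog response shape may differ.
SAMPLE_CATALOG: list[dict[str, Any]] = [
    {"type": "postgresql", "is_queryable": True},
    {"type": "s3_storage", "is_queryable": True},
    {"type": "openai_llm", "is_queryable": False},
    {"type": "chroma", "is_queryable": False},
]


def split_by_queryable(catalog: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Return (queryable type names, non-queryable type names)."""
    queryable = [entry["type"] for entry in catalog if entry.get("is_queryable")]
    other = [entry["type"] for entry in catalog if not entry.get("is_queryable")]
    return queryable, other


queryable, non_queryable = split_by_queryable(SAMPLE_CATALOG)
print("can back a dataset:", queryable)
print("used by other surfaces:", non_queryable)
```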

Configuring a connector

The Connectors page lists every active connector for the current project. On a fresh install it's empty:

[Figure: Empty Connectors page]

Click + Add Database Connector (or + Add Connector for non-DB types) to open the type picker. Honeyframe shows every supported type with a short description:

[Figure: Connector type picker]

After picking a type, fill the type-specific fields (host / port / credentials for SQL, API base URL + key for REST APIs, bucket + region for object storage, etc.) and submit. The new connector appears in the catalog with a Test button:

[Figure: Connectors page populated]

In the platform UI:

  1. Open the Connectors page in the sidebar (requires connector.read; creating or editing a connector requires connector.edit, see Permission model).
  2. Click + New Connector and pick a type from the catalog.
  3. Fill in the connection parameters. Sensitive fields (passwords, API keys, service-account JSON) are encrypted at rest with AES-GCM into a sibling connection_secrets_enc column — they never touch the JSONB config column. GET responses mask secrets with ••••••••; PUT preserves the sentinel so unchanged secrets stay untouched.
  4. Test verifies the connection without saving. The result lands in the test history pane.
  5. Save writes the connector to the data_connectors table. It becomes selectable when creating datasets, recipes, or agent tools.
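The mask-and-preserve behaviour described in step 3 can be sketched as follows. This is a minimal illustration, not the platform's code: merge_secret_update is a hypothetical helper, and the only detail taken from the text is that a secret echoed back as the •••••••• sentinel is left unchanged.

```python
MASK = "••••••••"  # sentinel that GET responses use in place of secret values


def merge_secret_update(stored: dict[str, str], incoming: dict[str, str]) -> dict[str, str]:
    """Apply an update to stored secrets, keeping any field sent back as the mask."""
    merged = dict(stored)
    for field, value in incoming.items():
        if value == MASK:
            continue  # client echoed the mask: keep the stored secret
        merged[field] = value  # a real new value replaces the old one
    return merged


stored = {"password": "old-password", "api_key": "key-1"}
update = {"password": MASK, "api_key": "key-2"}
print(merge_secret_update(stored, update))
```

With this approach a client can round-trip a GET response into an update without ever holding the plaintext secret.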

Programmatically:

curl -X POST https://platform.your-domain.com/api/connectors \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Postgres",
    "type": "postgresql",
    "config": {
      "host": "db.example.com",
      "port": 5432,
      "database": "analytics",
      "username": "honeyframe_ro",
      "password": "<redacted>",
      "sslmode": "require"
    },
    "output_schemas": ["public", "marts"]
  }'

output_schemas is the list of schemas the platform may discover and read from. Leave it empty to default to the connector's "owned" schemas; set it explicitly to restrict what shows up in the dataset browser.
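The same request can be built from Python using only the standard library. The sketch below mirrors the curl call above; build_create_request is a hypothetical helper name, and sending an empty output_schemas list to get the default behaviour is an assumption based on the "leave it empty" wording. The request is constructed but not sent.

```python
import json
import urllib.request


def build_create_request(base_url, token, name, type_, config, output_schemas=None):
    """Build (but do not send) a POST /api/connectors request."""
    payload = {
        "name": name,
        "type": type_,
        "config": config,
        # Assumption: an empty list means "default to the connector's owned schemas".
        "output_schemas": list(output_schemas or []),
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request(
    "https://platform.your-domain.com",
    "example-token",
    "Production Postgres",
    "postgresql",
    {"host": "db.example.com", "port": 5432, "database": "analytics"},
    output_schemas=["public", "marts"],
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```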

Auto-sync schedule

Queryable connectors can be created with an attached schedule via the bootstrap endpoint:

curl -X POST https://platform.your-domain.com/api/connectors/bootstrap \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Postgres",
    "type": "postgresql",
    "config": {...},
    "schedule_cron": "0 2 * * *",
    "schedule_mode": "incremental"
  }'

schedule_mode is incremental (only changed rows since the last successful run, requires a watermark column) or full (truncate + reload). The scheduler runs every minute; the cron expression decides which connectors fire. Skipped runs are logged but not retried — the next cron tick is the next chance.
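To make the "scheduler ticks every minute, cron decides who fires" idea concrete, here is a minimal sketch of a per-minute due check. It is not the platform's scheduler: cron_due and due_connectors are hypothetical names, only five-field expressions with *, single values, ranges, lists, and steps are handled, and all five fields are ANDed together, which is a simplification of standard cron day-of-month/day-of-week handling.

```python
from datetime import datetime


def _field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field. Supports *, a, a-b, a,b and steps (*/n, a-b/n)."""
    for part in expr.split(","):
        rng, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_due(expr: str, now: datetime) -> bool:
    """True if the five-field cron expression matches the given minute."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (now.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    return (
        _field_matches(minute, now.minute, 0, 59)
        and _field_matches(hour, now.hour, 0, 23)
        and _field_matches(dom, now.day, 1, 31)
        and _field_matches(month, now.month, 1, 12)
        and _field_matches(dow, cron_dow, 0, 6)
    )


def due_connectors(connectors: list[dict], now: datetime) -> list[str]:
    """Names of connectors whose schedule_cron fires on this tick."""
    return [c["name"] for c in connectors if cron_due(c["schedule_cron"], now)]


connectors = [
    {"name": "Production Postgres", "schedule_cron": "0 2 * * *"},
    {"name": "Events API", "schedule_cron": "*/15 * * * *"},
]
print(due_connectors(connectors, datetime(2026, 5, 4, 2, 0)))
```

Because a missed tick is simply not evaluated again, a check like this has no catch-up logic, which matches the "logged but not retried" behaviour described above.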

Permission model

As of v0.0.55, connector access is split out of org.admin into two scoped permissions, so designers can manage data sources without full org-admin rights (the Dataiku model, where builders own their flows):

| Endpoint group | Permission |
| --- | --- |
| Catalog, scripts, list, get, models (read paths) | connector.read |
| POST, PATCH, DELETE, test, bootstrap, send-message | connector.edit |

Per-role implicit grants (mirrors the seed migration):

| Role | Connector access |
| --- | --- |
| admin | all (via the role == 'admin' shortcut) |
| designer | connector.{read,edit} |
| analyst | connector.read |
| viewer | none |
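The role-to-permission mapping above can be expressed as a small lookup. This is a sketch for illustration only: ROLE_GRANTS and has_permission are hypothetical names, and the actual check in the backend may be structured differently.

```python
# Implicit grants per role, copied from the table above.
ROLE_GRANTS: dict[str, set[str]] = {
    "designer": {"connector.read", "connector.edit"},
    "analyst": {"connector.read"},
    "viewer": set(),
}


def has_permission(role: str, permission: str) -> bool:
    """Return True if the role implicitly holds the permission."""
    if role == "admin":
        return True  # the role == 'admin' shortcut grants everything
    return permission in ROLE_GRANTS.get(role, set())


for role in ("admin", "designer", "analyst", "viewer"):
    print(role, has_permission(role, "connector.read"), has_permission(role, "connector.edit"))
```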

The Connectors page sits next to Datasets in the Data section of the sidebar — it moved out of Administration in v0.0.55. Connectors are a data construct, not pure admin.

Connectors remain project-level resources with no per-connector ACL beyond the permission strings above. Anyone with connector.read can see the list and reference connectors when building datasets. Data sharing flows through datasets, not connectors (see Users & Groups).

The connector is the credential. The dataset is the unit of access control.

Credential vault

Recognized secret fields (password, api_key, auth_token, service-account JSON, etc.) are encrypted at rest using AES-GCM into a sibling column (connection_secrets_enc), separate from the data_connectors.config JSONB. The encryption key is derived from the platform's JWT secret, so no new operator environment variable is required.

The vault covers postgres / mysql / oracle / snowflake / bigquery / mongo / s3 / oss / gcs / twilio / openai_llm / anthropic_llm / n8n and others. Configs for unknown connector types pass through unchanged, so Honeyframe never silently encrypts the wrong fields.
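The split between the plain config and the secret fields might look like the sketch below. It covers only the field separation and the pass-through rule for unknown types; the AES-GCM encryption of the resulting secrets is omitted. The SECRET_FIELDS map and split_config are hypothetical, and the per-type field names are assumptions drawn from the examples in the text.

```python
from typing import Any

# Hypothetical per-type list of secret field names.
SECRET_FIELDS: dict[str, set[str]] = {
    "postgresql": {"password"},
    "openai_llm": {"api_key"},
    "twilio_messaging": {"auth_token"},
}


def split_config(connector_type: str, config: dict[str, Any]) -> tuple[dict, dict]:
    """Split a config into (public config, secrets to encrypt)."""
    secret_names = SECRET_FIELDS.get(connector_type)
    if secret_names is None:
        # Unknown type: pass the config through untouched, encrypt nothing.
        return dict(config), {}
    public = {k: v for k, v in config.items() if k not in secret_names}
    secrets = {k: v for k, v in config.items() if k in secret_names}
    return public, secrets


public, secrets = split_config(
    "postgresql", {"host": "db.example.com", "port": 5432, "password": "s3cret"}
)
print(public)   # would be stored in the config column
print(secrets)  # would be encrypted into connection_secrets_enc
```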

Operator action on upgrade. Existing connectors carry their secrets in plaintext JSONB until you backfill them. Run once per tenant after deploy:

python3 paas/scripts/migrations/2026-05-04_encrypt_connector_secrets.py --dry-run # preview
python3 paas/scripts/migrations/2026-05-04_encrypt_connector_secrets.py # apply

The script is idempotent — re-runs skip already-migrated rows. Until you backfill, plaintext-config rows still work via passthrough fall-through, so the migration is non-breaking.

API reference

| Endpoint | Description |
| --- | --- |
| GET /api/connectors/catalog | Available connector types with config schemas. Use this to render a creation form. |
| GET /api/connectors | List active connectors in the project. |
| GET /api/connectors/{id} | One connector's full config (sensitive fields are returned masked). |
| POST /api/connectors | Create. |
| POST /api/connectors/bootstrap | Create + attach a schedule in one call. |
| PATCH /api/connectors/{id} | Update name, description, config, or schedule. |
| DELETE /api/connectors/{id} | Remove. Datasets that reference the connector keep their cached schemas but can no longer sync. |
| POST /api/connectors/{id}/test | Verify connectivity without saving. Returns {ok, latency_ms, error?}. |
| GET /api/connectors/{id}/models | Discover tables/collections the connector exposes. Used by the dataset browser. |

Authentication uses the same JWT format as the rest of the API — see Authentication under the Developer property.
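A client consuming the test endpoint could model its {ok, latency_ms, error?} response along these lines. This is a sketch under assumptions: the field types are not specified on this page, and ConnectionTestResult, parse_test_result, and describe are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ConnectionTestResult:
    ok: bool
    latency_ms: Optional[float] = None
    error: Optional[str] = None  # only expected when ok is False


def parse_test_result(body: dict[str, Any]) -> ConnectionTestResult:
    """Build a result object from the decoded JSON body of the test endpoint."""
    return ConnectionTestResult(
        ok=bool(body.get("ok")),
        latency_ms=body.get("latency_ms"),
        error=body.get("error"),
    )


def describe(result: ConnectionTestResult) -> str:
    """One-line summary suitable for a log or CLI output."""
    if result.ok:
        return f"ok in {result.latency_ms} ms"
    return f"failed: {result.error or 'unknown error'}"


print(describe(parse_test_result({"ok": True, "latency_ms": 12})))
print(describe(parse_test_result({"ok": False, "latency_ms": 3000, "error": "timeout"})))
```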

Adding a new connector type

  1. Create paas/backend/connectors/<name>.py and subclass the appropriate base (SQLBase, StorageBase, LLMBase, VectorStoreBase, or BaseConnector for a one-off).
  2. Implement the required methods (test, read_schema, read_rows, etc. — pattern off the existing connectors).
  3. Register the class in paas/backend/connectors/registry.py. Set is_queryable=False if it should not appear in the dataset browser.
  4. Add a config schema entry; the catalog endpoint serves it directly to the frontend, so no separate frontend form code is needed.
  5. Add a unit test covering test() and one read or write path.
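The steps above can be sketched end to end with stand-in classes. Everything here is hypothetical scaffolding so the example runs on its own: this BaseConnector, the REGISTRY dict, and the register decorator are placeholders, not the interfaces in paas/backend/connectors/, whose method signatures are not shown on this page.

```python
from typing import Any, Iterator


class BaseConnector:
    """Stand-in for the platform base class; the real interface may differ."""

    type_name = ""
    is_queryable = True

    def __init__(self, config: dict[str, Any]) -> None:
        self.config = config

    def test(self) -> dict[str, Any]:
        raise NotImplementedError

    def read_schema(self) -> dict[str, list[str]]:
        raise NotImplementedError

    def read_rows(self, table: str) -> Iterator[dict[str, Any]]:
        raise NotImplementedError


REGISTRY: dict[str, type] = {}  # hypothetical registry: type name -> class


def register(cls: type) -> type:
    """Runs at import time, so a new type is visible once the module is loaded."""
    REGISTRY[cls.type_name] = cls
    return cls


@register
class InMemoryConnector(BaseConnector):
    """Toy connector that serves rows from its own config."""

    type_name = "in_memory_demo"
    is_queryable = True

    def test(self) -> dict[str, Any]:
        return {"ok": "tables" in self.config}

    def read_schema(self) -> dict[str, list[str]]:
        return {
            name: sorted(rows[0]) if rows else []
            for name, rows in self.config["tables"].items()
        }

    def read_rows(self, table: str) -> Iterator[dict[str, Any]]:
        yield from self.config["tables"][table]


conn = REGISTRY["in_memory_demo"]({"tables": {"users": [{"id": 1, "name": "Ada"}]}})
print(conn.test(), conn.read_schema(), list(conn.read_rows("users")))
```

A unit test for step 5 would exercise the same three calls and assert on their return values.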

New connector types are not hot-loaded. Registry registration happens at module import time, so a new type becomes available the next time the backend process starts.

Connection pooling

SQL connectors maintain per-process connection pools. The defaults are conservative (5 idle, 20 max) and tuned for the platform's mostly-read workload. Tune on a per-connector basis via the connector config:

{
  "pool_size": 10,
  "max_overflow": 30,
  "pool_recycle": 1800
}
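To reason about what these overrides mean for the database, the sketch below merges them with defaults and computes the connection ceiling per process. It assumes the keys follow SQLAlchemy QueuePool semantics (ceiling = pool_size + max_overflow, pool_recycle in seconds), which this page does not state, and it maps the "5 idle, 20 max" defaults to pool_size=5 and max_overflow=15 on that assumption. effective_pool is a hypothetical helper.

```python
from typing import Any

# Assumed mapping of the documented defaults (5 idle, 20 max) to pool keys.
DEFAULT_POOL: dict[str, int] = {"pool_size": 5, "max_overflow": 15}
POOL_KEYS = ("pool_size", "max_overflow", "pool_recycle")


def effective_pool(config: dict[str, Any]) -> dict[str, int]:
    """Merge per-connector pool overrides with defaults and add the ceiling."""
    overrides = {key: config[key] for key in POOL_KEYS if key in config}
    settings = {**DEFAULT_POOL, **overrides}
    # Under QueuePool semantics, at most pool_size + max_overflow connections are open.
    settings["max_connections"] = settings["pool_size"] + settings["max_overflow"]
    return settings


print(effective_pool({}))
print(effective_pool({"pool_size": 10, "max_overflow": 30, "pool_recycle": 1800}))
```

Since pools are per process, the total load on the database is this ceiling multiplied by the number of backend processes.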

Object-storage and HTTP connectors use the underlying SDK's pooling — typically a per-thread session.