# Confluent Cloud Configuration Reference
> **New to Confluent Cloud configuration?** Read the Configuration Guide first for a walkthrough of the decisions you'll make, then come back here for the full field reference.
## ecosystem key

### Full example
```yaml
tenants:
  my-ccloud-org:
    ecosystem: confluent_cloud
    tenant_id: my-ccloud-org   # internal partition key (not the CCloud org ID)
    lookback_days: 200
    cutoff_days: 5
    retention_days: 250
    storage:
      connection_string: "sqlite:///data/ccloud.db"
    plugin_settings:
      ccloud_api:
        key: ${CCLOUD_API_KEY}
        secret: ${CCLOUD_API_SECRET}
      billing_api:
        days_per_query: 15
      metrics:
        type: prometheus
        url: https://api.telemetry.confluent.cloud
        auth_type: basic
        username: ${METRICS_API_KEY}
        password: ${METRICS_API_SECRET}
      flink:
        - region_id: us-east-1
          key: ${FLINK_API_KEY}
          secret: ${FLINK_API_SECRET}
      chargeback_granularity: daily
    emitters:
      - type: csv
        aggregation: daily
        params:
          output_dir: ./output
```
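The `${VAR}` references in the example are placeholders resolved from the environment, so secrets stay out of the config file. A minimal sketch of that kind of substitution, assuming simple `${NAME}` placeholders (the `resolve_env` helper and its regex are illustrative, not the tool's actual loader):

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")

def resolve_env(text: str) -> str:
    """Replace each ${NAME} placeholder with the NAME environment variable."""
    def _sub(match: re.Match) -> str:
        name = match.group(1)
        value = os.environ.get(name)
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value
    return _PLACEHOLDER.sub(_sub, text)

os.environ["CCLOUD_API_KEY"] = "abc123"
print(resolve_env("key: ${CCLOUD_API_KEY}"))  # key: abc123
```

Failing fast on an unset variable (rather than substituting an empty string) surfaces misconfigured credentials at startup instead of at the first API call.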
## TenantConfig fields
| Field | Type | Default | Description |
|---|---|---|---|
| `ecosystem` | string | required | Must be `confluent_cloud` |
| `tenant_id` | string | required | Unique partition key for DB records. Can be any string (e.g. `prod`, `acme-corp`). This is not your Confluent Cloud Organization ID — it is an internal label used to isolate data across tenants in the database. |
| `lookback_days` | int | 200 | Days of billing history to fetch (max 364). Must be > `cutoff_days`. |
| `cutoff_days` | int | 5 | Skip dates within this many days of today (billing lag, max 30) |
| `retention_days` | int | 250 | Delete data older than this (max 730) |
| `allocation_retry_limit` | int | 3 | Max identity resolution retries before fallback (max 10) |
| `gather_failure_threshold` | int | 5 | Consecutive gather failures before tenant suspension |
| `tenant_execution_timeout_seconds` | int | 3600 | Per-tenant run timeout (0 = no timeout) |
| `metrics_prefetch_workers` | int | 4 | Parallel metrics query threads (1–20) |
| `zero_gather_deletion_threshold` | int | -1 | Mark resources deleted after N zero-gather cycles (-1 = disabled) |
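Several of these bounds interact — most importantly, `lookback_days` must exceed `cutoff_days` or there is no window left to fetch. A hedged sketch of the documented checks (the function is illustrative, not the tool's actual validation code):

```python
def validate_tenant_config(lookback_days: int = 200,
                           cutoff_days: int = 5,
                           retention_days: int = 250) -> None:
    """Illustrative checks mirroring the documented bounds."""
    if not 1 <= lookback_days <= 364:
        raise ValueError("lookback_days must be in 1..364")
    if not 0 <= cutoff_days <= 30:
        raise ValueError("cutoff_days must be in 0..30")
    if lookback_days <= cutoff_days:
        raise ValueError("lookback_days must be > cutoff_days")
    if retention_days > 730:
        raise ValueError("retention_days must be <= 730")

validate_tenant_config()  # the documented defaults pass
```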
## plugin_settings fields (CCloud)
| Field | Type | Default | Description |
|---|---|---|---|
| `ccloud_api.key` | string | required | CCloud API key |
| `ccloud_api.secret` | secret | required | CCloud API secret |
| `billing_api.days_per_query` | int | 15 | Days per billing API request (max 30) |
| `metrics.url` | string | optional | Prometheus/Telemetry API URL |
| `metrics.auth_type` | enum | `none` | `basic`, `bearer`, or `none` |
| `metrics.username` | string | optional | For `auth_type: basic` |
| `metrics.password` | secret | optional | For `auth_type: basic` |
| `metrics.bearer_token` | secret | optional | For `auth_type: bearer` |
| `flink` | list | optional | Per-region Flink API credentials |
| `chargeback_granularity` | enum | `daily` | `hourly`, `daily`, or `monthly` |
| `metrics_step_seconds` | int | 3600 | Prometheus query step (lower = finer granularity) |
| `min_refresh_gap_seconds` | int | 1800 | Minimum time between pipeline runs for this tenant |
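`billing_api.days_per_query` caps how much of the lookback window each billing API request covers, so a long lookback is fetched as a series of shorter windows. A sketch of that chunking (the function name and half-open boundary convention are assumptions):

```python
from datetime import date, timedelta

def billing_windows(start: date, end: date, days_per_query: int = 15):
    """Yield (window_start, window_end) pairs covering [start, end)
    in chunks of at most days_per_query days."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=days_per_query), end)
        yield cursor, window_end
        cursor = window_end

# 35 days of history at the default 15 days per query -> 3 requests
windows = list(billing_windows(date(2024, 1, 1), date(2024, 2, 5), 15))
```

A larger `days_per_query` means fewer API calls per run, at the cost of bigger responses per call.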
## Handled product types
Each product type from the CCloud billing API is routed to a handler that knows how to resolve identities and allocate costs for that service. The allocation strategy reflects the nature of the cost — usage-driven costs are split by measured consumption, shared costs are split evenly.
| Handler | Product types | Allocation strategy | Why |
|---|---|---|---|
| `kafka` | `KAFKA_NUM_CKU`, `KAFKA_NUM_CKUS` | Hybrid: 70% usage ratio (bytes), 30% even split | CKUs are the main Kafka compute cost. Part of the cost is driven by traffic volume (usage), part is base infrastructure overhead (shared). The 70/30 default is configurable via `allocator_params`. |
| `kafka` | `KAFKA_NETWORK_READ`, `KAFKA_NETWORK_WRITE` | Usage ratio (bytes in/out per principal) | Network transfer is directly attributable to the principal that produced or consumed the data. Requires Telemetry API metrics. |
| `kafka` | `KAFKA_BASE`, `KAFKA_PARTITION`, `KAFKA_STORAGE` | Even split | Base fees, partition counts, and storage are cluster-level costs with no per-principal usage metric. |
| `schema_registry` | `SCHEMA_REGISTRY`, `GOVERNANCE_BASE`, `NUM_RULES` | Even split | Schema Registry is a shared service — all principals benefit equally from schema validation. |
| `connector` | `CONNECT_CAPACITY`, `CONNECT_NUM_TASKS`, `CONNECT_THROUGHPUT`, `CUSTOM_CONNECT_*` | Even split per connector | Connectors are typically owned by teams. Costs are split among identities active on the connector's resource. |
| `ksqldb` | `KSQL_NUM_CSU`, `KSQL_NUM_CSUS` | Even split | ksqlDB compute units are application-level — split across active identities. |
| `flink` | `FLINK_NUM_CFU`, `FLINK_NUM_CFUS` | Usage ratio by statement owner CFU consumption | Flink CFU costs are directly traceable to the user who created the SQL statement. Uses a 4-tier chain: statement owner → active identities → period identities → resource. |
| `org_wide` | `AUDIT_LOG_READ`, `SUPPORT` | Even split across tenant, then to `UNALLOCATED` | Org-wide costs have no resource or principal — they apply to the whole organization. |
| `default` | `TABLEFLOW_*` | Shared (to resource) | New product types without a dedicated handler fall back to resource-level allocation. |
| `default` | `CLUSTER_LINKING_*` | Usage (to resource) | Cluster linking costs are attributed to the linked resource. |
Unknown product types are allocated to `UNALLOCATED`. Check the `allocation_detail` field on chargeback rows to understand which fallback tier was used.
See How Costs Work for the complete allocation model including the fallback chain and composite CKU allocation.
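The tiered fallback described above — try the most specific identity source first, then progressively broader ones, and land on `UNALLOCATED` only when every tier is empty — can be sketched as walking an ordered chain of candidate identity lists. The tier names and data shapes here are illustrative, not the tool's internal structures:

```python
def resolve_identities(tiers):
    """Return (tier_name, identities) from the first tier that yields
    any identities; fall back to UNALLOCATED if every tier is empty."""
    for name, identities in tiers:
        if identities:
            return name, identities
    return "unallocated", ["UNALLOCATED"]

# Example: the flink handler's 4-tier chain, with no statement owner recorded
tier_chain = [
    ("statement_owner", []),
    ("active_identities", ["svc-a", "svc-b"]),
    ("period_identities", ["svc-a"]),
    ("resource", ["lkc-123"]),
]
print(resolve_identities(tier_chain))  # ('active_identities', ['svc-a', 'svc-b'])
```

In this sketch, the winning tier name is what would end up in `allocation_detail`.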
## Allocator params
Override default allocation ratios for Kafka CKU costs:
```yaml
allocator_params:
  kafka_cku_usage_ratio: 0.70   # fraction allocated by bytes (default 0.70)
  kafka_cku_shared_ratio: 0.30  # fraction allocated evenly (default 0.30)
```

`kafka_cku_usage_ratio` + `kafka_cku_shared_ratio` must sum to 1.0 (tolerance: 0.0001). Startup fails if they don't.
**How to think about the ratio**

The usage portion is allocated proportionally to `bytes_in + bytes_out` per principal. The shared portion is split evenly across all active identities.
- High usage ratio (0.90/0.10): Heavy producers/consumers pay proportionally more. Good when your cluster is right-sized and traffic volume drives cost.
- Balanced (0.70/0.30): Default. Acknowledges that the cluster has a base cost regardless of traffic.
- High shared ratio (0.50/0.50): Spreads cost more evenly. Good when the cluster is over-provisioned and most cost is fixed overhead.
If metrics are unavailable for a billing window, the usage portion falls back to even-split anyway — so at 1.0/0.0, you effectively get even-split when Telemetry API data is missing. See How Costs Work for a worked example.
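As a concrete illustration of the hybrid split, here is a minimal sketch with the default 0.70/0.30 ratio, including the even-split fallback for the usage portion when no byte metrics are available. The function, principal names, and dollar figures are hypothetical, not the tool's implementation:

```python
def allocate_cku_cost(total_cost, bytes_by_principal,
                      usage_ratio=0.70, shared_ratio=0.30):
    """Split total_cost: usage_ratio proportional to bytes per principal,
    shared_ratio evenly across all principals."""
    assert abs(usage_ratio + shared_ratio - 1.0) < 0.0001
    total_bytes = sum(bytes_by_principal.values())
    n = len(bytes_by_principal)
    shares = {}
    for principal, b in bytes_by_principal.items():
        if total_bytes:
            usage_part = total_cost * usage_ratio * (b / total_bytes)
        else:  # no metrics: usage portion falls back to even split
            usage_part = total_cost * usage_ratio / n
        shared_part = total_cost * shared_ratio / n
        shares[principal] = usage_part + shared_part
    return shares

# $100 CKU cost, svc-a moved 90% of the bytes:
# svc-a gets 0.70*100*0.9 + 0.30*100/2 = 63 + 15 = 78
# svc-b gets 0.70*100*0.1 + 0.30*100/2 =  7 + 15 = 22
shares = allocate_cku_cost(100.0, {"svc-a": 900, "svc-b": 100})
```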
## Emitters
```yaml
emitters:
  - type: csv
    aggregation: daily   # rows aggregated to daily before writing
    params:
      output_dir: /data/csv
      filename_template: "{tenant_id}_{date}.csv"
```

`aggregation` options: `null` (as-is), `hourly`, `daily`, `monthly`.
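What `aggregation: daily` implies is a roll-up of finer-grained chargeback rows into one row per principal per day before the emitter writes them out. A sketch under an assumed row shape (the `timestamp`/`principal`/`cost` keys are illustrative):

```python
from collections import defaultdict

def aggregate_daily(rows):
    """Sum hourly cost rows into daily totals keyed by (principal, date)."""
    totals = defaultdict(float)
    for row in rows:
        day = row["timestamp"][:10]   # "2024-01-01T13:00" -> "2024-01-01"
        totals[(row["principal"], day)] += row["cost"]
    return dict(totals)

hourly = [
    {"timestamp": "2024-01-01T00:00", "principal": "svc-a", "cost": 1.5},
    {"timestamp": "2024-01-01T01:00", "principal": "svc-a", "cost": 2.5},
    {"timestamp": "2024-01-02T00:00", "principal": "svc-a", "cost": 1.0},
]
print(aggregate_daily(hourly))
# {('svc-a', '2024-01-01'): 4.0, ('svc-a', '2024-01-02'): 1.0}
```

With `aggregation: null`, rows would be written as-is; `monthly` would key on the month prefix instead of the full date.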