How Costs Work¶
This page explains the complete cost lifecycle: how raw billing data enters the system, how costs are computed, how they're allocated across identities, and how rounding is handled. Every formula shown here maps directly to code in the engine.
Two billing paradigms¶
The engine supports two fundamentally different ways of obtaining cost data.
API-billed (Confluent Cloud)¶
The vendor has already computed the bill. Each billing line item arrives with
quantity, unit_price, and total_cost pre-calculated. The engine's job
is purely to allocate that cost across identities — it never modifies the
total.
Vendor Billing API
→ BillingLineItem(product_type="KAFKA_NUM_CKU", total_cost=$100.00)
→ Engine allocates $100.00 across identities
→ Sum of all ChargebackRows = $100.00 (guaranteed)
Constructed (Self-Managed Kafka, Generic Metrics)¶
There is no billing API. The engine constructs billing lines by:
- Querying Prometheus for usage metrics
- Computing quantities from those metrics
- Multiplying by rates you define in YAML
Prometheus + YAML rates
→ Engine computes: quantity × rate = total_cost
→ BillingLineItem(product_type="SELF_KAFKA_COMPUTE", total_cost=$36.00)
→ Engine allocates $36.00 across identities
Constructed cost math¶
Compute costs (fixed quantity)¶
Compute is a fixed cost based on the number of brokers (or instances). There is no Prometheus query — the quantity comes from your config.
| Input | Value |
|---|---|
broker_count |
3 |
compute_hourly_rate |
$0.50 |
One BillingLineItem per day with product_type = SELF_KAFKA_COMPUTE.
Storage costs (gauge metric, averaged)¶
Storage is measured as a point-in-time value (a Prometheus gauge). The engine
queries kafka_log_log_size across the day, averages all samples, converts
bytes to GiB, and charges per GiB-hour.
avg_bytes = mean(all prometheus samples for the day)
avg_gib = avg_bytes ÷ 1,073,741,824
daily_cost = avg_gib × 24 hours × storage_per_gib_hourly
| Input | Value |
|---|---|
| Prometheus samples (3 in this example) | 107,374,182,400 bytes (100 GiB), 112,742,891,520 bytes (105 GiB), 107,374,182,400 bytes (100 GiB) |
storage_per_gib_hourly |
$0.0001 |
avg_bytes = (107,374,182,400 + 112,742,891,520 + 107,374,182,400) ÷ 3
= 109,163,752,107 bytes
avg_gib = 109,163,752,107 ÷ 1,073,741,824 = 101.67 GiB
daily_cost = 101.67 × 24 × $0.0001 = $0.2440
Why average, not sum?
Storage is not cumulative. A disk holding 100 GiB at hour 0 and 100 GiB at hour 1 does not mean you used 200 GiB — you used 100 GiB for 2 hours. Averaging across samples gives the representative GiB held over the day.
Network costs (counter metric, summed)¶
Network bytes are cumulative counters. The engine queries
kafka_server_brokertopicmetrics_bytesin_total (and bytesout) using
increase() over 1-hour windows, then sums all the hourly deltas to get the
total bytes transferred in a day.
total_bytes = sum(all hourly increase values for the day)
total_gib = total_bytes ÷ 1,073,741,824
daily_cost = total_gib × network_per_gib
| Input | Value |
|---|---|
| Hourly increases (24 values) | Sum = 53,687,091,200 bytes |
network_ingress_per_gib |
$0.01 |
Why sum, not average?
Network bytes are a running total. increase() gives bytes transferred in
each hour. Summing gives total daily transfer. This is the actual volume
that crossed the network.
Generic metrics: the three quantity types¶
The generic plugin generalizes the above patterns:
| Quantity type | Math | Rate unit |
|---|---|---|
fixed |
count × 24 × rate |
$/instance/hour |
storage_gib |
avg(query) ÷ 2^30 × 24 × rate |
$/GiB/hour |
network_gib |
sum(increase(query)) ÷ 2^30 × rate |
$/GiB |
Cost allocation¶
Once a billing line exists (either from the vendor API or constructed), the engine allocates it across identities. This section explains the strategies and the fallback chain.
Even split¶
Divides the cost equally among a set of identities. Used for shared infrastructure costs where no per-identity usage metric exists.
Rounding: All amounts are quantized to 4 decimal places ($0.0001). After dividing, the engine distributes any rounding remainder one unit at a time across the leading identities to guarantee the sum equals the total.
Example: $10.00 across 3 identities
base = $10.0000 ÷ 3 = $3.3333 (quantized to 4 decimals)
sum = $3.3333 × 3 = $9.9999
diff = $10.0000 - $9.9999 = $0.0001
Final:
identity-a: $3.3334 (+$0.0001 remainder)
identity-b: $3.3333
identity-c: $3.3333
total: $10.0000 ✓
Usage ratio¶
Divides the cost proportionally to a usage metric (e.g., bytes produced). Used for network costs and other usage-driven costs.
Rounding: Same approach — quantize each amount, then distribute remainder.
Example: $100.00 split by bytes
| Identity | Bytes | Ratio | Raw amount | After rounding |
|---|---|---|---|---|
| team-a | 500 GiB | 0.500 | $50.0000 | $50.0000 |
| team-b | 300 GiB | 0.300 | $30.0000 | $30.0000 |
| team-c | 200 GiB | 0.200 | $20.0000 | $20.0000 |
| Total | 1000 GiB | 1.000 | $100.0000 | $100.0000 |
When ratios don't divide evenly:
| Identity | Bytes | Ratio | Raw amount | Quantized | Adjusted |
|---|---|---|---|---|---|
| team-a | 333 | 0.3333... | $33.3333... | $33.3333 | $33.3334 |
| team-b | 333 | 0.3333... | $33.3333... | $33.3333 | $33.3333 |
| team-c | 334 | 0.3340... | $33.4000... | $33.3334 | $33.3333 |
| Total | 1000 | $100.0000 | $99.9999 | $100.0000 |
The $0.0001 remainder goes to the first identity. The sum always equals the input.
The allocation chain (CostType tagging)¶
Every chargeback row carries a cost_type field: either USAGE or SHARED.
This isn't just a label — it indicates why the cost was allocated that way.
- USAGE — Allocated based on actual measured consumption (bytes, CFUs, queries)
- SHARED — Allocated because no per-identity usage metric was available, so the cost was spread evenly as a shared infrastructure charge
Tiered fallback (ChainModel)¶
Most allocators use a tiered fallback chain. The engine tries each tier in order and uses the first one that succeeds (has data to work with).
Tier 0: Usage ratio (metrics available?)
↓ no metrics
Tier 1: Even split across active identities (identities known?)
↓ no active identities
Tier 2: Even split across all tenant identities for the period
↓ no identities at all
Tier 3: Terminal — allocate to the resource ID itself
Each chargeback row records which tier was used in the allocation_detail field.
Common values:
| Detail | Meaning |
|---|---|
USAGE_RATIO_ALLOCATION |
Tier 0 succeeded — proportional allocation by metrics |
NO_METRICS_LOCATED |
Tier 0 failed — no metrics available, fell back to even split |
NO_ACTIVE_IDENTITIES_LOCATED |
Tier 1 failed — no active identities, used period-wide set |
NO_IDENTITIES_LOCATED |
All tiers failed — allocated to resource or UNALLOCATED |
NO_USAGE_FOR_ACTIVE_IDENTITIES |
Metrics exist but all values are zero |
Composite allocation (CKU model)¶
Some costs use a hybrid model that splits the cost into portions before allocating. The Kafka CKU model is the primary example:
Total CKU cost: $100.00
├── 70% ($70.00) → Usage ratio chain (bytes in + bytes out)
│ ├── Tier 0: proportional by bytes → USAGE rows
│ ├── Tier 1: even split (active) → SHARED rows
│ └── ...
└── 30% ($30.00) → Shared chain (even split)
├── Tier 0: even split (active) → SHARED rows
└── ...
The ratios are configurable via allocator_params.kafka_cku_usage_ratio and
kafka_cku_shared_ratio. Each portion runs its own fallback chain independently.
Example output (3 identities, $100 CKU cost, 70/30 split):
| Identity | Usage portion (70%) | Shared portion (30%) | Total |
|---|---|---|---|
| team-a (50% of bytes) | $35.00 (USAGE) | $10.00 (SHARED) | $45.00 |
| team-b (30% of bytes) | $21.00 (USAGE) | $10.00 (SHARED) | $31.00 |
| team-c (20% of bytes) | $14.00 (USAGE) | $10.00 (SHARED) | $24.00 |
| Total | $70.00 | $30.00 | $100.00 |
Each row's metadata includes composition_index (0 = usage portion, 1 = shared
portion) and composition_ratio (0.70 or 0.30) for auditability.
The UNALLOCATED identity¶
When the engine cannot attribute a cost to any known identity, it allocates to
a synthetic identity called UNALLOCATED with identity_type = "system".
This happens when:
- No identities exist for a resource in any scope (resource_active, metrics_derived, tenant_period)
- Allocation fails after retries — the allocation retry limit was exceeded
- Org-wide costs — costs like
AUDIT_LOG_READandSUPPORTthat apply to the entire organization, not a specific resource
UNALLOCATED is automatically excluded from tenant-period identity sets, so it never receives costs via even-split. It only receives costs through the terminal tier of a chain or through explicit org-wide allocation.
UNALLOCATED is not an error
For org-wide costs, allocating to UNALLOCATED is the correct behavior. For
resource-specific costs, it indicates missing identity data — check your
identity source configuration and the allocation_detail field on the
chargeback rows.
Active fraction adjustment¶
Before allocation, costs are adjusted for resource lifecycle. If a resource was
only active for part of the billing window (e.g., it was created mid-day or
deleted mid-day), the engine computes an active_fraction and adjusts the
cost proportionally.
A resource created at noon on a daily billing window has active_fraction = 0.5,
so only half the daily cost is allocated. The remaining half stays with the
billing line (it was incurred but not attributable to this resource for the
full window).
Precision guarantees¶
All cost arithmetic uses Python Decimal with 4-decimal precision ($0.0001).
Floating-point is never used for money.
Guarantees:
-
Sum-to-total — The sum of all ChargebackRows for a billing line always equals the input
split_amount(which istotal_cost × active_fraction). Rounding remainders are distributed deterministically. -
Deterministic — Given the same inputs (billing lines, identities, metrics), allocation produces identical output. The remainder distribution is order-dependent (first identity gets the extra cent), and identity lists are sorted before allocation.
-
Auditable — Every row includes
allocation_method(even_split, usage_ratio, terminal),allocation_detail(which tier fired), andmetadata(chain_tier, composition_index, composition_ratio, usage ratio per identity).