Hands On System Design Course - Code Everyday

Day 104: Building Cost Allocation and Usage Reporting

Making Every Log Dollar Count in Your Distributed Platform

SystemDR
Sep 22, 2025

What We're Building Today

Today's mission transforms yesterday's metrics collection into actionable financial intelligence. You'll build a cost allocation system that tracks resource consumption per tenant, calculates actual infrastructure costs, and generates detailed usage reports. Think Netflix's internal cost tracking for different service teams or Stripe's precise resource attribution across customer segments.

Key Deliverables:

  • Real-time resource usage tracking by tenant/user

  • Cost attribution engine with configurable pricing models

  • Interactive reporting dashboard with drill-down capabilities

  • Historical usage trends and optimization recommendations

  • Automated billing reports and cost alerts
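To make the first and last deliverables concrete, here is a minimal in-memory sketch of per-tenant usage tracking with a simple budget check. The names `UsageTracker` and `over_budget`, the metric name, and the budget figures are all hypothetical; a real system would persist counters and read them from the metrics pipeline rather than hold them in a dictionary.

```python
from collections import defaultdict


class UsageTracker:
    """Accumulates per-tenant usage counters in memory (illustrative only)."""

    def __init__(self):
        # tenant_id -> metric name -> running total
        self._totals = defaultdict(lambda: defaultdict(float))

    def record(self, tenant_id, metric, amount):
        """Add one usage observation, e.g. gigabytes ingested in a batch."""
        self._totals[tenant_id][metric] += amount

    def usage_for(self, tenant_id):
        """Return a plain-dict snapshot; unknown tenants yield an empty dict."""
        return dict(self._totals.get(tenant_id, {}))


def over_budget(costs, budgets):
    """Return tenant ids whose cost exceeds their budget, sorted for stable output."""
    return sorted(t for t, c in costs.items() if t in budgets and c > budgets[t])


tracker = UsageTracker()
tracker.record("team-a", "ingested_gb", 1.5)
tracker.record("team-a", "ingested_gb", 2.5)
print(tracker.usage_for("team-a"))
print(over_budget({"team-a": 10.0, "team-b": 5.0}, {"team-a": 8.0, "team-b": 5.0}))
```

The alert check is kept separate from the tracker so that budgets can be compared against costs produced by whichever pricing model is configured.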


Core Concepts: The Economics of Distributed Systems

Resource Attribution Challenge

Multi-tenant log processing systems face a complex question: which team or customer should pay for each gigabyte stored, each query executed, or each alert generated? Unlike simple web hosting, distributed systems create shared resource pools where attribution requires sophisticated tracking.

Cost Modeling Approaches

Direct Cost Attribution: Links specific resources (CPU cores, storage volumes) directly to tenants. Simple but often inaccurate for shared infrastructure.
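A rough sketch of direct attribution, assuming each resource carries an optional owner tag: costs are summed per tenant, and anything without an owner lands in an "unallocated" bucket. The field names `tenant` and `monthly_cost` are hypothetical. The size of the unallocated bucket is exactly where this approach loses accuracy on shared infrastructure.

```python
from collections import defaultdict


def direct_attribution(resources):
    """Sum monthly cost per owning tenant.

    resources: list of dicts with 'monthly_cost' and an optional 'tenant' tag.
    Resources with no owner are grouped under 'unallocated'.
    """
    bill = defaultdict(float)
    for resource in resources:
        owner = resource.get("tenant") or "unallocated"
        bill[owner] += resource["monthly_cost"]
    return dict(bill)


inventory = [
    {"name": "es-node-1", "tenant": "payments", "monthly_cost": 100.0},
    {"name": "es-node-2", "tenant": "search", "monthly_cost": 40.0},
    {"name": "kafka-shared", "monthly_cost": 250.0},  # shared, no single owner
]
print(direct_attribution(inventory))
```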

Proportional Allocation: Distributes shared costs according to each tenant's share of a usage metric. Fairer than direct attribution, but the metric must be chosen carefully so tenants cannot game it.
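As a small illustration, proportional allocation reduces to one ratio per tenant: tenant cost = shared cost × tenant usage / total usage. The function name and the even-split fallback for zero recorded usage are assumptions made for this sketch, not part of any particular billing system.

```python
def allocate_proportionally(shared_cost, usage):
    """Split shared_cost across tenants in proportion to their usage.

    usage: dict of tenant -> usage in any single unit (GB, CPU-seconds, ...).
    """
    if not usage:
        return {}
    total = sum(usage.values())
    if total == 0:
        # Assumption: with no recorded usage, split evenly instead of dividing by zero.
        even_share = shared_cost / len(usage)
        return {tenant: even_share for tenant in usage}
    return {tenant: shared_cost * amount / total for tenant, amount in usage.items()}


# A 1000-unit cluster bill split by gigabytes ingested per tenant.
print(allocate_proportionally(1000.0, {"payments": 600, "search": 300, "internal": 100}))
```

Because the split depends entirely on the chosen metric, allocating by ingested gigabytes rewards tenants who ingest little but query heavily, which is the gaming risk noted above.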

Activity-Based Costing: Assigns costs based on actual activities (log ingestion, search queries, retention periods). Most accurate but computationally intensive.
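A minimal sketch of activity-based costing, assuming a flat unit rate per activity. The activity names and rates in `ACTIVITY_RATES` are made up for illustration and should be replaced with figures derived from real infrastructure costs. `Decimal` is used so that currency arithmetic stays exact.

```python
from decimal import Decimal

# Hypothetical unit rates; real values would come from your own cost data.
ACTIVITY_RATES = {
    "ingested_gb": Decimal("0.50"),
    "queries": Decimal("0.01"),
    "retained_gb_days": Decimal("0.002"),
}


def activity_cost(activity_counts, rates=ACTIVITY_RATES):
    """Price each activity as count * unit rate.

    activity_counts: dict of activity name -> int (or Decimal) count.
    Raises KeyError if an activity has no configured rate.
    """
    unknown = set(activity_counts) - set(rates)
    if unknown:
        raise KeyError(f"no rate configured for: {sorted(unknown)}")
    return {name: count * rates[name] for name, count in activity_counts.items()}


breakdown = activity_cost({"ingested_gb": 120, "queries": 3000})
print(breakdown, "total:", sum(breakdown.values()))
```

The computational cost mentioned above comes from collecting the counts, since every ingestion, query, and retention event has to be metered per tenant before this pricing step can run.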

Financial Visibility Impact

Cost transparency drives behavior changes. Teams optimize log verbosity when they see storage costs. Developers tune query patterns when CPU attribution becomes visible. Financial feedback creates natural resource optimization pressure.


Context in Distributed Systems

Real-World Applications

AWS CloudWatch Logs charges per gigabyte ingested and per gigabyte-month stored, with separate pricing for queries based on the amount of data they scan. Your system replicates this model internally.
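An internal version of this model can be sketched as a small configurable pricing object with one rate for each of the three billing dimensions (ingestion, storage, queries). The class name `PricingModel`, its field names, and the rates below are illustrative assumptions; they are not actual AWS prices.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PricingModel:
    """Three-part internal price list (all rates are placeholders)."""
    ingest_per_gb: Decimal
    storage_per_gb_month: Decimal
    query_per_gb_scanned: Decimal

    def monthly_cost(self, ingested_gb, stored_gb, scanned_gb):
        """Total charge for one tenant for one month; inputs are ints or Decimals."""
        return (
            self.ingest_per_gb * ingested_gb
            + self.storage_per_gb_month * stored_gb
            + self.query_per_gb_scanned * scanned_gb
        )


internal_rates = PricingModel(
    ingest_per_gb=Decimal("0.50"),
    storage_per_gb_month=Decimal("0.03"),
    query_per_gb_scanned=Decimal("0.005"),
)
print(internal_rates.monthly_cost(ingested_gb=100, stored_gb=200, scanned_gb=1000))
```

Keeping the rates in a frozen dataclass makes it easy to version price lists and to run the same usage data through different models for comparison.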

Datadog bills based on log volume and retention, providing detailed usage breakdowns. Enterprise customers need similar visibility for internal cost allocation.

Splunk's ingest-based licenses are sized by daily ingestion volume, so large organizations need granular usage tracking to manage licensing costs effectively.

System Integration Points

Your cost allocation system sits between yesterday's metrics collection and next week's optimization engines. It consumes resource utilization data, applies cost models, and produces reports that drive infrastructure decisions.
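The consume, apply, report flow described here can be sketched in a few lines. The record shape `(tenant, metric, amount)`, the function name `build_report`, and the rate values are assumptions for illustration; the actual record format depends on what the metrics collection stage emits.

```python
from collections import defaultdict


def build_report(records, rates):
    """Turn raw utilization records into a per-tenant cost report.

    records: iterable of (tenant, metric, amount) tuples.
    rates:   dict of metric -> unit price.
    Returns rows sorted by cost, highest first.
    """
    costs = defaultdict(float)
    for tenant, metric, amount in records:
        costs[tenant] += amount * rates[metric]

    total = sum(costs.values())
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "tenant": tenant,
            "cost": round(cost, 2),
            "share_pct": round(100 * cost / total, 1) if total else 0.0,
        }
        for tenant, cost in ranked
    ]


records = [("payments", "ingested_gb", 10), ("search", "ingested_gb", 30)]
for row in build_report(records, {"ingested_gb": 0.5}):
    print(row)
```

Sorting by cost puts the largest consumers at the top of the report, which is usually where optimization effort pays off first.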

© 2025 System Design Course