Costs & Billing

gptcgt is built for cost transparency: every task shows what it cost, and five independent safety layers guard against accidental overspend.

Live Cost Tracking

At the bottom of the chat panel, a live cost summary updates after every task. The orchestrator prices the task's exact input and output token counts at current provider rates. You see:

  • Model used and provider
  • Input tokens and output tokens
  • Exact cost in USD
  • Running session total
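
As a rough illustration of the arithmetic behind that summary, the sketch below prices one task from its token counts and adds it to a session total. The names TaskUsage and task_cost_usd, the model and provider strings, and the per-token prices are all hypothetical; this is not gptcgt's actual code.

```python
from dataclasses import dataclass


@dataclass
class TaskUsage:
    """Token usage reported for one task (hypothetical structure)."""
    model: str
    provider: str
    input_tokens: int
    output_tokens: int


def task_cost_usd(usage, input_price_per_token, output_price_per_token):
    """Price a task: tokens multiplied by the per-token rate, summed over both directions."""
    return (usage.input_tokens * input_price_per_token
            + usage.output_tokens * output_price_per_token)


# Hypothetical prices, for illustration only: $3 / $15 per million tokens.
usage = TaskUsage("example-model", "example-provider", 12_000, 3_000)
cost = task_cost_usd(usage, 3e-06, 15e-06)

session_total = 0.0
session_total += cost  # the running total grows after every task

print(f"{usage.model} ({usage.provider}): "
      f"{usage.input_tokens} in / {usage.output_tokens} out, "
      f"${cost:.4f}, session ${session_total:.4f}")
```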

Dynamic Price Syncing

On startup, gptcgt fetches the latest model pricing from LiteLLM's pricing database (with a 1.5-second timeout). When the fetch succeeds, cost calculations reflect current provider rates rather than stale data from the last release.
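
A startup sync of this kind might look like the sketch below. Everything here is an assumption for illustration: the URL is where LiteLLM is believed to publish its pricing JSON, the function names are hypothetical, and falling back to bundled prices on a timeout is a guess at reasonable behavior, not something this page documents.

```python
import json
import urllib.request

# Assumed location of LiteLLM's public pricing file; verify before relying on it.
PRICING_URL = (
    "https://raw.githubusercontent.com/BerriAI/litellm/main/"
    "model_prices_and_context_window.json"
)


def fetch_remote_pricing(url=PRICING_URL, timeout=1.5):
    """Download the pricing table, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def load_pricing(bundled, fetch=fetch_remote_pricing):
    """Return remote prices layered over bundled ones.

    Assumption: on a timeout, network error, or malformed JSON, the
    bundled (release-time) prices are used unchanged.
    """
    try:
        remote = fetch()
    except (OSError, ValueError):  # covers URLError, timeouts, bad JSON
        return dict(bundled)
    return {**bundled, **remote}
```

Passing the fetcher in as a parameter keeps the fallback path easy to exercise without touching the network.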

The 5 Safety Layers

Each layer works independently, so if one fails, the others still apply.

Layer 1 — Per-Task Limits (Config)

Hard limits on any individual task, plus a daily cap:

# ~/.gptcgt/global.toml
max_spend_per_task = 2.0      # $2 max per individual task
max_tokens_per_task = 500000  # 500K token cap
daily_spend_limit = 10.0      # $10/day hard stop (BYOK)
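
To make the semantics concrete, here is a minimal sketch of how three such limits could be checked for a task. The Limits class and check_limits function are hypothetical; only the key names and default values come from the config above, and the order of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Mirrors the three config keys above (hypothetical container)."""
    max_spend_per_task: float = 2.0
    max_tokens_per_task: int = 500_000
    daily_spend_limit: float = 10.0


def check_limits(limits, task_cost, task_tokens, spent_today):
    """Return the name of the first limit breached, or None if all pass."""
    if task_cost > limits.max_spend_per_task:
        return "max_spend_per_task"
    if task_tokens > limits.max_tokens_per_task:
        return "max_tokens_per_task"
    if spent_today + task_cost > limits.daily_spend_limit:
        return "daily_spend_limit"
    return None


limits = Limits()
print(check_limits(limits, task_cost=0.50, task_tokens=40_000, spent_today=9.75))
```

In the example call, the task is within both per-task limits but would push the day's total past $10, so the daily limit is the one reported.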

Layer 2 — Autonomous Loop Budget Guard

In autonomous mode, the total spend is checked before every subtask. If the cumulative cost exceeds the budget, execution pauses and you are notified.
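
The guard can be pictured as a check at the top of the subtask loop. The sketch below is an assumption about the shape of that loop, with hypothetical names; each subtask is modeled as a callable that returns its cost in USD.

```python
def run_autonomous(subtasks, budget_usd):
    """Run (name, run_fn) subtasks in order, checking spend before each one.

    Returns (completed_names, spent_usd, paused). `paused` is True when the
    guard stopped the loop because cumulative cost exceeded the budget.
    """
    spent = 0.0
    completed = []
    for name, run in subtasks:
        if spent > budget_usd:  # checked before every subtask
            return completed, spent, True
        spent += run()  # run_fn returns the subtask's cost in USD
        completed.append(name)
    return completed, spent, False


# Hypothetical subtasks costing $0.40 each, with a $1.00 budget.
plan = [(name, lambda: 0.40) for name in ("plan", "edit", "test", "refactor")]
completed, spent, paused = run_autonomous(plan, budget_usd=1.00)
print(completed, paused)  # ['plan', 'edit', 'test'] True
```

Because the check runs before a subtask rather than during it, the subtask that crosses the budget still finishes; the pause takes effect at the next check.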

Layer 3 — Credit Check (Managed)

Before every task using Managed Credits, the system performs an atomic credit check:

  • Checks if you have enough credits for the selected mode
  • If not, suggests a cheaper mode you can afford
  • Uses SELECT ... FOR UPDATE row locking to prevent race conditions
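
The decision logic of the first two bullets can be sketched as a pure function. The per-mode credit costs and the mode names other than Scout are hypothetical, and the sketch omits the database side entirely; per the third bullet, the real check reads the balance under a SELECT ... FOR UPDATE row lock so that concurrent tasks cannot both spend the same credits.

```python
# Hypothetical per-task credit costs. Only "scout" is a mode name taken
# from this page; the other names and all numbers are placeholders.
MODE_COST = {"scout": 5, "standard": 20, "deep": 80}


def check_credits(balance, mode):
    """Return (allowed, mode_to_use).

    If the balance covers the selected mode, it is allowed as-is. Otherwise
    the most expensive mode that is still affordable is suggested, or None
    if nothing is affordable.
    """
    if balance >= MODE_COST[mode]:
        return True, mode
    affordable = [m for m, cost in MODE_COST.items() if cost <= balance]
    if not affordable:
        return False, None
    return False, max(affordable, key=MODE_COST.get)


print(check_credits(30, "deep"))  # (False, 'standard')
```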

Layer 4 — Spending Caps (Managed)

Server-side enforcement with graduated warnings:

  • 80% used — Yellow warning indicator
  • 95% used — Orange critical indicator
  • 100% used — Red: all API requests blocked

An email warning is sent (at most once per 24 hours) when your cap is hit.
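
The graduated thresholds map naturally onto a small classifier. This is a sketch with hypothetical names, assuming each threshold is inclusive (exactly 80% used already counts as yellow).

```python
def cap_status(used, cap):
    """Classify usage against a spending cap using the thresholds above.

    Comparisons are written as multiplications to avoid dividing by the cap.
    """
    if used >= cap:
        return "red"      # 100% used: all API requests blocked
    if used * 100 >= 95 * cap:
        return "orange"   # 95% used: critical indicator
    if used * 100 >= 80 * cap:
        return "yellow"   # 80% used: warning indicator
    return "ok"


for used in (50, 80, 95, 100):
    print(used, cap_status(used, cap=100))
```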

Layer 5 — Smart Model Routing

The router automatically selects cheaper models for simple tasks. If you ask a quick question (complexity 1-3), it routes to a fast, cheap model instead of burning tokens on GPT-4 or Claude Opus.
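
In its simplest form, this routing rule is a threshold on the complexity score. The sketch below assumes the score is an integer starting at 1; the function name and both model identifiers are placeholders, and the real router may weigh more signals than this.

```python
def pick_model(complexity, cheap_model="fast-cheap-model",
               strong_model="frontier-model"):
    """Route complexity 1-3 to the cheap model, anything higher to the strong one."""
    if complexity < 1:
        raise ValueError("complexity is assumed to start at 1")
    return cheap_model if complexity <= 3 else strong_model


print(pick_model(2))  # fast-cheap-model
print(pick_model(7))  # frontier-model
```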

Credit System (Managed Mode)

With a gptcgt subscription, costs are abstracted into credits: 1 credit ≈ $0.01 of blended compute.

  • Monthly allowance — Credits reset each billing cycle
  • Overage protection — Configurable: either halt at 0 or allow pay-as-you-go overage
  • Auto-downgrade — If enabled, the system suggests Scout mode instead of blocking when credits run low
  • PAYG top-ups — Purchase additional credits that never expire

BYOK Cost Tracking

With your own API keys, gptcgt tracks costs locally but doesn't manage billing. You control spend through the config-level limits (Layer 1) and monitor via the live cost display. Routing telemetry is stored in .gptcgt/memory.json.