Costs & Billing

gptcgt is designed for cost transparency: every task shows what it cost, and five independent safety layers guard against accidental overspend.

Live Cost Tracking

At the bottom of the chat panel, a live cost summary updates after every task. The orchestrator computes the cost from the task's input and output token counts and current provider pricing. You see:

Model used and provider
Input tokens and output tokens
Exact cost in USD
Running session total
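As a rough illustration of the arithmetic behind that summary, here is a minimal sketch. The model name, the per-token prices, and the `price_task` and `SessionTotal` names are all hypothetical and are not gptcgt's internal API.

```python
from dataclasses import dataclass

# Hypothetical per-token prices in USD. Real values come from provider pricing.
PRICES = {
    "example-model": {"input": 3.00 / 1_000_000, "output": 15.00 / 1_000_000},
}


@dataclass
class TaskCost:
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float


def price_task(model, input_tokens, output_tokens, prices=PRICES):
    """Multiply each token count by its per-token rate and sum the two parts."""
    rate = prices[model]
    cost = input_tokens * rate["input"] + output_tokens * rate["output"]
    return TaskCost(model, input_tokens, output_tokens, cost)


class SessionTotal:
    """Accumulates task costs into a running session total."""

    def __init__(self):
        self.total_usd = 0.0

    def add(self, task):
        self.total_usd += task.cost_usd
        return self.total_usd
```

Input and output tokens are priced separately because providers typically charge different rates for each.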

Dynamic Price Syncing

On startup, gptcgt fetches the latest model pricing from LiteLLM's pricing database (with a 1.5-second timeout). When the fetch succeeds, your cost calculations reflect provider rates as of startup rather than stale data from the last release.
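A startup fetch with a short timeout might look like the sketch below. The URL is LiteLLM's public pricing file as we understand it; whether gptcgt reads this exact file is an assumption. The fallback to a bundled table when the fetch fails is also an assumption, since the behavior on timeout is not documented here. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed location of LiteLLM's pricing JSON; verify before relying on it.
PRICING_URL = (
    "https://raw.githubusercontent.com/BerriAI/litellm/main/"
    "model_prices_and_context_window.json"
)


def fetch_remote_pricing(url=PRICING_URL, timeout=1.5):
    """Download and parse the pricing file, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def load_pricing(fetch=fetch_remote_pricing, bundled=None):
    """Return (pricing, source). Falls back to `bundled` on failure (assumed behavior)."""
    try:
        return fetch(), "remote"
    except (OSError, ValueError):
        # OSError covers network errors and timeouts; ValueError covers bad JSON.
        return dict(bundled or {}), "bundled"
```

Passing the fetcher in as a parameter keeps the timeout and fallback logic testable without network access.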

The 5 Safety Layers

Each layer works independently, so if one fails, the others still apply.

Layer 1 — Per-Task Limits (Config)

Hard limits set in the global config. The first two cap an individual task; the third caps total daily spend:

# ~/.gptcgt/global.toml
max_spend_per_task = 2.0      # $2 max per individual task
max_tokens_per_task = 500000  # 500K token cap
daily_spend_limit = 10.0      # $10/day hard stop (BYOK)
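To make the semantics of these three keys concrete, here is a minimal sketch of a limit check. The key names and values match the config above; the `check_limits` function and the choice to treat "over the limit" as strictly greater than are assumptions.

```python
# Values mirror the config keys shown above.
CONFIG = {
    "max_spend_per_task": 2.0,
    "max_tokens_per_task": 500000,
    "daily_spend_limit": 10.0,
}


def check_limits(config, task_cost_usd, task_tokens, spent_today_usd):
    """Return the name of the first limit that would be exceeded, or None."""
    if task_cost_usd > config["max_spend_per_task"]:
        return "max_spend_per_task"
    if task_tokens > config["max_tokens_per_task"]:
        return "max_tokens_per_task"
    # The daily limit counts what was already spent today plus this task.
    if spent_today_usd + task_cost_usd > config["daily_spend_limit"]:
        return "daily_spend_limit"
    return None
```

For example, a $1.50 task is within the per-task cap but would still be stopped if $9.00 had already been spent that day.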

Layer 2 — Autonomous Loop Budget Guard

In autonomous mode, the total spend is checked before every subtask. If the cumulative cost exceeds the budget, execution pauses and you are notified.
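The check-before-each-subtask pattern can be sketched as follows. This is an illustration under assumptions, not gptcgt's implementation: `run_autonomous` is a hypothetical name, and each subtask is modeled as a callable that returns its cost in USD.

```python
def run_autonomous(subtasks, budget_usd):
    """Run subtasks in order, checking cumulative spend before each one."""
    spent = 0.0
    completed = 0
    for run in subtasks:
        # Guard runs before the subtask starts, never in the middle of one.
        if spent > budget_usd:
            return {"status": "paused", "spent": spent, "completed": completed}
        spent += run()
        completed += 1
    return {"status": "done", "spent": spent, "completed": completed}
```

Because the check runs before a subtask rather than during it, the subtask that crosses the budget still completes. In this sketch the overshoot is bounded by the cost of one subtask, which is one reason a per-task cap is useful alongside a loop budget.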

Layer 3 — Credit Check (Managed)

Before every task using Managed Credits, the system performs an atomic credit check:

Checks if you have enough credits for the selected mode
If not, suggests a cheaper mode you can afford
Uses SELECT ... FOR UPDATE row locking to prevent race conditions
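The decision part of this check might look like the sketch below. Only "Scout" is a mode name taken from this page; the other mode names, all credit costs, and the `check_credits` function are hypothetical. The sketch assumes the balance was read inside a transaction that holds the row lock, and it does not show the database side.

```python
# Hypothetical credit cost per mode. Only "scout" is named in the docs.
MODE_COSTS = {"scout": 1, "standard": 5, "deep": 20}


def check_credits(balance, mode, mode_costs=MODE_COSTS):
    """Allow the task if affordable; otherwise suggest the best affordable mode."""
    cost = mode_costs[mode]
    if balance >= cost:
        return {"allowed": True, "mode": mode, "suggestion": None}
    # Every mode whose cost fits within the current balance.
    affordable = [m for m, c in mode_costs.items() if c <= balance]
    # Suggest the most expensive mode that still fits, if any.
    suggestion = max(affordable, key=mode_costs.get) if affordable else None
    return {"allowed": False, "mode": mode, "suggestion": suggestion}
```

The row lock matters because two concurrent tasks could otherwise both read the same balance and both pass the check.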

Layer 4 — Spending Caps (Managed)

Server-side enforcement with graduated warnings:

80% used — Yellow warning indicator
95% used — Orange critical indicator
100% used — Red: all API requests blocked
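The graduated thresholds map onto a simple classification. The sketch below uses the percentages listed above; the `cap_status` name, the "ok" label for usage below 80%, and the inclusive boundaries are assumptions.

```python
def cap_status(used, cap):
    """Classify usage against a spending cap using the 80/95/100% thresholds."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    ratio = used / cap
    if ratio >= 1.0:
        return "red"     # all API requests blocked
    if ratio >= 0.95:
        return "orange"  # critical indicator
    if ratio >= 0.80:
        return "yellow"  # warning indicator
    return "ok"
```

Checking the highest threshold first means each usage level maps to exactly one status.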

Email warnings are sent (max once per 24 hours) when your cap is hit.

Layer 5 — Smart Model Routing

The router automatically selects cheaper models for simple tasks. If you ask a quick question (complexity 1-3), it routes to a fast, cheap model instead of burning tokens on GPT-4 or Claude Opus.
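A threshold-based router of this kind can be sketched in a few lines. The 1-3 band comes from the text above; the upper bound of 10 on the complexity scale, the model identifiers, and the `route_model` name are assumptions, and how gptcgt scores complexity is not shown here.

```python
def route_model(complexity, cheap_model="fast-cheap-model", strong_model="frontier-model"):
    """Send low-complexity tasks (1-3) to the cheap model, the rest to the strong one."""
    if not 1 <= complexity <= 10:  # the 1-10 scale is an assumption
        raise ValueError("complexity must be between 1 and 10")
    return cheap_model if complexity <= 3 else strong_model
```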

Credit System (Managed Mode)

If you use a gptcgt subscription, costs are abstracted into credits. As a rough guide, 1 credit ≈ $0.01 of blended compute.

Monthly allowance — Credits reset each billing cycle
Overage protection — Configurable: either halt at 0 or allow pay-as-you-go overage
Auto-downgrade — If enabled, the system suggests Scout mode instead of blocking when credits run low
PAYG top-ups — Purchase additional credits that never expire
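The two overage settings can be illustrated with a small sketch. The function names and the return shape are hypothetical. The $0.01 figure is the approximate rate quoted above, so the dollar estimate is a rough guide and not a billing formula.

```python
USD_PER_CREDIT = 0.01  # approximate blended rate from the docs


def approx_usd(credits):
    """Rough dollar equivalent of a credit amount."""
    return credits * USD_PER_CREDIT


def settle(balance, cost, allow_overage):
    """Return (allowed, new_balance, overage_credits) for a task costing `cost` credits."""
    if cost <= balance:
        return True, balance - cost, 0
    if allow_overage:
        # Pay-as-you-go: drain the balance and bill the remainder as overage.
        return True, 0, cost - balance
    # Halt at 0: refuse the task and leave the balance untouched.
    return False, balance, 0
```

With 3 credits left and a 5-credit task, the halt setting blocks the task, while the overage setting runs it and records 2 credits of overage.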

BYOK Cost Tracking

With your own API keys, gptcgt tracks costs locally but doesn't manage billing. You control spend through the config-level limits (Layer 1) and monitor via the live cost display. Routing telemetry is stored in .gptcgt/memory.json.