Costs & Billing
gptcgt is built for cost transparency: the spend for every task is shown as it happens, and 5 independent safety layers guard against accidental overspend.
Live Cost Tracking
At the bottom of the chat panel, a live cost summary updates after every task. The orchestrator computes exact input/output token counts against current provider pricing. You see:
- Model used and provider
- Input tokens and output tokens
- Exact cost in USD
- Running session total
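The per-task arithmetic behind a summary like this can be sketched in a few lines of Python. This is a minimal illustration, not gptcgt's orchestrator code: the names `ModelPrice`, `task_cost`, and `SessionTracker` are hypothetical, and the prices in the usage note are placeholders rather than real provider rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical price record: USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def task_cost(input_tokens: int, output_tokens: int, price: ModelPrice) -> float:
    """Cost in USD for one task, from token counts and per-million prices."""
    return (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000


class SessionTracker:
    """Keeps the running session total and formats one summary line per task."""

    def __init__(self) -> None:
        self.total = 0.0

    def record(self, model: str, provider: str, input_tokens: int,
               output_tokens: int, price: ModelPrice) -> str:
        cost = task_cost(input_tokens, output_tokens, price)
        self.total += cost
        return (
            f"{provider}/{model}: {input_tokens} in, {output_tokens} out, "
            f"${cost:.4f} (session ${self.total:.4f})"
        )
```

For example, with placeholder prices of $3 and $15 per million input and output tokens, a task using 1,000 input tokens and 500 output tokens costs (1000 × 3 + 500 × 15) / 1,000,000 = $0.0105.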
Dynamic Price Syncing
On startup, gptcgt fetches the latest model pricing from LiteLLM's pricing database (with a 1.5-second timeout), so that cost calculations reflect current provider rates rather than stale data from the last release.
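A startup fetch with a short timeout might look roughly like the sketch below. Several things here are assumptions rather than documented gptcgt behavior: the fallback to a bundled snapshot when the fetch fails, the `BUNDLED_PRICES` table and its key names, and the function names. The URL is passed in by the caller because the exact endpoint is not given here.

```python
import json
import urllib.request

# Assumption: a price snapshot shipped with the release, used only as a fallback.
BUNDLED_PRICES = {
    "example-model": {
        "input_cost_per_token": 1e-06,
        "output_cost_per_token": 2e-06,
    }
}


def fetch_json(url: str, timeout: float) -> dict:
    """Download and parse a JSON document, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def load_prices(url: str, fetch=fetch_json, timeout: float = 1.5):
    """Return (prices, source), where source is "live" or "bundled".

    Network errors and timeouts are OSError subclasses and malformed JSON
    raises ValueError; either one triggers the assumed bundled fallback.
    """
    try:
        return fetch(url, timeout), "live"
    except (OSError, ValueError):
        return BUNDLED_PRICES, "bundled"
```

Passing the fetch function as a parameter keeps the timeout path easy to exercise in tests without touching the network.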
The 5 Safety Layers
Each layer works independently, so if one fails, the others still apply.
Layer 1 — Per-Task Limits (Config)
Hard limits on any individual task:
# ~/.gptcgt/global.toml
max_spend_per_task = 2.0 # $2 max per individual task
max_tokens_per_task = 500000 # 500K token cap
daily_spend_limit = 10.0 # $10/day hard stop (BYOK)
Layer 2 — Autonomous Loop Budget Guard
In autonomous mode, total spend is checked before every subtask. If the cumulative cost exceeds the budget, execution pauses and you are notified.
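The check-before-each-subtask pattern can be sketched as follows. This is an illustration under assumptions, not the actual loop: `run_autonomous` and `run_subtask` are hypothetical names, and `run_subtask` is assumed to return the USD cost of the subtask it just ran.

```python
def run_autonomous(subtasks, budget_usd, run_subtask):
    """Run subtasks in order, checking cumulative spend before each one.

    Returns a dict describing whether the loop finished or paused.
    """
    spent = 0.0
    completed = []
    for task in subtasks:
        # Guard runs BEFORE the subtask: pause once spend has exceeded the budget.
        if spent > budget_usd:
            return {
                "status": "paused",
                "spent": spent,
                "completed": completed,
                "pending": task,
            }
        spent += run_subtask(task)
        completed.append(task)
    return {"status": "done", "spent": spent, "completed": completed, "pending": None}
```

Because the check runs before a subtask rather than during it, a single expensive subtask can still push the total past the budget in this sketch; the loop only stops at the next check.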
Layer 3 — Credit Check (Managed)
Before every task using Managed Credits, the system performs an atomic credit check:
- Checks if you have enough credits for the selected mode
- If not, suggests a cheaper mode you can afford
- Uses SELECT ... FOR UPDATE row locking to prevent race conditions
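The decision logic in the first two bullets can be sketched in plain Python. Everything specific here is hypothetical: the mode names other than Scout, the per-mode credit costs, and the function name `check_credits`. The row locking from the third bullet is deliberately left out; in a real service the balance read and the deduction would happen inside one database transaction.

```python
# Hypothetical credit cost per mode; real values are not given in this document.
MODE_COSTS = {"scout": 1, "standard": 5, "deep": 20}


def check_credits(balance: int, mode: str, mode_costs=MODE_COSTS) -> dict:
    """Allow the task if the balance covers the mode, else suggest a cheaper one."""
    cost = mode_costs[mode]
    if balance >= cost:
        return {"allowed": True, "mode": mode, "suggestion": None}
    # Among the modes the balance can cover, suggest the most capable (priciest) one.
    affordable = [m for m, c in mode_costs.items() if c <= balance]
    suggestion = max(affordable, key=mode_costs.get) if affordable else None
    return {"allowed": False, "mode": mode, "suggestion": suggestion}
```

With these placeholder costs, a balance of 10 credits cannot cover the 20-credit mode, so the sketch suggests the 5-credit mode instead; a balance of 0 yields no suggestion at all.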
Layer 4 — Spending Caps (Managed)
Server-side enforcement with graduated warnings:
- 80% used — Yellow warning indicator
- 95% used — Orange critical indicator
- 100% used — Red: all API requests blocked
Email warnings are sent (max once per 24 hours) when your cap is hit.
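The graduated thresholds and the email rate limit can be expressed as two small functions. This is a sketch of the rules as described above, not the server's code; `cap_status` and `should_email` are hypothetical names, and treating each threshold as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def cap_status(used: float, cap: float) -> str:
    """Map usage against a spending cap to a warning level."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    ratio = used / cap
    if ratio >= 1.0:
        return "red"      # all API requests blocked
    if ratio >= 0.95:
        return "orange"   # critical
    if ratio >= 0.80:
        return "yellow"   # warning
    return "ok"


def should_email(last_sent: Optional[datetime], now: datetime) -> bool:
    """Allow at most one warning email per 24 hours."""
    return last_sent is None or now - last_sent >= timedelta(hours=24)
```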
Layer 5 — Smart Model Routing
The router automatically selects cheaper models for simple tasks. If you ask a quick question (complexity 1-3), it routes to a fast, cheap model instead of burning tokens on GPT-4 or Claude Opus.
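A threshold-based router along these lines might look like the sketch below. The only detail taken from the description above is that complexity 1-3 goes to a cheap model; how complexity is scored is not covered, and the function name and both model names are placeholders.

```python
def route_model(complexity: int,
                cheap_model: str = "fast-cheap-model",
                strong_model: str = "large-model") -> str:
    """Pick a model name from a complexity score (scores start at 1)."""
    if complexity < 1:
        raise ValueError("complexity must be at least 1")
    # Simple tasks (complexity 1-3) go to the cheaper model.
    return cheap_model if complexity <= 3 else strong_model
```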
Credit System (Managed Mode)
If using a gptcgt subscription, costs are abstracted into credits. Roughly, 1 Credit ≈ $0.01 of blended compute.
- Monthly allowance — Credits reset each billing cycle
- Overage protection — Configurable: either halt at 0 or allow pay-as-you-go overage
- Auto-downgrade — If enabled, the system suggests Scout mode instead of blocking when credits run low
- PAYG top-ups — Purchase additional credits that never expire
BYOK Cost Tracking
With your own API keys, gptcgt tracks costs locally but doesn't manage billing. You control spend through the config-level limits (Layer 1) and monitor via the live cost display. Routing telemetry is stored in .gptcgt/memory.json.
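To make the config-level limits concrete, here is a minimal sketch of how the three Layer 1 values could be checked against a task. The dictionary mirrors the keys in ~/.gptcgt/global.toml; the function name `check_limits` and the point at which such a check runs are assumptions, not documented behavior.

```python
# Values mirror the Layer 1 keys in ~/.gptcgt/global.toml.
LIMITS = {
    "max_spend_per_task": 2.0,
    "max_tokens_per_task": 500_000,
    "daily_spend_limit": 10.0,
}


def check_limits(limits: dict, task_cost: float, task_tokens: int,
                 spent_today: float):
    """Return the name of the first limit that would be exceeded, or None."""
    if task_cost > limits["max_spend_per_task"]:
        return "max_spend_per_task"
    if task_tokens > limits["max_tokens_per_task"]:
        return "max_tokens_per_task"
    if spent_today + task_cost > limits["daily_spend_limit"]:
        return "daily_spend_limit"
    return None
```

For instance, a $1.00 task arriving after $9.50 of spend that day would trip `daily_spend_limit` in this sketch, even though it is well under the per-task cap.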