Autonomous Mode
Autonomous mode lets you give gptcgt a high-level goal — like “Build a REST API for user management” — and walk away. The system plans, implements, tests, and iterates without human intervention, pausing only when it needs your input or reaches a safety boundary.
How It Works
- Plan Generation: The orchestrator drafts a project plan with numbered subtasks, stored in .gptcgt/phase.md
- Subtask Execution Loop: Each subtask flows through the full DAG pipeline (intent analysis → context gathering → model routing → code generation → testing → arbiter verification)
- Self-Healing: If a test fails, the TesterAgent regenerates the code (up to 3 attempts) using the failure output as feedback
- Phase Tracking: Progress is written to .gptcgt/phase.md in real time so the AI always knows where it is
- Completion Summary: When all subtasks finish, you get a summary of what was done, what failed, and what it cost
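The self-healing step can be pictured as a bounded retry loop. The following is a minimal sketch, not gptcgt's actual implementation: `CheckResult`, `run_subtask`, and the `generate` / `run_tests` callables are hypothetical names, and the sketch reads "up to 3 attempts" as three generate-and-test attempts in total, which is only one interpretation of the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 3  # retry limit stated in the list above


@dataclass
class CheckResult:
    """Hypothetical container for one test run."""
    passed: bool
    output: str  # test runner output, reused as feedback on failure


def run_subtask(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], CheckResult],
    subtask: str,
) -> Tuple[bool, int]:
    """Generate code for one subtask, retrying with the failure output.

    Returns (succeeded, attempts_used).
    """
    feedback: Optional[str] = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        code = generate(subtask, feedback)  # feedback is None on the first try
        result = run_tests(code)
        if result.passed:
            return True, attempt
        feedback = result.output  # feed the failure into the next attempt
    return False, MAX_ATTEMPTS
```

In this sketch a subtask that still fails after the last attempt is reported as failed rather than retried indefinitely, which matches the idea of a completion summary listing what failed.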
Safety Boundaries
Autonomous mode has 4 independent safety checks that prevent runaway execution:
1. Budget Guard
Before every subtask, the system checks total spend against your configured limits. If spend exceeds the budget, execution pauses immediately.
# In ~/.gptcgt/global.toml
max_spend_per_task = 2.0 # $2 max per individual task
daily_spend_limit = 10.0 # $10/day hard stop (BYOK mode)
2. Iteration Cap
A hard ceiling on how many subtasks the autonomous loop will execute before pausing for your review.
# In ~/.gptcgt/global.toml
max_autonomous_iterations = 50
3. Token Cap
Each individual task is capped at 500,000 tokens by default. This prevents accidental context window explosions when processing large files.
max_tokens_per_task = 500000
4. User Cancellation
Press Ctrl+C or Escape at any time. The system cancels the current subtask gracefully and preserves all work done so far.
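To make the four checks concrete, here is a rough sketch of how they could be evaluated before each subtask. It is an illustration under assumptions, not gptcgt's code: `Limits`, `RunState`, and `pause_reason` are hypothetical names, the defaults simply mirror the values shown in the configuration snippets above, and the exact comparison operators (strict versus inclusive) are guesses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    """Field names follow the config keys above; defaults mirror the examples."""
    max_spend_per_task: float = 2.0
    daily_spend_limit: float = 10.0
    max_autonomous_iterations: int = 50
    max_tokens_per_task: int = 500_000


@dataclass
class RunState:
    """Hypothetical bookkeeping for one autonomous run."""
    task_spend: float = 0.0
    daily_spend: float = 0.0
    iterations: int = 0
    task_tokens: int = 0
    cancelled: bool = False


def pause_reason(state: RunState, limits: Limits) -> Optional[str]:
    """Return why the loop should pause, or None if it may continue."""
    if state.cancelled:
        return "cancelled by user"
    if state.task_spend > limits.max_spend_per_task:
        return "per-task budget exceeded"
    if state.daily_spend > limits.daily_spend_limit:
        return "daily budget exceeded"
    if state.iterations >= limits.max_autonomous_iterations:
        return "iteration cap reached"
    if state.task_tokens > limits.max_tokens_per_task:
        return "token cap exceeded"
    return None
```

Each check reads only its own counter, so one guard tripping does not depend on any other. That is the sense in which the checks are independent.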
Agent Communication
In autonomous mode, multiple agents collaborate via a PubSub message bus:
- Orchestrator — Plans and coordinates the overall workflow
- Coder Agent — Generates code changes as unified diffs
- Tester Agent — Generates and runs tests in an isolated sandbox
- Arbiter — Scores and validates each change before approval
Agents communicate through structured AgentMessage objects with type, sender, recipient, and payload. Messages are logged in the bottom-left log panel so you can see exactly what each agent is doing.
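As an illustration of the message shape described above, the sketch below models an `AgentMessage` with the four named fields and a tiny in-process publish/subscribe bus. Only the field names come from the text. The `MessageBus` class, its methods, the log line format, and the example message type `diff_ready` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class AgentMessage:
    """Fields as named in the text; the concrete types are assumed."""
    type: str
    sender: str
    recipient: str
    payload: Dict[str, Any] = field(default_factory=dict)


Handler = Callable[[AgentMessage], None]


class MessageBus:
    """Hypothetical in-process PubSub bus keyed by recipient name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}
        self.log: List[str] = []  # stands in for the log panel

    def subscribe(self, recipient: str, handler: Handler) -> None:
        self._subscribers.setdefault(recipient, []).append(handler)

    def publish(self, message: AgentMessage) -> None:
        # Record every message, then deliver it to the recipient's handlers.
        self.log.append(f"{message.sender} -> {message.recipient}: {message.type}")
        for handler in self._subscribers.get(message.recipient, []):
            handler(message)


bus = MessageBus()
inbox: List[AgentMessage] = []
bus.subscribe("tester", inbox.append)
bus.publish(AgentMessage("diff_ready", "coder", "tester", {"diff": "..."}))
```

Logging inside `publish` means every message is recorded whether or not anyone is subscribed, which is one simple way to get a complete activity trail.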
Agent Memory
The system maintains memory across sessions:
- .gptcgt/phase.md: Project file map with line counts, modification dates, and development phases
- .gptcgt/project.md: Auto-detected tech stack summary (language, framework, test runner, linter)
- .gptcgt/memory.json: Telemetry entries recording which models were used, costs, and success rates
- .gptcgt/agents/tester.md: The TesterAgent's memory file of failure patterns it has learned from
This memory prevents the agents from repeating the same mistakes and helps them understand the project's structure without needing to re-analyze every time.
Crash Recovery
If gptcgt crashes mid-autonomous-run, your work isn't lost:
- A PID-locked running.lock file detects the crash on next startup
- State is auto-saved atomically to .gptcgt/recovery/state.json
- Unapplied diffs are backed up to .gptcgt/recovery/diffs/
- On restart, gptcgt offers to resume from where it left off
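Two of the ideas above, atomic state saves and PID-based crash detection, follow a common pattern that can be sketched as below. This is a generic illustration, not gptcgt's implementation: the function names are hypothetical, the lock file is assumed to contain just the owning process ID, and the liveness check via `os.kill(pid, 0)` is POSIX-only (do not use it on Windows, where `os.kill` behaves differently).

```python
import json
import os
import tempfile
from pathlib import Path


def save_state_atomically(path: Path, state: dict) -> None:
    """Write JSON to a temp file, then rename it over the target.

    os.replace is atomic on the same filesystem, so readers see either
    the old file or the new one, never a half-written file.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def previous_run_crashed(lock_path: Path) -> bool:
    """True if a lock file exists but its recorded PID is no longer running."""
    try:
        pid = int(lock_path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no lock, or unreadable contents: nothing to conclude
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence (POSIX)
    except ProcessLookupError:
        return True  # the process that owned the lock is gone
    except PermissionError:
        return False  # process exists but belongs to another user
    return False
```

A tool following this pattern would write its own PID to the lock file on startup and remove the file on clean exit, so a leftover lock pointing at a dead process signals an interrupted run.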
Best Practices
- Always commit before starting: Run git commit so you have a clean rollback point
- Be specific with your goal: “Build a user registration system with email verification and password reset” works better than “build auth”
- Start with smaller iteration caps while you learn the system's behavior
- Review .gptcgt/phase.md after an autonomous run to understand what was done