Security & Safety

gptcgt is designed with the principle that AI agents should never be able to harm your system. Multiple independent safety mechanisms work together to ensure the AI stays within boundaries.

Workspace Sandboxing

Every file operation in gptcgt goes through the Workspace security boundary. This is a singleton gatekeeper that:

  • Resolves all paths — Including symlinks and ../ traversals
  • Rejects escapes — Any path outside your project root raises a WorkspaceEscapeError
  • Logs violations — Attempted escapes are logged at CRITICAL level

This means the AI literally cannot read, write, or delete files outside your project directory. No exceptions.

Code Security Scanning

Every code change the AI generates is automatically scanned before being presented to you. Three scanning layers run in sequence:

Layer 1 — Custom Regex (Instant)

Built-in patterns catch common security issues immediately, no external tools needed:

  • SQL injection via f-strings, .format(), or string concatenation (CWE-89)
  • Command injection via os.system() or subprocess.run(shell=True) (CWE-78)
  • Cross-site scripting via innerHTML or dangerouslySetInnerHTML (CWE-79)
  • Hardcoded API keys, passwords, and secrets (CWE-798)
  • Path traversal via open(user_input) (CWE-22)
  • Use of eval() or exec() (CWE-94) and unsafe deserialization via pickle.loads() (CWE-502)
  • Weak cryptography: md5, sha1, DES (CWE-327)
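A sketch of how a regex layer like this can work. The patterns below are a small illustrative subset, not gptcgt's actual ruleset:

```python
import re

# Illustrative patterns only; the real built-in ruleset is larger
PATTERNS = [
    (re.compile(r"os\.system\("), "CWE-78: command injection"),
    (re.compile(r"subprocess\.run\([^)]*shell\s*=\s*True"), "CWE-78: command injection"),
    (re.compile(r"pickle\.loads\("), "CWE-502: unsafe deserialization"),
    (re.compile(r"(api_key|password|secret)\s*=\s*['\"][^'\"]+['\"]", re.I), "CWE-798: hardcoded secret"),
    (re.compile(r"hashlib\.(md5|sha1)\("), "CWE-327: weak hash"),
]

def scan_source(source: str) -> list[str]:
    # Line-by-line match so findings can report a line number
    findings = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for pattern, label in PATTERNS:
            if pattern.search(line):
                findings.append(f"{lineno}: {label}")
    return findings
```

Because this is pure regex matching, it runs in microseconds and needs no external tools, which is what makes the layer "instant".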

Layer 2 — Semgrep (2-5 seconds)

If Semgrep is installed, OWASP Top 10 and language-specific rules are run against the changed files.

Layer 3 — Language Scanners

Bandit for Python, ESLint security plugins for JavaScript, and similar tools are invoked if available.

Security Badge System

Every change gets a badge:

  • 🟢 CLEAN — No security issues detected. Proceed normally.
  • 🟡 WARNING — Potential issues found. Details shown, you decide whether to apply.
  • 🔴 BLOCKED — Critical vulnerability. The AI is asked to auto-fix the change before it is presented to you.
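The mapping from findings to badges can be sketched as follows; the severity field and badge strings are assumptions for illustration:

```python
def badge_for(findings: list[dict]) -> str:
    # No findings at all: safe to proceed normally
    if not findings:
        return "CLEAN"
    # Any critical finding escalates the whole change to BLOCKED
    if any(f["severity"] == "critical" for f in findings):
        return "BLOCKED"
    # Non-critical findings are surfaced but left to your judgment
    return "WARNING"
```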

Auto-Fix Flow for BLOCKED Changes

  1. Critical issue found → the finding is sent back to the AI with a targeted fix request
  2. AI generates a fixed version → re-scan
  3. If clean → present with “Security issue auto-fixed” note
  4. If still blocked after 2 attempts → present with RED warning, you manually acknowledge
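The flow above is a bounded retry loop. A minimal sketch, where scan and request_fix stand in for the scanner and the model round-trip:

```python
def auto_fix(change, findings, scan, request_fix, max_attempts=2):
    # Retry loop for BLOCKED changes: ask the model for a targeted fix, re-scan
    for _ in range(max_attempts):
        change = request_fix(change, findings)   # finding sent back to the AI
        findings = scan(change)                  # re-scan the fixed version
        if not findings:
            return change, "Security issue auto-fixed"
    # Still blocked after max_attempts: surface with a red warning instead
    return change, "BLOCKED: manual acknowledgement required"
```

Bounding the attempts matters: without the cap, a finding the model cannot fix would loop forever and burn tokens.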

E2B Sandbox Execution

Tests and verification run in isolated Firecracker microVMs via E2B (the same technology behind AWS Lambda). This means:

  • Test code cannot access your file system
  • 150ms cold start, $0.083/hr (~$0.002 per verification)
  • Pre-built templates for Python, TypeScript, Rust, Go

If E2B is not configured (no API key), the system falls back to local-only verification: syntax checks via tree-sitter and your local linter.
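A sketch of that fallback decision. Here a plain compile() syntax check stands in for the tree-sitter pass; choose_backend and local_syntax_ok are illustrative names:

```python
import os

def choose_backend() -> str:
    # Sandboxed execution only when an E2B API key is configured
    return "e2b" if os.environ.get("E2B_API_KEY") else "local"

def local_syntax_ok(source: str) -> bool:
    # Local-only fallback: a plain syntax check stands in here for the
    # tree-sitter pass and local linter the real fallback uses
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False
```

The trade-off: the local path never executes the candidate code, so it catches syntax errors but not failing tests.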

LSP Cross-File Reference Verification

After an AI agent renames a function or modifies a symbol, the LSP client checks that all references across the project have been updated. This catches the #1 complaint about AI coding tools: broken multi-file edits.

Example:
  Agent renames process_payment() in payments.py
  → LSP finds references in orders.py:42, checkout.py:18, tests/test_payments.py:7
  → Checks: did the agent update all three?
  → orders.py:42 — UPDATED ✓
  → checkout.py:18 — MISSED ✗
  → Result: incomplete, missed reference at checkout.py:18
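The check itself is set arithmetic over locations: LSP-reported reference sites minus the sites the agent actually edited. A sketch (function name is illustrative):

```python
def find_missed_references(references: list[tuple[str, int]],
                           edited: set[tuple[str, int]]) -> list[tuple[str, int]]:
    # Any LSP-reported reference the agent did not touch is a missed update
    return [ref for ref in references if ref not in edited]
```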

Sensitive Data Protection

All log output (file logs, debug logs, UI) passes through a SensitiveDataFilter that scrubs:

  • OpenAI keys (sk-...)
  • Anthropic keys (sk-ant-...)
  • Google keys (AIza...)
  • xAI keys (xai-...)
  • Groq keys (gsk_...)
  • Generic bearer tokens and passwords

This filter is applied to every log handler, including traceback output. Even if an error occurs while processing your API key, it won't appear in the log files.
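A sketch of such a filter using the standard logging.Filter hook; the patterns shown are an illustrative subset, and the class body is an assumption, not gptcgt's actual implementation:

```python
import logging
import re

# Illustrative subset of the key patterns the filter scrubs
KEY_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_-]+"),   # Anthropic (checked before generic sk-)
    re.compile(r"sk-[A-Za-z0-9_-]+"),       # OpenAI
    re.compile(r"AIza[A-Za-z0-9_-]+"),      # Google
    re.compile(r"gsk_[A-Za-z0-9_-]+"),      # Groq
]

class SensitiveDataFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for pattern in KEY_PATTERNS:
            message = pattern.sub("[REDACTED]", message)
        record.msg, record.args = message, None  # swap in the scrubbed text
        return True  # never drop the record, only scrub it
```

Attaching the filter to every handler (rather than one logger) is what guarantees coverage of tracebacks and third-party log output.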

Crash Recovery

If gptcgt exits unexpectedly:

  • A PID-locked running.lock file distinguishes a crashed previous session from a second concurrent instance
  • Application state is atomically saved via temp file + rename (prevents corruption)
  • Unapplied diffs are backed up and can be restored on next launch
  • Signal handlers (SIGTERM, SIGINT) ensure clean shutdown when killed
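The temp file + rename step relies on the rename being atomic. A minimal sketch of the pattern (JSON state is an assumption for illustration):

```python
import json
import os
import tempfile

def atomic_save(state: dict, path: str) -> None:
    # Write to a temp file in the same directory, then rename over the target.
    # os.replace is atomic on POSIX, so a crash mid-write can never leave a
    # half-written state file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())   # force bytes to disk before the rename
        os.replace(tmp_path, path)  # the all-or-nothing step
    except BaseException:
        os.unlink(tmp_path)
        raise
```

The temp file must live in the same directory as the target: os.replace is only atomic within a single filesystem.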

Content Filtering (Proxy)

For Managed Credit users, requests pass through a server-side content filter that blocks:

  • Prompt injection attempts (“ignore previous instructions”, “enter god mode”)
  • Harmful content requests
  • Credential extraction attempts
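In spirit, the filter is pattern matching over the outbound request, similar to the client-side regex layer. A toy sketch with two of the phrases quoted above (the real server-side ruleset is proprietary and far broader):

```python
import re

# Illustrative injection patterns only
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?previous instructions", re.I),
    re.compile(r"enter god mode", re.I),
]

def blocked(prompt: str) -> bool:
    # Reject the request if any known injection phrase appears
    return any(p.search(prompt) for p in INJECTION_PATTERNS)
```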