Security posture, roadmap, and how we think about agents that can take real-world actions.
A New Era in Computing Security
For the past 20 years, security models have been built around locking devices and applications down - setting boundaries between inter-process communications, separating internet from local, sandboxing untrusted code. These principles remain important.
But AI agents represent a fundamental shift.
Unlike traditional software that does exactly what code tells it to do, AI agents interpret natural language and make decisions about actions. They blur the boundary between user intent and machine execution. They can be manipulated through language itself.
We understand that with the great utility of a tool like CoderClaw comes great responsibility. Done wrong, an AI agent is a liability. Done right, we can change personal computing for the better.
This security program exists to get it right.
Context
CoderClaw is an AI agent platform. Unlike chatbots that only generate text, CoderClaw agents can:
- Execute shell commands on the host machine
- Send messages through WhatsApp, Telegram, Discord, Slack, and other channels
- Read and write files in the workspace
- Fetch arbitrary URLs from the internet
- Schedule automated tasks
- Access connected services and APIs
This capability is what makes CoderClaw useful. It's also what makes security critical.
AI agents that can take real-world actions introduce risks that traditional software doesn't have:
- Prompt injection - Malicious users can craft messages that manipulate the AI into performing unintended actions
- Indirect injection - Malicious content in fetched URLs, emails, or documents can hijack agent behavior
- Tool abuse - Even without injection, misconfigured agents can cause damage through overly permissive settings
- Identity risks - Agents can send messages as you, potentially damaging relationships or reputation
These aren't theoretical. They're documented attack patterns that affect all AI agent systems.
Scope
This security program covers the entire CoderClaw ecosystem. Nothing is out of scope.
Core Platform
- CoderClaw CLI and Gateway (coderclaw)
- Agent execution engine
- Tool implementations
- Channel integrations (WhatsApp, Telegram, Discord, Slack, Signal, etc.)
Applications
- macOS desktop application
- iOS mobile application
- Android mobile application
- Web interface
Services
- CoderClaw Skills - Skills marketplace and registry
- Documentation (docs.coderclaw.ai)
- Any hosted infrastructure
Extensions
- Official extensions (extensions/)
- Plugin SDK and third-party plugins
- Skills distributed through CoderClaw Skills
People
- Core maintainers and contributors
- Security processes and response procedures
- Supply chain and dependency management
Program Overview
We're establishing a formal security function with four phases:
- Transparency - Develop the threat model openly with community contribution
- Product Security Roadmap - Define defensive engineering goals and track them publicly
- Code Review - Manual security review of the entire codebase
- Security Triage - Formal process for handling vulnerability reports
Phase 1: Transparency
Goal
Develop and publish our threat model openly, inviting community contribution, so users understand the risks and can make informed decisions about their deployments.
Why
Security through obscurity doesn't work. Attackers already know these techniques - they're documented in academic papers, security blogs, and conference talks. What's missing is clear communication to users about:
- What risks exist
- What we're doing about them
- What users should do to protect themselves
By developing the threat model openly, we benefit from collective expertise and build trust through transparency.
Threat Model Coverage
| Category | Risks Covered |
|---|---|
| A. Input Manipulation | Direct prompt injection, indirect injection, tool argument injection, context manipulation |
| B. Auth & Access | AllowFrom bypass, privilege escalation, cross-session access, API key exposure |
| C. Data Security | System prompt disclosure, workspace exposure, memory leakage, data exfiltration |
| D. Infrastructure | SSRF, gateway exposure, dependency vulnerabilities, file permissions |
| E. Operations | Logging sensitive data, insufficient monitoring, resource exhaustion, misconfiguration |
| F. Supply Chain | Integrity of skills published on CoderClaw Skills, extension security, dependency vulnerabilities |
Threat Model Scope
| Component | Why It's Included |
|---|---|
| Core platform (CLI, Gateway, agents, tools) | Primary attack surface |
| CoderClaw Skills | Skills marketplace - supply chain risk |
| Mobile apps (iOS, Android) | Agent control interface, credential storage |
| Desktop app (macOS) | Gateway host, system integration |
| Extensions and plugins | Third-party code execution |
| Build and release pipeline | Distribution integrity |
Each risk in the threat model will include a description, a severity rating, attack examples, current mitigations, known gaps, and user recommendations.
The threat model will be open for community contribution via pull requests.
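As an illustration of the per-risk structure described above, a threat-model entry could be represented roughly as follows. This is only a sketch: the `ThreatEntry` type, its field names, the `A-2` identifier, and the `missingSections` helper are assumptions made for this example, not the format the project has committed to, and the severity shown is a placeholder rather than an official rating.

```typescript
// Hypothetical shape for one threat-model entry; names are illustrative only.
type Severity = "critical" | "high" | "medium" | "low";

interface ThreatEntry {
  id: string;
  category: string;
  title: string;
  description: string;
  severity: Severity;
  attackExamples: string[];
  currentMitigations: string[];
  knownGaps: string[];
  userRecommendations: string[];
}

// Returns the names of sections that are still empty, so a reviewer can see
// at a glance what a draft entry is missing.
function missingSections(entry: ThreatEntry): string[] {
  const missing: string[] = [];
  if (entry.description.trim() === "") {
    missing.push("description");
  }
  const lists: Array<[string, string[]]> = [
    ["attackExamples", entry.attackExamples],
    ["currentMitigations", entry.currentMitigations],
    ["knownGaps", entry.knownGaps],
    ["userRecommendations", entry.userRecommendations],
  ];
  for (const [name, items] of lists) {
    if (items.length === 0) {
      missing.push(name);
    }
  }
  return missing;
}

// A deliberately incomplete draft entry for the indirect injection risk.
const draftEntry: ThreatEntry = {
  id: "A-2", // hypothetical identifier
  category: "A. Input Manipulation",
  title: "Indirect injection",
  description:
    "Malicious content in fetched URLs, emails, or documents can hijack agent behavior",
  severity: "high", // placeholder, not an official rating
  attackExamples: ["A fetched web page contains instructions addressed to the agent"],
  currentMitigations: [],
  knownGaps: [],
  userRecommendations: [],
};
```

Calling `missingSections(draftEntry)` lists the sections a contributor would still need to fill in before the entry is complete.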
Phase 2: Product Security Roadmap
Goal
Create a public product security roadmap defining defensive engineering goals, tracked as GitHub issues so the community can follow progress, provide input, and contribute.
Defensive Engineering Goals
| Category | Goal | Description |
|---|---|---|
| Prompt Injection Protection | Input validation | Pattern detection and alerting for injection attempts |
| Prompt Injection Protection | Tool confirmation | Require explicit approval for sensitive actions |
| Prompt Injection Protection | Context isolation | Prevent cross-session contamination |
| Privacy Enhancements | System prompt protection | Prevent disclosure of system prompts |
| Privacy Enhancements | Data minimization | Reduce unnecessary data retention |
| Privacy Enhancements | Audit logging | Clear visibility into agent actions |
| Access Control | Fine-grained permissions | Per-tool, per-session access controls |
| Access Control | Rate limiting | Prevent resource exhaustion |
| Access Control | Spending controls | Hard limits on API costs |
| Supply Chain | Skills verification | Integrity checks for skills published on CoderClaw Skills |
| Supply Chain | Dependency auditing | Automated vulnerability scanning |
| Supply Chain | Signed releases | Cryptographic verification of updates |
Specific priority issues will be identified through the Phase 3 code review and added to the public roadmap as they are discovered and triaged.
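To make the "Spending controls" goal concrete, here is a minimal sketch of what a hard limit on API costs could look like. It is an assumption-laden illustration, not the planned implementation: the `SpendingLimiter` class, its method names, and the choice to track cost in integer cents are all hypothetical.

```typescript
// Hypothetical hard spending cap. Costs are tracked in integer cents to avoid
// floating-point drift when many small charges are added up.
class SpendingLimiter {
  private spentCents = 0;
  private readonly hardLimitCents: number;

  constructor(hardLimitCents: number) {
    this.hardLimitCents = hardLimitCents;
  }

  // Records the charge and returns true only if it fits under the hard limit.
  // A rejected charge leaves the running total unchanged.
  tryCharge(costCents: number): boolean {
    if (!Number.isInteger(costCents) || costCents < 0) {
      return false;
    }
    if (this.spentCents + costCents > this.hardLimitCents) {
      return false;
    }
    this.spentCents += costCents;
    return true;
  }

  remainingCents(): number {
    return this.hardLimitCents - this.spentCents;
  }
}
```

The design point is that the check happens before the API call is made, so a request that would exceed the limit is refused rather than billed and reported afterwards.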
Phase 3: Code Review
Goal
While the community is already working around the clock to find and address flaws - and we're grateful for every contribution - we recognize this is an open-source project where contributors provide their best effort. Phase 3 is a dedicated, comprehensive security assessment specifically designed to drive out deeply rooted systemic issues and improve overall code quality and user safety.
Scope
The code review covers the entire CoderClaw codebase and ecosystem:
| Area | Path | Why |
|---|---|---|
| Agent execution | src/agents/ | Core attack surface - how agents run |
| Tool implementations | src/agents/tools/ | What agents can do - exec, messaging, web |
| Message processing | src/auto-reply/ | Entry point for all user input |
| Security utilities | src/security/ | Existing security controls |
| Gateway server | src/gateway/ | Network-exposed component |
| Authentication | src/*/auth* | Credential handling, API keys |
| Session management | src/config/sessions.ts | Cross-session isolation |
| Pairing and access control | src/pairing/, src/*/access-control* | DM and group gating |
| External content handling | src/security/external-content.ts | Injection defenses |
| macOS desktop app | apps/macos/ | Gateway host, system integration |
| iOS mobile app | apps/ios/ | Agent control, credential storage |
| Android mobile app | apps/android/ | Agent control, credential storage |
| CoderClaw Skills | coderclaw.ai/skills | Skills registry - supply chain risk |
| Official extensions | extensions/ | First-party plugins |
| Build and release pipeline | CI/CD, scripts | Distribution integrity, signing |
Approach
- Manual code review - Human-in-the-loop analysis of security-critical paths
- Automated scanning - Static analysis, dependency auditing, secret detection
- Dynamic testing - Attempting documented attack patterns against running system
- Architecture review - Evaluating trust boundaries and data flows
Disclosure
- All critical and high findings fixed before public disclosure
- Findings summary published after remediation
- Full report available on request
- CVEs assigned where applicable
Phase 4: Security Triage Function
Goal
Establish a formal process for receiving, triaging, and responding to security vulnerability reports.
Report a Vulnerability
We take security reports seriously. Report vulnerabilities directly to the repository where the issue lives:
- Core CLI and gateway - seanhogg/coderclaw
- macOS desktop app - seanhogg/coderclaw (apps/macos)
- iOS app - seanhogg/coderclaw (apps/ios)
- Android app - seanhogg/coderclaw (apps/android)
- CoderClaw Skills - seanhogg/coderclaw
- Trust and threat model - coderclaw/trust
For issues that don't fit a specific repo, or if you're unsure, email security@coderclaw.ai and we'll route it.
Required in Reports
Reports without reproduction steps, demonstrated impact, and remediation advice will be deprioritized. Given the volume of AI-generated scanner findings, we must ensure we're receiving vetted reports from researchers who understand the issues.
Response SLAs
| Severity | Definition | First Response | Triage | Fix Target |
|---|---|---|---|---|
| Critical | RCE, auth bypass, mass data exposure | 24 hours | 48 hours | 7 days |
| High | Significant impact, single-user scope | 48 hours | 5 days | 30 days |
| Medium | Limited impact, requires specific conditions | 5 days | 14 days | 90 days |
| Low | Minor issues, defense in depth | 14 days | 30 days | Best effort |
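The targets in the table above can be turned into concrete deadlines for a given report. The sketch below assumes that every clock starts when the report is received and that the targets are calendar time rather than business days; the document does not specify either, and the `slaDeadlines` function and its types are hypothetical names for this example.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

interface SlaTargets {
  firstResponseMs: number;
  triageMs: number;
  fixMs: number | null; // null means "best effort" (no fixed target)
}

// Values copied from the Response SLAs table.
const SLA: Record<Severity, SlaTargets> = {
  critical: { firstResponseMs: 24 * HOUR_MS, triageMs: 48 * HOUR_MS, fixMs: 7 * DAY_MS },
  high: { firstResponseMs: 48 * HOUR_MS, triageMs: 5 * DAY_MS, fixMs: 30 * DAY_MS },
  medium: { firstResponseMs: 5 * DAY_MS, triageMs: 14 * DAY_MS, fixMs: 90 * DAY_MS },
  low: { firstResponseMs: 14 * DAY_MS, triageMs: 30 * DAY_MS, fixMs: null },
};

interface SlaDeadlines {
  firstResponseBy: Date;
  triageBy: Date;
  fixBy: Date | null;
}

// Assumption: all deadlines are measured from the moment of receipt.
function slaDeadlines(severity: Severity, receivedAt: Date): SlaDeadlines {
  const targets = SLA[severity];
  const base = receivedAt.getTime();
  return {
    firstResponseBy: new Date(base + targets.firstResponseMs),
    triageBy: new Date(base + targets.triageMs),
    fixBy: targets.fixMs === null ? null : new Date(base + targets.fixMs),
  };
}
```

Under these assumptions, a critical report received at 2026-01-01T00:00:00Z would need a first response by 2026-01-02, triage by 2026-01-03, and a fix target of 2026-01-08.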
Our Commitments
- Acknowledge all complete reports within 48 hours
- Provide status updates at least every 14 days
- Credit researchers in advisories (unless anonymity requested)
- Not pursue legal action against good-faith security research
- Consider bounties for qualifying critical/high findings (case-by-case)
Security & Trust
Sean Hogg leads Security & Trust at CoderClaw.
Sean oversees threat modeling, secure architecture, vulnerability response, and rollout of practical controls such as optional TOTP MFA, recovery codes, session/token revocation, and legal-compliance controls across CoderClaw services.
Responsibilities
- Lead threat modeling and risk assessment
- Scope and oversee code review
- Establish triage process and response procedures
- Review security-critical code changes
- Provide guidance on security architecture decisions
Recent Security Enhancements (Q1 2026)
- Terms/version enforcement: users must accept the currently active Terms version before accessing protected application areas.
- Versioned legal documents: Terms of Use and Privacy Policy are now versioned and published with timestamps.
- Tracked legal acceptance: accepted Terms version is recorded per user and re-consent is required after Terms updates.
- Superadmin legal controls: platform admins can publish new Terms versions and immediately enforce re-acceptance.
- Expanded account defenses: MFA, session visibility/revocation, and token lifecycle controls are integrated across web, tenant, and admin surfaces.
Current Security Posture
CoderClaw already has security controls in place. Understanding what exists helps users configure their deployments appropriately.
Secure by Default
- Unknown senders must complete a pairing flow with an expiring code
- Commands not on the allowlist are denied by default, and the user is prompted for approval
- If not configured, only your own number can DM the agent
- Conversations are isolated per session key
- Internal IPs and localhost are blocked in web_fetch
- WebSocket connections must authenticate
- Users can enable authenticator-app MFA with QR setup and one-time recovery codes
- Users can view active sessions across devices and revoke specific or all other sessions
- Issued tokens are tracked and can be invalidated early via token/session revocation
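As an illustration of the kind of check behind "internal IPs and localhost blocked in web_fetch", the sketch below classifies a hostname as internal when it is a localhost name or an IPv4 literal in a loopback, private, or link-local range. It is not CoderClaw's actual implementation; `isInternalHost` and `parseIpv4` are hypothetical names, and the range list is a common baseline rather than a documented one.

```typescript
type Ipv4 = [number, number, number, number];

// Parses a dotted-quad IPv4 literal, or returns null if the host is not one.
function parseIpv4(host: string): Ipv4 | null {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (match === null) {
    return null;
  }
  const octets = match.slice(1).map(Number);
  if (octets.length !== 4 || octets.some((n) => n > 255)) {
    return null;
  }
  return octets as Ipv4;
}

// Returns true for localhost names, IPv6 loopback/unspecified literals, and
// IPv4 literals in loopback, private (RFC 1918), or link-local ranges.
function isInternalHost(hostname: string): boolean {
  // Strip the brackets that surround IPv6 literals in URLs.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) {
    return true;
  }
  if (host === "::1" || host === "::") {
    return true;
  }
  const ip = parseIpv4(host);
  if (ip === null) {
    return false;
  }
  const [a, b] = ip;
  return (
    a === 0 || // 0.0.0.0/8 ("this" network)
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 loopback
    (a === 169 && b === 254) || // 169.254.0.0/16 link-local, incl. cloud metadata
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}
```

A hostname check like this is only a first layer. It does not cover public hostnames that resolve to internal addresses, DNS rebinding, redirects, alternate IP encodings, or IPv4-mapped IPv6 addresses, so a real defense would also need to validate the resolved address at connection time.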
Verify Your Setup
coderclaw security audit --deep
Key items to verify:
- DM policy is pairing or allowlist (not open)
- allowFrom is configured for your channels
- Exec security is not set to full unless intended
- Gateway is bound to loopback or behind authentication
- Workspace doesn't contain secrets
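The checklist above can be read as a set of simple rules over a deployment's configuration. The sketch below expresses those rules in code for illustration only. It is not what `coderclaw security audit --deep` does internally, and the `DeploymentConfig` shape, its key names, and the listed exec security values are assumptions; the real configuration may differ. Scanning the workspace for secrets is left out because it cannot be decided from configuration alone.

```typescript
// Hypothetical config shape for illustration; real keys and values may differ.
interface DeploymentConfig {
  dmPolicy: "pairing" | "allowlist" | "open";
  allowFrom: string[];
  execSecurity: "deny" | "full";
  gatewayBind: string;
  gatewayAuth: boolean;
}

const LOOPBACK_HOSTS = ["127.0.0.1", "::1", "localhost"];

// Returns one human-readable finding per checklist item that fails.
function checkConfig(config: DeploymentConfig): string[] {
  const findings: string[] = [];
  if (config.dmPolicy === "open") {
    findings.push("DM policy is open; prefer pairing or allowlist");
  }
  if (config.allowFrom.length === 0) {
    findings.push("allowFrom is not configured for your channels");
  }
  if (config.execSecurity === "full") {
    findings.push("Exec security is full; confirm this is intended");
  }
  const boundToLoopback = LOOPBACK_HOSTS.includes(config.gatewayBind);
  if (!boundToLoopback && !config.gatewayAuth) {
    findings.push("Gateway is reachable beyond loopback without authentication");
  }
  return findings;
}
```

An empty result means none of these four rules fired; it does not mean the deployment is secure.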
Timeline
WEEK 1-2: Phase 1 - Transparency
├── Threat model development begins (open for contribution)
├── Security configuration guide drafted
├── Visual overview created
└── Announcement posted
WEEK 3-4: Phase 2 - Product Security Roadmap
├── GitHub issues created for defensive engineering goals
├── Security label and milestone set up
├── Community input period opens
└── First security work begins
WEEK 5-8: Phase 3 - Code Review Preparation
├── Scope finalized (entire codebase)
├── Review begins
└── Initial findings
WEEK 8-12: Phase 3 - Code Review Execution
├── Manual review completed
├── Findings documented
├── Remediation for critical/high
└── Verification completed
WEEK 8+: Phase 4 - Triage Function
├── security@coderclaw.ai live
├── PGP key published
├── Disclosure policy published
└── First advisories (if needed)
ONGOING:
├── Monthly security updates
├── Continuous threat model refinement
├── Regular dependency auditing
└── Community engagement
FAQ
Yes, with proper configuration. CoderClaw has security controls enabled by default:
- DM Policy: Defaults to pairing - unknown senders must complete a pairing flow with an expiring code
- Exec Security: Defaults to deny with ask: on-miss - dangerous commands require approval
- AllowFrom: If not configured, defaults to self-only
- Gateway Auth: Required by default
Run coderclaw security audit --deep to verify your setup. See docs.coderclaw.ai/gateway/security
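To illustrate the exec default described above (deny, with approval requested on an allowlist miss), here is a minimal sketch of the decision logic. It is an assumption about how such a policy could be modeled, not CoderClaw's implementation: `ExecPolicy`, `decideExec`, and the rule that commands containing shell metacharacters are never auto-allowed are all hypothetical.

```typescript
type ExecDecision = "allow" | "ask" | "deny";

// Hypothetical policy shape: an allowlist of binaries plus an ask-on-miss flag.
interface ExecPolicy {
  allowlist: string[];
  askOnMiss: boolean;
}

// Characters that let one command line run more than the first binary.
const SHELL_METACHARACTERS = /[;&|`$<>\n]/;

function decideExec(command: string, policy: ExecPolicy): ExecDecision {
  const binary = command.trim().split(/\s+/)[0] ?? "";
  if (binary === "") {
    return "deny";
  }
  // Only auto-allow plain invocations of an allowlisted binary. Anything with
  // chaining, pipes, substitution, or redirection is treated as a miss.
  const isPlain = !SHELL_METACHARACTERS.test(command);
  if (isPlain && policy.allowlist.includes(binary)) {
    return "allow";
  }
  return policy.askOnMiss ? "ask" : "deny";
}
```

Matching on the first token is deliberately naive. The metacharacter rule exists because a line such as `ls; rm -rf /` starts with an allowlisted binary but runs something else, so it falls through to approval instead of being allowed.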
These attack techniques are already public knowledge - documented in papers, blogs, and talks. Developing openly benefits from collective expertise, builds trust through transparency, and holds us accountable.
We receive reports from automated scanners and AI tools that flag theoretical issues without understanding them. By requiring reporters to propose a fix, we filter out scanner noise, get actionable reports from researchers who understand the issues, and speed up remediation with expert input.
CoderClaw Skills is in scope for the entire security program - threat model, code review, and ongoing monitoring. Skills are code that runs in your agent's context - supply chain security is critical.
All applications are in scope. The iOS app, Android app, and macOS desktop app will all be covered by the code review and included in the threat model. Nothing is out of scope.
- Contribute to the threat model via pull request
- Review and comment on security-labeled issues
- Report vulnerabilities to the relevant repo (or security@coderclaw.ai if unsure)
- Help improve security documentation