Trust - CoderClaw

Security posture, roadmap, and how we think about agents that can take real-world actions.

A New Era in Computing Security

For the past 20 years, security models have been built around locking devices and applications down: setting boundaries around inter-process communication, separating internet from local resources, and sandboxing untrusted code. These principles remain important.

But AI agents represent a fundamental shift.

Unlike traditional software that does exactly what code tells it to do, AI agents interpret natural language and make decisions about actions. They blur the boundary between user intent and machine execution. They can be manipulated through language itself.

We understand that with the great utility of a tool like CoderClaw comes great responsibility. Done wrong, an AI agent is a liability. Done right, we can change personal computing for the better.

This security program exists to get it right.

Context

CoderClaw is an AI agent platform. Unlike chatbots that only generate text, CoderClaw agents can:

Execute shell commands on the host machine
Send messages through WhatsApp, Telegram, Discord, Slack, and other channels
Read and write files in the workspace
Fetch arbitrary URLs from the internet
Schedule automated tasks
Access connected services and APIs

This capability is what makes CoderClaw useful. It's also what makes security critical.

AI agents that can take real-world actions introduce risks that traditional software doesn't have:

Prompt injection - Malicious users can craft messages that manipulate the AI into performing unintended actions
Indirect injection - Malicious content in fetched URLs, emails, or documents can hijack agent behavior
Tool abuse - Even without injection, misconfigured agents can cause damage through overly permissive settings
Identity risks - Agents can send messages as you, potentially damaging relationships or reputation

These aren't theoretical. They're documented attack patterns that affect all AI agent systems.

Scope

This security program covers the entire CoderClaw ecosystem. Nothing is out of scope.

Core Platform

CoderClaw CLI and Gateway (coderclaw)
Agent execution engine
Tool implementations
Channel integrations (WhatsApp, Telegram, Discord, Slack, Signal, etc.)

Applications

macOS desktop application
iOS mobile application
Android mobile application
Web interface

Services

CoderClaw Skills - Skills marketplace and registry
Documentation (docs.coderclaw.ai)
Any hosted infrastructure

Extensions

Official extensions (extensions/)
Plugin SDK and third-party plugins
Skills distributed through CoderClaw Skills

People

Core maintainers and contributors
Security processes and response procedures
Supply chain and dependency management

Program Overview

We're establishing a formal security function with four phases:

Transparency

Develop threat model openly with community contribution

Product Security Roadmap

Define defensive engineering goals and track publicly

Code Review

Manual security review of entire codebase

Security Triage

Formal process for handling vulnerability reports

Phase 1: Transparency

Goal

Develop and publish our threat model openly, inviting community contribution, so users understand the risks and can make informed decisions about their deployments.

Why

Security through obscurity doesn't work. Attackers already know these techniques - they're documented in academic papers, security blogs, and conference talks. What's missing is clear communication to users about:

What risks exist
What we're doing about them
What users should do to protect themselves

By developing the threat model openly, we benefit from collective expertise and build trust through transparency.

Threat Model Coverage

Category	Risks Covered
A. Input Manipulation	Direct prompt injection, indirect injection, tool argument injection, context manipulation
B. Auth & Access	AllowFrom bypass, privilege escalation, cross-session access, API key exposure
C. Data Security	System prompt disclosure, workspace exposure, memory leakage, data exfiltration
D. Infrastructure	SSRF, gateway exposure, dependency vulnerabilities, file permissions
E. Operations	Logging sensitive data, insufficient monitoring, resource exhaustion, misconfiguration
F. Supply Chain	Integrity of skills in CoderClaw Skills, extension security, dependency vulnerabilities

Threat Model Scope

Component	Why It's Included
Core platform (CLI, Gateway, agents, tools)	Primary attack surface
CoderClaw Skills	Skills marketplace - supply chain risk
Mobile apps (iOS, Android)	Agent control interface, credential storage
Desktop app (macOS)	Gateway host, system integration
Extensions and plugins	Third-party code execution
Build and release pipeline	Distribution integrity

Each risk in the threat model will include a description and severity rating, attack examples, current mitigations, known gaps, and user recommendations.

The threat model will be open for community contribution via pull requests.

Phase 2: Product Security Roadmap

Goal

Create a public product security roadmap defining defensive engineering goals, tracked as GitHub issues so the community can follow progress, provide input, and contribute.

Defensive Engineering Goals

Category	Goal	Description
Prompt Injection Protection	Input validation	Pattern detection and alerting for injection attempts
	Tool confirmation	Require explicit approval for sensitive actions
	Context isolation	Prevent cross-session contamination
Privacy Enhancements	System prompt protection	Prevent disclosure of system prompts
	Data minimization	Reduce unnecessary data retention
	Audit logging	Clear visibility into agent actions
Access Control	Fine-grained permissions	Per-tool, per-session access controls
	Rate limiting	Prevent resource exhaustion
	Spending controls	Hard limits on API costs
Supply Chain	Skills verification	Integrity checks for skills in CoderClaw Skills
	Dependency auditing	Automated vulnerability scanning
	Signed releases	Cryptographic verification of updates

Specific priority issues will be identified through the Phase 3 code review and added to the public roadmap as they are discovered and triaged.

Phase 3: Code Review

Goal

While the community is already working around the clock to find and address flaws - and we're grateful for every contribution - we recognize this is an open-source project where contributors provide their best effort. Phase 3 is a dedicated, comprehensive security assessment specifically designed to drive out deeply rooted systemic issues and improve overall code quality and user safety.

Scope

The code review covers the entire CoderClaw codebase and ecosystem:

Area	Path	Why
Agent execution	`src/agents/`	Core attack surface - how agents run
Tool implementations	`src/agents/tools/`	What agents can do - exec, messaging, web
Message processing	`src/auto-reply/`	Entry point for all user input
Security utilities	`src/security/`	Existing security controls
Gateway server	`src/gateway/`	Network-exposed component
Authentication	`src//auth`	Credential handling, API keys
Session management	`src/config/sessions.ts`	Cross-session isolation
Pairing and access control	`src/pairing/`, `src//access-control`	DM and group gating
External content handling	`src/security/external-content.ts`	Injection defenses
macOS desktop app	`apps/macos/`	Gateway host, system integration
iOS mobile app	`apps/ios/`	Agent control, credential storage
Android mobile app	`apps/android/`	Agent control, credential storage
CoderClaw Skills	coderclaw.ai/skills	Skills registry - supply chain risk
Official extensions	`extensions/`	First-party plugins
Build and release pipeline	CI/CD, scripts	Distribution integrity, signing

Approach

Manual code review - Human-in-the-loop analysis of security-critical paths
Automated scanning - Static analysis, dependency auditing, secret detection
Dynamic testing - Attempting documented attack patterns against running system
Architecture review - Evaluating trust boundaries and data flows

Disclosure

All critical and high findings fixed before public disclosure
Findings summary published after remediation
Full report available on request
CVEs assigned where applicable

Phase 4: Security Triage Function

Goal

Establish a formal process for receiving, triaging, and responding to security vulnerability reports.

Report a Vulnerability

We take security reports seriously. Report vulnerabilities directly to the repository where the issue lives:

Core CLI and gateway - seanhogg/coderclaw
macOS desktop app - seanhogg/coderclaw (apps/macos)
iOS app - seanhogg/coderclaw (apps/ios)
Android app - seanhogg/coderclaw (apps/android)
CoderClaw Skills - seanhogg/coderclaw
Trust and threat model - coderclaw/trust

For issues that don't fit a specific repo, or if you're unsure, email security@coderclaw.ai and we'll route it.

Required in Reports

Title
Severity Assessment
Impact
Affected Component
Technical Reproduction
Demonstrated Impact
Environment
Remediation Advice

Reports without reproduction steps, demonstrated impact, and remediation advice will be deprioritized. Given the volume of AI-generated scanner findings, we must ensure we're receiving vetted reports from researchers who understand the issues.

Response SLAs

Severity	Definition	First Response	Triage	Fix Target
Critical	RCE, auth bypass, mass data exposure	24 hours	48 hours	7 days
High	Significant impact, single-user scope	48 hours	5 days	30 days
Medium	Limited impact, requires specific conditions	5 days	14 days	90 days
Low	Minor issues, defense in depth	14 days	30 days	Best effort
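To make the table concrete, here is a minimal TypeScript sketch that turns a severity into response deadlines. It is illustrative only: the type and function names are hypothetical, and it assumes every clock starts when the report is received, which the table does not state.

```typescript
// Hypothetical helper: names and shapes are illustrative, not CoderClaw's API.
type Severity = "critical" | "high" | "medium" | "low";

interface SlaTargets {
  firstResponseHours: number;
  triageHours: number;
  fixTargetDays: number | null; // null means "best effort"
}

const HOURS_PER_DAY = 24;

// Values copied from the Response SLAs table.
const SLA: Record<Severity, SlaTargets> = {
  critical: { firstResponseHours: 24, triageHours: 48, fixTargetDays: 7 },
  high: { firstResponseHours: 48, triageHours: 5 * HOURS_PER_DAY, fixTargetDays: 30 },
  medium: { firstResponseHours: 5 * HOURS_PER_DAY, triageHours: 14 * HOURS_PER_DAY, fixTargetDays: 90 },
  low: { firstResponseHours: 14 * HOURS_PER_DAY, triageHours: 30 * HOURS_PER_DAY, fixTargetDays: null },
};

interface SlaDeadlines {
  firstResponseBy: Date;
  triageBy: Date;
  fixBy: Date | null; // null when the fix target is best effort
}

function addHours(start: Date, hours: number): Date {
  return new Date(start.getTime() + hours * 60 * 60 * 1000);
}

// Assumption: all three deadlines are measured from the time the report arrives.
function slaDeadlines(severity: Severity, receivedAt: Date): SlaDeadlines {
  const targets = SLA[severity];
  return {
    firstResponseBy: addHours(receivedAt, targets.firstResponseHours),
    triageBy: addHours(receivedAt, targets.triageHours),
    fixBy:
      targets.fixTargetDays === null
        ? null
        : addHours(receivedAt, targets.fixTargetDays * HOURS_PER_DAY),
  };
}
```

For example, under these assumptions a critical report received at midnight UTC on 1 January would have a first response due by 2 January, triage by 3 January, and a fix target of 8 January.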

Our Commitments

Acknowledge all complete reports within 48 hours
Provide status updates at least every 14 days
Credit researchers in advisories (unless anonymity requested)
Not pursue legal action against good-faith security research
Consider bounties for qualifying critical/high findings (case-by-case)

Security & Trust

Sean Hogg leads Security & Trust at CoderClaw.

Sean oversees threat modeling, secure architecture, vulnerability response, and rollout of practical controls such as optional TOTP MFA, recovery codes, session/token revocation, and legal-compliance controls across CoderClaw services.

Responsibilities

Lead threat modeling and risk assessment
Scope and oversee code review
Establish triage process and response procedures
Review security-critical code changes
Provide guidance on security architecture decisions

Recent Security Enhancements (Q1 2026)

Terms/version enforcement: users must accept the currently active Terms version before accessing protected application areas.
Versioned legal documents: Terms of Use and Privacy Policy are now versioned and published with timestamps.
Tracked legal acceptance: accepted Terms version is recorded per user and re-consent is required after Terms updates.
Superadmin legal controls: platform admins can publish new Terms versions and immediately enforce re-acceptance.
Expanded account defenses: MFA, session visibility/revocation, and token lifecycle controls are integrated across web, tenant, and admin surfaces.

Current Security Posture

CoderClaw already has security controls in place. Understanding what exists helps users configure their deployments appropriately.

Secure by Default

DM Policy: Pairing

Unknown senders must complete a pairing flow with an expiring code
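As a rough illustration of the idea (not CoderClaw's actual pairing code), the TypeScript sketch below issues a code to an unknown sender and accepts it only while it is still live. The class, its methods, and the injected code generator and clock are all hypothetical; how the code is delivered and who approves it are outside the sketch.

```typescript
// Hypothetical sketch: class and method names are illustrative, not CoderClaw's API.
interface PendingPairing {
  code: string;
  expiresAt: number; // epoch milliseconds
}

class PairingStore {
  private pending = new Map<string, PendingPairing>();
  private paired = new Set<string>();
  private generateCode: () => string; // should be backed by a CSPRNG in practice
  private ttlMs: number;
  private now: () => number;

  constructor(generateCode: () => string, ttlMs: number, now: () => number = () => Date.now()) {
    this.generateCode = generateCode;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Called when an unknown sender first contacts the agent.
  issueCode(senderId: string): string {
    const code = this.generateCode();
    this.pending.set(senderId, { code, expiresAt: this.now() + this.ttlMs });
    return code;
  }

  // Returns true only for a matching code that has not expired.
  confirm(senderId: string, code: string): boolean {
    const entry = this.pending.get(senderId);
    if (entry === undefined) return false;
    if (this.now() >= entry.expiresAt) {
      this.pending.delete(senderId); // expired codes are discarded
      return false;
    }
    if (entry.code !== code) return false;
    this.pending.delete(senderId); // a code can be used once
    this.paired.add(senderId);
    return true;
  }

  isPaired(senderId: string): boolean {
    return this.paired.has(senderId);
  }
}
```

Injecting the clock and the code generator keeps the expiry logic easy to test; a real implementation would also want to limit failed attempts per sender.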

Exec Security: Deny

Commands not on the allowlist are denied by default; the user is prompted for approval
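A minimal TypeScript sketch of a deny-by-default decision with a prompt on allowlist misses might look like the following. It is an assumption-laden illustration, not the real exec policy engine: `ExecPolicy`, `decideExec`, and the `askUser` callback are hypothetical names, and the command is taken as an already-split argument vector so that no shell parsing is involved.

```typescript
// Hypothetical sketch: names are illustrative, not CoderClaw's API.
type ExecDecision = "allow" | "deny";

interface ExecPolicy {
  allowlist: ReadonlySet<string>; // executables that may run without a prompt
  askOnMiss: boolean; // on a miss, prompt the user instead of denying outright
}

function decideExec(
  argv: readonly string[],
  policy: ExecPolicy,
  askUser: (argv: readonly string[]) => boolean,
): ExecDecision {
  const executable = argv[0];
  // An empty command is never allowed, and never worth prompting for.
  if (executable === undefined || executable === "") return "deny";
  if (policy.allowlist.has(executable)) return "allow";
  // Not on the allowlist: allow only if prompting is enabled and the user approves.
  if (policy.askOnMiss && askUser(argv)) return "allow";
  return "deny";
}
```

Matching on the executable name alone is deliberately simple; a production check would also need to consider arguments, paths, and shell metacharacters if commands are passed through a shell.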

Default AllowFrom: Self-only

If not configured, only your own number can DM the agent

Session Isolation

Conversations are isolated per session key

SSRF Protection

Internal IPs and localhost blocked in web_fetch
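The TypeScript sketch below shows one way such a hostname check could work. It is not the actual web_fetch implementation: the function names are hypothetical, the range list is not exhaustive, and it only inspects literal hostnames. A real defense would also need to validate the address a DNS name resolves to and handle alternative IP encodings.

```typescript
// Hypothetical helpers; not CoderClaw's actual web_fetch implementation.

// Parses a dotted-quad IPv4 literal into its four octets, or returns null.
function parseIpv4(host: string): number[] | null {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (match === null) return null;
  const octets = match.slice(1).map(Number);
  return octets.every((n) => n <= 255) ? octets : null;
}

// True for "this network", private (RFC 1918), loopback, and link-local IPv4 ranges.
function isInternalIpv4(octets: number[]): boolean {
  const a = octets[0] ?? 0;
  const b = octets[1] ?? 0;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// True for IPv6 loopback/unspecified, unique-local (fc00::/7), and link-local (fe80::/10).
function isInternalIpv6(host: string): boolean {
  return host === "::1" || host === "::" || /^f[cd]/.test(host) || /^fe[89ab]/.test(host);
}

function isBlockedHost(hostname: string): boolean {
  // Normalize: lowercase, strip IPv6 brackets and a trailing dot.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "").replace(/\.$/, "");
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  const octets = parseIpv4(host);
  if (octets !== null) return isInternalIpv4(octets);
  if (host.indexOf(":") !== -1) return isInternalIpv6(host);
  return false; // a DNS name: the resolved address still needs checking elsewhere
}
```

Under these rules `169.254.169.254` (a common cloud metadata address) and `[::1]` are blocked, while `example.com` and `8.8.8.8` pass the literal check.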

Gateway Auth Required

WebSocket connections must authenticate

Optional TOTP MFA

Users can enable authenticator-app MFA with QR setup and one-time recovery codes

Session Visibility & Revocation

Users can view active sessions across devices and revoke specific or all other sessions

JWT Token Lifecycle Controls

Issued tokens are tracked and can be invalidated early via token/session revocation
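As a sketch of how tracked tokens and session revocation can fit together, the TypeScript below keeps a registry of issued tokens and consults it on every check. Everything here is hypothetical: the `TokenRegistry` class, its in-memory storage, and the idea of keying on the JWT `jti` claim are assumptions for illustration, not a description of CoderClaw's implementation.

```typescript
// Hypothetical sketch: names and storage are illustrative, not CoderClaw's API.
interface IssuedToken {
  tokenId: string; // for a JWT this could be the "jti" claim
  sessionId: string;
  userId: string;
  expiresAt: number; // epoch milliseconds
}

class TokenRegistry {
  private tokens = new Map<string, IssuedToken>();
  private revoked = new Set<string>();

  // Record a token at issue time so it can be revoked later.
  track(token: IssuedToken): void {
    this.tokens.set(token.tokenId, token);
  }

  revokeToken(tokenId: string): void {
    this.revoked.add(tokenId);
  }

  // Revoke every token issued under one session.
  revokeSession(sessionId: string): void {
    this.tokens.forEach((t) => {
      if (t.sessionId === sessionId) this.revoked.add(t.tokenId);
    });
  }

  // Revoke all of a user's sessions except the one making the request.
  revokeOtherSessions(userId: string, keepSessionId: string): void {
    this.tokens.forEach((t) => {
      if (t.userId === userId && t.sessionId !== keepSessionId) this.revoked.add(t.tokenId);
    });
  }

  // A token is usable only if it was tracked, has not expired, and is not revoked.
  isActive(tokenId: string, now: number): boolean {
    const t = this.tokens.get(tokenId);
    return t !== undefined && now < t.expiresAt && !this.revoked.has(tokenId);
  }
}
```

The design point is that signature validity alone is not enough: each request also checks server-side state, which is what allows a token to be invalidated before its expiry.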

Verify Your Setup

coderclaw security audit --deep

Key items to verify:

DM policy is pairing or allowlist (not open)
allowFrom is configured for your channels
Exec security is not set to full unless intended
Gateway is bound to loopback or behind authentication
Workspace doesn't contain secrets
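The first four checklist items can be expressed as a small configuration check. The TypeScript sketch below is illustrative only: the `DeploymentConfig` shape and its field names are assumptions, it does not describe what `coderclaw security audit --deep` actually inspects, and it leaves out the workspace secrets check, which needs file scanning.

```typescript
// Hypothetical config shape: field names are assumptions for illustration only.
interface DeploymentConfig {
  dmPolicy: "pairing" | "allowlist" | "open";
  allowFrom: Record<string, string[]>; // channel name -> permitted sender IDs
  execSecurity: string; // e.g. "deny" or "full"
  gateway: { bindAddress: string; authEnabled: boolean };
}

const LOOPBACK_ADDRESSES = ["127.0.0.1", "::1", "localhost"];

// Returns one human-readable warning per checklist item that fails.
function checkConfig(
  config: DeploymentConfig,
  channels: string[],
  execFullIntended = false,
): string[] {
  const warnings: string[] = [];

  if (config.dmPolicy === "open") {
    warnings.push("DM policy is open; use pairing or allowlist");
  }
  for (const channel of channels) {
    const senders = config.allowFrom[channel];
    if (senders === undefined || senders.length === 0) {
      warnings.push("allowFrom is not configured for " + channel);
    }
  }
  if (config.execSecurity === "full" && !execFullIntended) {
    warnings.push("Exec security is set to full");
  }
  const onLoopback = LOOPBACK_ADDRESSES.indexOf(config.gateway.bindAddress) !== -1;
  if (!onLoopback && !config.gateway.authEnabled) {
    warnings.push("Gateway is not bound to loopback and has no authentication");
  }
  return warnings;
}
```

An empty result means none of these four checks failed under the assumed config shape; it says nothing about risks the sketch does not cover.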

Timeline

WEEK 1-2: Phase 1 - Transparency
├── Threat model development begins (open for contribution)
├── Security configuration guide drafted
├── Visual overview created
└── Announcement posted

WEEK 3-4: Phase 2 - Product Security Roadmap
├── GitHub issues created for defensive engineering goals
├── Security label and milestone set up
├── Community input period opens
└── First security work begins

WEEK 5-8: Phase 3 - Code Review Preparation
├── Scope finalized (entire codebase)
├── Review begins
└── Initial findings

WEEK 8-12: Phase 3 - Code Review Execution
├── Manual review completed
├── Findings documented
├── Remediation for critical/high
└── Verification completed

WEEK 8+: Phase 4 - Triage Function
├── security@coderclaw.ai live
├── PGP key published
├── Disclosure policy published
└── First advisories (if needed)

ONGOING:
├── Monthly security updates
├── Continuous threat model refinement
├── Regular dependency auditing
└── Community engagement

FAQ

"Is CoderClaw safe to use right now?"

Yes, with proper configuration. CoderClaw has security controls enabled by default:

DM Policy: Defaults to pairing - unknown senders must complete a pairing flow with an expiring code
Exec Security: Defaults to deny with ask: on-miss - dangerous commands require approval
AllowFrom: If not configured, defaults to self-only
Gateway Auth: Required by default

Run coderclaw security audit --deep to verify your setup. See docs.coderclaw.ai/gateway/security for details.

"Why develop the threat model openly?"

These attack techniques are already public knowledge - documented in papers, blogs, and talks. Developing openly benefits from collective expertise, builds trust through transparency, and holds us accountable.

"Why require remediation advice in vulnerability reports?"

We receive reports from automated scanners and AI tools that flag theoretical issues without understanding them. By requiring reporters to propose a fix, we filter out scanner noise, get actionable reports from researchers who understand the issues, and speed up remediation with expert input.

"What about CoderClaw Skills?"

CoderClaw Skills is in scope for the entire security program - threat model, code review, and ongoing monitoring. Skills are code that runs in your agent's context - supply chain security is critical.

"What about mobile apps and desktop?"

All applications are in scope. The iOS app, Android app, and macOS desktop app will all be covered by the code review and included in the threat model. Nothing is out of scope.

"Can I help?"

Contribute to the threat model via pull request
Review and comment on security-labeled issues
Report vulnerabilities to the relevant repo (or security@coderclaw.ai if unsure)
Help improve security documentation
