Hermes Agent Security Model

Summary

Hermes Agent implements a defense-in-depth security model with five distinct layers: user authorization via allowlists and DM pairing, dangerous command approval with human-in-the-loop for destructive operations, container isolation via Docker/Singularity/Modal, MCP credential filtering for subprocess isolation, and prompt injection detection in context files. These layers are independently configurable and designed to compose. For example, container backends skip command approval because the container boundary already provides equivalent protection, which makes the extra check redundant.

Details

Dangerous Command Approval

Before executing any shell command, Hermes checks it against a curated pattern list defined in tools/approval.py. Matches trigger a human approval step rather than executing silently. The system supports three modes configured via approvals.mode in ~/.hermes/config.yaml:

  • manual (default): always prompt for approval on dangerous commands
  • smart: delegates risk assessment to an auxiliary LLM; low-risk commands are auto-approved, genuinely dangerous ones auto-denied, and ambiguous cases escalate to a manual prompt
  • off: disables all approval checks, equivalent to --yolo

The approval timeout defaults to 60 seconds and is fail-closed: if no response arrives in time, the command is denied.
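The fail-closed timeout can be illustrated with a minimal sketch. The function name wait_for_approval and the use of a queue to deliver the user's reply are assumptions made for illustration, not the code in tools/approval.py; only the 60-second default and the deny-on-silence behavior come from the description above.

```python
import queue


def wait_for_approval(replies: "queue.Queue[str]", timeout_s: float = 60.0) -> bool:
    """Return True only if an explicit approval arrives before the timeout.

    Hypothetical helper: `replies` stands in for whatever channel carries
    the user's answer (CLI prompt, chat message, ...).
    """
    try:
        answer = replies.get(timeout=timeout_s)
    except queue.Empty:
        # Fail closed: silence is treated as a denial.
        return False
    return answer.strip().lower() in {"y", "yes"}
```

The important design choice is that every path other than an explicit "yes" returns False, so a crashed UI or an absent operator can never approve a command by accident.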

The dangerous pattern list covers recursive deletes (rm -r), world-writable chmod operations, filesystem formatting (mkfs), disk writes (dd if=, > /dev/sd), destructive SQL (DROP TABLE, DELETE FROM without WHERE, TRUNCATE), system config overwrites (> /etc/), service disruption (systemctl stop/disable/mask), fork bombs, remote-to-shell piping (curl ... | sh), and attempts to terminate the agent itself (pkill hermes).
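A pattern check of this kind might look like the sketch below. The regular expressions, the DANGEROUS_PATTERNS name, and the find_dangerous function are illustrative assumptions covering only a few of the categories listed above; the real list in tools/approval.py may differ in both coverage and form.

```python
import re
from typing import Optional

# Hypothetical subset of patterns; each entry pairs a regex with a label.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*[rR]"), "recursive delete"),
    (re.compile(r"\bmkfs\b"), "filesystem formatting"),
    (re.compile(r"\bdd\s+if="), "disk write"),
    (re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE), "destructive SQL"),
    (re.compile(r"\bsystemctl\s+(stop|disable|mask)\b"), "service disruption"),
    (re.compile(r"\b(curl|wget)\b[^|]*\|\s*(ba|z)?sh\b"), "remote-to-shell piping"),
]


def find_dangerous(command: str) -> Optional[str]:
    """Return the label of the first matching pattern, or None if none match."""
    for pattern, label in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return label
    return None
```

A caller would run the command directly when find_dangerous returns None and route it to the approval step otherwise. Regex matching of this kind is a heuristic and can be evaded by obfuscated commands, which is one reason the container boundary is treated as the stronger control.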

YOLO mode bypasses all approval prompts for a session. It can be enabled in three ways: the --yolo CLI flag at startup, the /yolo slash command (which toggles it on and off), or HERMES_YOLO_MODE=1 in the environment.

Container backends (docker, singularity, modal, daytona) skip dangerous command checks entirely because the container is the security boundary.

In interactive CLI sessions, approval prompts offer four choices: once, session (for the rest of the session), always (saved permanently to config.yaml under command_allowlist), or deny (default). In gateway/messaging sessions, the agent sends the command details to chat and waits for a plain-language yes/no reply.
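The four CLI choices could be modeled roughly as follows. The Choice enum, parse_choice, and apply_choice are hypothetical names, and the in-memory set and list stand in for session state and the command_allowlist entry in config.yaml; this is a sketch of the semantics described above, not the actual implementation.

```python
from enum import Enum
from typing import List, Set


class Choice(Enum):
    ONCE = "once"
    SESSION = "session"
    ALWAYS = "always"
    DENY = "deny"


def parse_choice(raw: str) -> Choice:
    """Map user input to a Choice; anything unrecognized falls back to DENY."""
    try:
        return Choice(raw.strip().lower())
    except ValueError:
        return Choice.DENY


def apply_choice(choice: Choice, pattern_key: str,
                 session_allowed: Set[str], permanent_allowlist: List[str]) -> bool:
    """Record the decision where appropriate and return whether to execute."""
    if choice is Choice.DENY:
        return False
    if choice is Choice.SESSION:
        session_allowed.add(pattern_key)  # remembered until the session ends
    elif choice is Choice.ALWAYS and pattern_key not in permanent_allowlist:
        permanent_allowlist.append(pattern_key)  # would be persisted to config
    return True
```

Treating unrecognized input as a denial keeps the prompt consistent with deny being the default.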

User Authorization (Gateway)

The gateway’s _is_user_authorized() method checks a layered authorization chain in order: per-platform allow-all flag, DM pairing approved list, platform-specific allowlist, global allowlist (GATEWAY_ALLOWED_USERS), global allow-all (GATEWAY_ALLOW_ALL_USERS=true), then default-deny. All allowlists are configured via ~/.hermes/.env as comma-separated user IDs.
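The ordering of the chain can be sketched as below. The AuthConfig dataclass, its field names, and the parse_ids helper are assumptions introduced for illustration; only the order of the checks and the comma-separated ID format are taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


def parse_ids(raw: str) -> Set[str]:
    """Split a comma-separated list of user IDs, ignoring blanks."""
    return {part.strip() for part in raw.split(",") if part.strip()}


@dataclass
class AuthConfig:  # hypothetical container for the loaded settings
    platform_allow_all: Set[str] = field(default_factory=set)
    paired: Dict[str, Set[str]] = field(default_factory=dict)
    platform_allowlists: Dict[str, Set[str]] = field(default_factory=dict)
    global_allowlist: Set[str] = field(default_factory=set)
    global_allow_all: bool = False


def is_user_authorized(cfg: AuthConfig, platform: str, user_id: str) -> bool:
    """Walk the chain in order; the first rule that grants access wins."""
    if platform in cfg.platform_allow_all:
        return True
    if user_id in cfg.paired.get(platform, set()):
        return True
    if user_id in cfg.platform_allowlists.get(platform, set()):
        return True
    if user_id in cfg.global_allowlist:
        return True
    if cfg.global_allow_all:
        return True
    return False  # default-deny
```

Because every rule can only grant access, the order affects which rule is credited with the decision rather than the outcome; the final return is what makes an empty configuration deny everyone.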

If no allowlists are configured and GATEWAY_ALLOW_ALL_USERS is not set, all users are denied and a warning is logged at startup.

DM Pairing provides flexible authorization without requiring upfront user IDs. When an unknown user messages the bot, it replies with a cryptographically random 8-character pairing code (drawn from a 32-character unambiguous alphabet excluding 0/O/1/I). The bot owner then runs hermes pairing approve <platform> <code> to permanently approve the user. Security features follow OWASP and NIST SP 800-63-4 guidance:

  • 1-hour code TTL
  • 1 request per user per 10 minutes
  • max 3 pending codes per platform
  • 5 failed approval attempts trigger a 1-hour lockout
  • chmod 0600 on all pairing data files
  • codes are never logged to stdout
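Code generation and expiry along these lines can be sketched with Python's secrets module. The exact alphabet string, the function names, and the timestamp handling are assumptions; the description above fixes only the code length, the alphabet size, the excluded characters, and the 1-hour TTL.

```python
import secrets

# Assumed alphabet: A-Z and 0-9 minus 0, O, 1, I, which leaves 32 symbols.
PAIRING_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
CODE_TTL_SECONDS = 3600  # 1-hour TTL


def generate_pairing_code(length: int = 8) -> str:
    """Draw each character from a CSPRNG-backed choice over the alphabet."""
    return "".join(secrets.choice(PAIRING_ALPHABET) for _ in range(length))


def is_code_expired(created_at: float, now: float, ttl: float = CODE_TTL_SECONDS) -> bool:
    """A code is expired once its age reaches the TTL."""
    return now - created_at >= ttl
```

With 32 symbols and 8 positions there are 32^8 (about 1.1 × 10^12) possible codes, so combined with the lockout after 5 failed attempts, guessing a pending code is impractical.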

Pairing data lives in ~/.hermes/pairing/ as per-platform JSON files. The unauthorized_dm_behavior setting (global or per-platform) controls whether unknown DMs receive a pairing code (pair, the default) or are silently dropped (ignore).

Container Isolation

The Docker terminal backend applies a fixed security baseline to every container via _SECURITY_ARGS in tools/environments/docker.py: --cap-drop ALL, --security-opt no-new-privileges, --pids-limit 256, and size-limited tmpfs mounts for /tmp, /var/tmp, and /run. Resource limits (CPU, memory, disk) are configurable in config.yaml under terminal. Filesystem persistence is opt-in: persistent mode bind-mounts ~/.hermes/sandboxes/docker/<task_id>/ for /workspace and /root; ephemeral mode uses tmpfs for the workspace.
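As a rough illustration, a hardened docker run argument list could be assembled as follows. The capability, privilege, and PID flags are the ones named above; the tmpfs sizes and mount options, the SECURITY_ARGS_SKETCH name, and the build_docker_run_args function are assumptions and do not reproduce the actual _SECURITY_ARGS value.

```python
from typing import List, Optional

# Flags named in the text; tmpfs sizes and options below are assumed values.
SECURITY_ARGS_SKETCH = [
    "--cap-drop", "ALL",
    "--security-opt", "no-new-privileges",
    "--pids-limit", "256",
    "--tmpfs", "/tmp:rw,nosuid,size=512m",
    "--tmpfs", "/var/tmp:rw,nosuid,size=256m",
    "--tmpfs", "/run:rw,nosuid,size=64m",
]


def build_docker_run_args(image: str, cpus: Optional[float] = None,
                          memory: Optional[str] = None) -> List[str]:
    """Combine the fixed baseline with optional resource limits."""
    args = ["docker", "run", "--rm", *SECURITY_ARGS_SKETCH]
    if cpus is not None:
        args += ["--cpus", str(cpus)]
    if memory is not None:
        args += ["--memory", memory]
    args.append(image)  # image goes last, after all options
    return args
```

Keeping the baseline in one constant that is always prepended means configurable limits can only add restrictions; nothing in user configuration can remove the capability drop or the no-new-privileges option.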

Environment Variable Passthrough

Both execute_code and terminal strip sensitive environment variables from child processes by default. Two mechanisms grant deliberate passthrough: skill-declared required_environment_variables in a skill’s SKILL.md frontmatter (automatic, and since v0.5.1 also forwarded into Docker and Modal without needing docker_forward_env), and manual terminal.env_passthrough entries in config.yaml for vars not belonging to any skill. Credential files (e.g., OAuth tokens) are declared via required_credential_files in skill frontmatter and are mounted read-only into Docker containers.

MCP Credential Handling

MCP stdio subprocesses receive only a minimal safe environment: PATH, HOME, USER, LANG, LC_ALL, TERM, SHELL, TMPDIR, and any XDG_* variables. All other host environment variables are stripped. Variables explicitly configured in an MCP server’s env block are the sole passthrough mechanism. Error messages from MCP tools are sanitized before being returned to the LLM: patterns matching GitHub PATs, OpenAI-style keys, bearer tokens, and common secret parameter names are replaced with [REDACTED].
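Both behaviors can be sketched in a few lines. The safe variable names are those listed above; the function names and the specific regular expressions are illustrative assumptions and are likely narrower than the patterns Hermes actually uses.

```python
import re
from typing import Dict, Mapping, Optional

SAFE_ENV_KEYS = {"PATH", "HOME", "USER", "LANG", "LC_ALL", "TERM", "SHELL", "TMPDIR"}

# Assumed, simplified secret patterns for illustration only.
_SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{20,}"),                       # GitHub PAT style
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),                      # OpenAI-style key
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+"),           # bearer token
    re.compile(r"(?i)\b(token|api_key|password|secret)=[^\s&]+"),
]


def build_mcp_env(host_env: Mapping[str, str],
                  server_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Keep only safe host variables, then add the server's explicit env block."""
    env = {k: v for k, v in host_env.items()
           if k in SAFE_ENV_KEYS or k.startswith("XDG_")}
    env.update(server_env or {})
    return env


def sanitize_error(message: str) -> str:
    """Replace anything that looks like a credential with [REDACTED]."""
    for pattern in _SECRET_PATTERNS:
        message = pattern.sub("[REDACTED]", message)
    return message
```

In practice the result of build_mcp_env(os.environ, configured_env) would be passed as the env argument when spawning the subprocess, so that the child never inherits the parent's full environment.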

Production Deployment Guidance

Key recommendations:

  • set explicit per-platform allowlists (never GATEWAY_ALLOW_ALL_USERS=true in production)
  • use terminal.backend: docker for gateway deployments
  • set conservative container resource limits
  • store secrets in ~/.hermes/.env with correct file permissions
  • use DM pairing over hardcoded user IDs
  • periodically audit command_allowlist
  • set MESSAGING_CWD to a non-sensitive directory
  • run as non-root
  • monitor ~/.hermes/logs/ for unauthorized access attempts