Hermes Agent Memory System

Summary

The Hermes Agent Memory System is a bounded, curated persistence layer that lets the agent retain user preferences, environment details, and learned information across sessions. It uses two locally stored files that are injected into the agent’s system prompt as a frozen snapshot at the start of each session.

Details

Hermes keeps its long-term context in two files in the ~/.hermes/memories/ directory. Each file has a strict character limit so that the system prompt stays focused and within its token budget.

Memory Structure and Limits

The system is divided into two distinct logical stores:

  • MEMORY.md: Contains the agent’s personal notes regarding the environment, project conventions, and technical lessons learned. It has a limit of 2,200 characters (approximately 800 tokens).
  • USER.md: Stores the user profile, including communication preferences, roles, and expectations. It has a limit of 1,375 characters (approximately 500 tokens).
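As a minimal sketch of the two stores and their limits, the mapping below records the file names and character caps stated above. The StoreSpec type, the "memory"/"user" keys, and the remaining_chars helper are hypothetical names chosen for illustration, not part of Hermes itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreSpec:
    filename: str    # file name under ~/.hermes/memories/
    char_limit: int  # hard cap on stored characters


# Limits come from the text above; the dictionary keys are hypothetical
# labels used only in this sketch.
STORES = {
    "memory": StoreSpec("MEMORY.md", 2200),
    "user": StoreSpec("USER.md", 1375),
}


def remaining_chars(store: str, current_text: str) -> int:
    """Return how many characters are still free in the given store."""
    return STORES[store].char_limit - len(current_text)
```

For example, a MEMORY.md holding 1,474 characters would have 726 characters left under this accounting.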

System Prompt Integration

At the beginning of a session, these files are read from disk and rendered into the system prompt as a “frozen snapshot.” This block includes a header with usage percentages (e.g., MEMORY (your personal notes) [67% -- 1,474/2,200 chars]) and individual entries separated by the section sign (§) delimiter.
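The rendering step described above might look roughly like the following. This is a sketch under stated assumptions: the function name render_store is hypothetical, the delimiter is assumed to sit on its own line, and delimiters are assumed to count toward the character total. Hermes may lay out the block or count characters differently.

```python
ENTRY_DELIMITER = "\n§\n"  # assumption: the section sign sits on its own line


def render_store(label: str, entries: list[str], char_limit: int) -> str:
    """Render one store as a header line followed by its delimited entries."""
    body = ENTRY_DELIMITER.join(entries)
    used = len(body)  # assumption: delimiters count toward usage
    percent = round(100 * used / char_limit)
    # The ":," format spec inserts thousands separators, as in "1,474/2,200".
    header = f"{label} [{percent}% -- {used:,}/{char_limit:,} chars]"
    return f"{header}\n{body}"
```

With a single 1,474-character entry and a 2,200-character limit, the header line matches the example format shown above.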

Because this is a frozen snapshot, any changes made by the agent during a session are written to disk immediately but do not update the current session’s system prompt. This design preserves the LLM’s prefix cache for better performance. Updated information only appears in the system prompt of the subsequent session, though tool responses provide the agent with the live state during the active session.
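The frozen-snapshot behavior can be illustrated with a small sketch. The SessionMemory class and its write method are hypothetical; they only model the rule described above, in which a write reaches disk immediately while the text captured at session start stays unchanged.

```python
from pathlib import Path


class SessionMemory:
    """Frozen-snapshot view of one memory file for a single session."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Read once at session start; this string is what the prompt would see.
        self.snapshot = path.read_text(encoding="utf-8") if path.exists() else ""

    def write(self, new_text: str) -> str:
        """Persist immediately and return the live state for the tool response."""
        self.path.write_text(new_text, encoding="utf-8")
        return new_text  # self.snapshot is deliberately left untouched
```

Because the snapshot never changes mid-session, the system prompt prefix stays byte-identical and remains cacheable. A SessionMemory constructed for the next session picks up the new file contents.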

The Memory Tool

The agent interacts with its memory using a specialized memory tool supporting three actions:

  1. add: Appends a new entry to the specified store.
  2. replace: Updates an existing entry. The entry is located by a unique substring match on the old_text parameter, so the agent does not need to supply the full original text to modify it.
  3. remove: Deletes an entry based on a unique substring match.

There is no explicit read action; the agent is expected to treat the injected memory in its prompt as its primary source of truth.
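The three actions can be sketched over an in-memory list of entries. Only the action names and the old_text parameter come from the description above. The content parameter, the function names, and the choice to raise an error on zero or multiple matches are assumptions made for illustration.

```python
def find_unique(entries: list[str], needle: str) -> int:
    """Return the index of the single entry containing needle."""
    matches = [i for i, entry in enumerate(entries) if needle in entry]
    if len(matches) != 1:
        # Assumption: an ambiguous or missing match is rejected outright.
        raise ValueError(
            f"expected exactly one entry containing {needle!r}, found {len(matches)}"
        )
    return matches[0]


def apply_action(entries, action, content=None, old_text=None):
    """Return a new entry list with the requested action applied."""
    updated = list(entries)  # leave the caller's list untouched
    if action == "add":
        updated.append(content)
    elif action == "replace":
        updated[find_unique(updated, old_text)] = content
    elif action == "remove":
        del updated[find_unique(updated, old_text)]
    else:
        raise ValueError(f"unknown action: {action!r}")
    return updated
```

Requiring a unique match means a short fragment such as "Ubuntu" is enough to target an entry, as long as no other entry contains it.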

Capacity Management and Best Practices

When a memory store reaches its character limit, the memory tool returns an error containing the current entries. The agent is then responsible for consolidating or removing less relevant information. A recommended best practice is for the agent to consolidate entries proactively once a store reaches 80% capacity, for instance by merging multiple individual project facts into a single, information-dense entry.
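A capacity check along these lines could be sketched as follows. The try_add function, the shape of the returned dictionary, and the way characters are counted (entries joined with a delimiter on its own line) are all assumptions. The sketch also assumes the error is raised when an addition would push the store past its limit.

```python
ENTRY_DELIMITER = "\n§\n"    # assumption: delimiters count toward the limit
CONSOLIDATE_AT_PERCENT = 80  # proactive threshold from the best practice above


def try_add(entries: list[str], new_entry: str, char_limit: int) -> dict:
    """Attempt an add; on overflow, report an error with the current entries."""
    candidate = entries + [new_entry]
    used = len(ENTRY_DELIMITER.join(candidate))
    if used > char_limit:
        return {
            "ok": False,
            "error": f"store full: {used:,}/{char_limit:,} chars",
            "entries": list(entries),  # returned so the agent can consolidate
        }
    return {
        "ok": True,
        "entries": candidate,
        # Integer comparison avoids floating-point rounding at the threshold.
        "should_consolidate": used * 100 >= CONSOLIDATE_AT_PERCENT * char_limit,
    }
```

Returning the current entries alongside the error gives the agent what it needs to merge or drop entries without a separate read action.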

The agent is instructed to save environment facts (OS, tools, project structure), user preferences (communication style), and project conventions (linting rules, build commands). It is specifically directed to skip trivial information, raw data dumps (like large log files), and session-specific ephemera that will not be useful in future contexts.

Honcho Integration

For more advanced user modeling, Hermes supports Honcho, which provides a cross-session and cross-platform memory layer. When Honcho is enabled with the hermes honcho setup command, the system operates in a hybrid mode in which the local MEMORY.md and USER.md files coexist with Honcho’s persistent cloud-based or external user modeling.