How Hermes Agent memory actually works: MEMORY.md, USER.md, Honcho, providers, and the problems people keep running into
Hermes memory is less mysterious than the marketing makes it sound. There is a small built-in memory layer, a user profile layer, searchable session history in SQLite with no hard cap, optional skill memory that learns from experience, and external providers such as Honcho that add deeper user modeling. Separate those layers, and it all makes sense. Blur them together, and it feels magical right up until it feels broken.
My take up front: Hermes memory is better understood as a layered recall system than a single “AI memory” feature. The base layer is small, curated, and always injected. Session history is broad, searched on demand, and has no hard storage cap. Skill memory captures how the agent solved past tasks so it can repeat and improve them. External providers extend the system without replacing the built-in files. Once you see those layers separately, it becomes much easier to debug what Hermes remembered, what it forgot, and what never belonged in memory to begin with.
Last reviewed: 12 April 2026. Method: testing, verified against the official memory, memory-providers, Honcho, and prompt-assembly docs, recent GitHub issues on replay waste and Honcho delays, Reddit and social search results, and current Hermes YouTube videos. If you are starting from zero, read the main Hermes setup guide first. If you want to understand how skills extend the memory system, read the Hermes skills guide. If you already know OpenClaw memory and want the contrast, read How OpenClaw memory actually works right after this.
Short answer: does Hermes Agent really remember things?
Yes, but the scope matters. Hermes persists two small files in ~/.hermes/memories/: MEMORY.md for environment facts, conventions, and lessons learned, and USER.md for your preferences and communication style. At the start of each session, Hermes injects both into the system prompt as a frozen snapshot. Changes made mid-session are written to disk immediately but do not become part of the prompt until the next session starts. This is one of the most important details in the whole memory system.
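The frozen-snapshot behavior is easier to reason about with a toy model. The sketch below is not Hermes source code: the `SessionMemory` class and its method names are hypothetical, and only the two file names and the "read once at session start, write to disk immediately" behavior come from the description above.

```python
from pathlib import Path


class SessionMemory:
    """Toy model of the frozen-snapshot behavior (hypothetical, not Hermes code)."""

    FILES = ("MEMORY.md", "USER.md")

    def __init__(self, memory_dir: Path):
        self.memory_dir = memory_dir
        # Read both files once, at session start. This text stands in for
        # what the system prompt contains for the rest of the session.
        self.snapshot = {name: self._read(name) for name in self.FILES}

    def _read(self, name: str) -> str:
        path = self.memory_dir / name
        return path.read_text(encoding="utf-8") if path.exists() else ""

    def add(self, name: str, entry: str) -> None:
        # The write lands on disk immediately...
        path = self.memory_dir / name
        with path.open("a", encoding="utf-8") as f:
            f.write(entry.rstrip("\n") + "\n")
        # ...but self.snapshot is deliberately left untouched.

    def prompt_block(self) -> str:
        return "\n".join(self.snapshot.values())
```

In this model, an entry added through `add` shows up in `prompt_block()` only for a `SessionMemory` object created afterwards, which mirrors the "next session" rule.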
Beyond those two files, every conversation is also stored in ~/.hermes/state.db, a SQLite database with FTS5 full-text search indexing. That archive has no hard capacity limit. Users running Hermes for months with thousands of sessions report no performance degradation. When the agent decides that past context is relevant to the current task, it runs a search query against that archive and pulls in the results. This is why people on social media say Hermes “remembers everything” and “goes on forever.” The prompt-level memory is small and curated, but the session archive behind it is effectively unbounded.
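To make the "search on demand" idea concrete, here is a minimal sketch of full-text search over an FTS5 table using Python's standard `sqlite3` module. The table name `messages` and its columns are assumptions made for illustration; the real schema of `state.db` is not described in this article, and the sketch uses an in-memory database rather than touching `~/.hermes/state.db`. It also assumes your Python build ships SQLite with FTS5 enabled.

```python
import sqlite3


def build_archive(rows):
    """Create an in-memory FTS5 table; the schema is a made-up stand-in."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE VIRTUAL TABLE messages USING fts5(session_id UNINDEXED, body)"
    )
    db.executemany(
        "INSERT INTO messages (session_id, body) VALUES (?, ?)", rows
    )
    return db


def search(db, query, limit=5):
    """Return (session_id, body) pairs for rows matching the query, best match first."""
    return db.execute(
        "SELECT session_id, body FROM messages WHERE messages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

The point of the design is that nothing from the archive costs prompt tokens until a query pulls specific rows in.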
Why people say Hermes “remembers everything”
This is the part that has been blowing up on social media, and it deserves its own section because the story is only half told if you stop at the 2,200-character MEMORY.md limit.
Hermes has a genuine self-improving loop that most competing agents do not. Every conversation gets archived in SQLite with full-text search. When Hermes solves a complex task, it can extract that solution into a reusable skill document stored in ~/.hermes/skills/ as a standalone Markdown file. Next time a similar task appears, the agent retrieves the skill and executes it more efficiently. The agent working with you in month three is measurably more capable on your specific work than the agent you started with in month one. It gets better, and it does not lose context along the way.
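As a rough illustration of the skill layer, the sketch below writes a solved procedure to a standalone Markdown file and later finds it by keyword. Everything here is hypothetical: the function names, the file-naming scheme, and the document layout are invented for the example, and the article does not specify how Hermes actually formats or retrieves skill files.

```python
import re
from pathlib import Path


def save_skill(skills_dir: Path, title: str, steps: list[str]) -> Path:
    """Write a procedure as a standalone Markdown file (made-up format)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    path = skills_dir / f"{slug}.md"
    lines = [f"# {title}", ""]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, 1)]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def find_skills(skills_dir: Path, keyword: str) -> list[Path]:
    """Return skill files whose text contains the keyword (case-insensitive)."""
    return sorted(
        p
        for p in skills_dir.glob("*.md")
        if keyword.lower() in p.read_text(encoding="utf-8").lower()
    )
```

Because each skill is a plain file, you can open, edit, or delete what the agent has learned with ordinary tools.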
Compare that with OpenClaw’s memory model. OpenClaw uses MEMORY.md, daily notes, and an optional dreaming system to consolidate knowledge. It is transparent, portable, and human-controlled. But OpenClaw does not automatically extract skills or solution patterns from completed tasks. If you want OpenClaw to remember a workflow, you either add it to memory manually or install a skill from ClawHub. There is no episodic session archive with search, no procedural skill layer that writes itself, and no built-in FTS5 index over your complete conversation history.
Where Hermes is stronger: Unlimited session archive with FTS5 search. Auto-generated skill memory from completed tasks. A self-improving loop where the agent learns procedures without manual intervention.
Where OpenClaw is stronger: Human-first memory curation. A dreaming/consolidation system that prunes and promotes memories overnight. Full manual control over every byte in the memory layer. Arguably more predictable behavior for operators who want total control.
Neither approach is wrong. Hermes optimizes for depth and autonomy. OpenClaw optimizes for transparency and control. The practical difference shows up in long-running engagements: if you work with Hermes daily for two months, the accumulation of session history and skill documents makes the agent noticeably more useful on your specific tasks over time. That is the “goes on forever” effect people are excited about. It is real, and it is the strongest single differentiator Hermes has right now.
The two files that matter most
The built-in memory guide is admirably concrete. MEMORY.md is for environment facts, project conventions, completed work, and lessons learned. USER.md is for identity, preference, tone, timezone, and workflow habits. Both live in ~/.hermes/memories/. Both are intentionally small. Both are meant to stay useful instead of turning into a junk drawer.
~/.hermes/memories/MEMORY.md # agent notes, environment facts, conventions
~/.hermes/memories/USER.md # user profile, preferences, expectations
~/.hermes/state.db # SQLite session archive with FTS5 search
~/.hermes/skills/ # learned procedures from past tasks
The smallness of MEMORY.md and USER.md is deliberate, but it is also where some real-world complaints start. The official docs pitch the limits as a way to keep memory focused. Power users on GitHub issue #5563 argue that the default limit becomes too tight once Hermes is helping with a complex multi-service project. Both things can be true: bounded prompt-level memory is a sane design choice, and the current default can feel undersized for heavy operator work. The saving grace is that session search and skill memory absorb the overflow.
What the memory tool actually does
Hermes uses a memory tool with three actions: add, replace, and remove. There is no separate read action because the model sees memory through the prompt block at session start. Replace and remove use substring matching, so the agent can target existing entries without copying the whole line verbatim. The official docs also explain a practical rule that many users miss: when memory usage climbs above roughly 80 percent, the agent should consolidate or replace older entries before adding new ones.
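The three actions and the 80 percent rule can be modeled in a few lines. This is a sketch under stated assumptions, not the Hermes implementation: the `MemoryFile` class is hypothetical, the 2,200-character default is the MEMORY.md limit quoted in this article, and the choice to reject a substring that matches zero or several entries is mine, since the article does not say how ambiguous matches are handled.

```python
class MemoryFile:
    """Toy bounded memory with add / replace / remove (hypothetical)."""

    def __init__(self, limit: int = 2200, warn_ratio: float = 0.8):
        self.limit = limit
        self.warn_ratio = warn_ratio
        self.entries: list[str] = []

    def used(self) -> int:
        return sum(len(e) for e in self.entries)

    def should_consolidate(self) -> bool:
        # True once usage climbs above the warning ratio (80% by default).
        return self.used() / self.limit > self.warn_ratio

    def _match(self, needle: str) -> int:
        # Substring matching: the caller does not need the full entry text.
        hits = [i for i, e in enumerate(self.entries) if needle in e]
        if len(hits) != 1:  # assumption: ambiguous or missing targets are errors
            raise ValueError(f"{len(hits)} entries match {needle!r}")
        return hits[0]

    def add(self, entry: str) -> None:
        if self.used() + len(entry) > self.limit:
            raise ValueError("memory full; consolidate before adding")
        self.entries.append(entry)

    def replace(self, needle: str, new_entry: str) -> None:
        self.entries[self._match(needle)] = new_entry

    def remove(self, needle: str) -> None:
        del self.entries[self._match(needle)]
```

Note that there is no `read` method on purpose: in the real system the model sees memory through the prompt block, not through a tool call.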
If Hermes adds memory during your current session, the new entry is on disk right away but it is not in the prompt until the next session. This is why users sometimes think memory “did not work” when it actually did.
Use built-in memory for durable facts you want always available. Use session search for “what did we discuss last week?” questions. Use skill memory for repeatable procedures. Use providers when you want deeper cross-session modeling.
How Honcho changes Hermes memory
Honcho is the memory provider most people talk about because it adds the behavior people intuitively expect from “AI memory”: cross-session user modeling, reasoning over past interactions, semantic search, and multi-agent profile isolation. The official Honcho guide says it adds dialectic reasoning on top of built-in memory rather than replacing the built-in layer. When active, it exposes four extra tools: honcho_conclude, honcho_context, honcho_profile, and honcho_search.
# ~/.hermes/config.yaml
memory:
  provider: honcho
honcho:
  observation: directional
  peer_name: ""
The part I appreciate is that the docs are honest about the different layers. Built-in memory is still there. Honcho runs alongside it. If you want a stable basic setup, start with local memory. If you want deeper user modeling and semantic retrieval, Honcho becomes interesting. Just know that adding a provider means adding a dependency, and dependencies have failure modes.
The problem reports you should know before you blame yourself
Issue one: session replay waste and memory pressure. GitHub issue #5563 is the most detailed operator complaint I have seen so far. The reporter describes long CLI conversations fragmenting into multiple sessions, then replaying the full history as input every time. The claimed impact is brutal: about 2.6 million tokens lost in a single day, with roughly 69 percent of total consumption wasted on replay overhead. The same issue argues that the 2,200-character memory limit is too small for complex production work and proposes visible session-boundary indicators, incremental summaries, DB integrity checks, and larger memory limits.
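Why does full-history replay get expensive so fast? A simplified model helps: if every turn resends everything that came before it, the replayed tokens grow roughly with the square of the number of turns. The function below is an illustrative sketch with a hypothetical name and made-up inputs; it does not reproduce the measurements reported in the issue.

```python
def replay_overhead(turn_tokens: list[int]) -> tuple[int, int]:
    """Return (fresh_tokens, replayed_tokens) when each turn resends all prior turns."""
    fresh = sum(turn_tokens)
    replayed = 0
    history = 0
    for tokens in turn_tokens:
        replayed += history  # everything before this turn is sent again
        history += tokens
    return fresh, replayed


fresh, replayed = replay_overhead([1000] * 10)
share = replayed / (fresh + replayed)  # fraction of input spent on replay
```

With ten turns of 1,000 tokens each, this model gives 10,000 fresh tokens against 45,000 replayed ones, which is why proposals such as incremental summaries target the replay term specifically.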

Issue two: Honcho startup delays. GitHub issue #5726 reports Hermes getting stuck on initialization for over two minutes when Honcho is the provider. The suggested root cause is sequential blocking initialization, plus rate limits and timeouts during session retrieval and sync. The clearest present-day workaround is blunt but useful: switch the memory provider from honcho to local if startup responsiveness matters more than remote memory depth.
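To show what "blocking initialization" means in practice, here is a generic sketch of wrapping a slow provider connection in a timeout and falling back to local memory. This is not a Hermes feature or API: `init_provider` and the `connect` callable are hypothetical, and the documented workaround remains the config change described above.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def init_provider(connect, timeout_s: float, fallback: str = "local") -> str:
    """Run connect() in a worker thread; return `fallback` if it is too slow."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(connect)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The remote provider did not answer in time; keep startup responsive.
        return fallback
    finally:
        # Do not wait for the slow call to finish before returning.
        pool.shutdown(wait=False)
```

The trade-off is the same one the workaround makes by hand: startup responsiveness is preferred over remote memory depth when the two conflict.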

Issue three: provider excitement is real, but so is confusion. One Reddit thread is literally titled “Hermes Agent memory/learning – I don’t get it,” which is an unusually accurate headline for the onboarding problem. Another Reddit cluster asks whether users are “stuck with Honcho” or should use alternatives such as Mem0 or Holographic instead. That community pattern matters because it shows where the docs still leave room for interpretation: people understand that Hermes has memory, but they do not always understand which memory layer they are actually turning on.
Issue four: the official direction is toward more modular memory, not less. A Nous post describes Hermes as “the open source agent that grows with you” and highlights a multi-level memory system. The v0.7.0 announcement says memory is now an extensible plugin system with built-in memory working out of the box. The v0.8.0 release—which added over 6,400 GitHub stars in a single day—pushed this even further. That context explains why the memory surface is expanding so quickly and why third-party providers keep appearing.

What you should save, and what you should not
| Save to memory | Do not save to memory |
|---|---|
| User preferences, environment facts, conventions, corrections, completed work, recurring requests | Raw logs, giant code blocks, trivial facts, one-off paths, session-only clutter |
| Things that should be present every new session | Things you can easily rediscover with session search or file reads |
| Compact, information-dense entries | Verbose diary-style paragraphs that waste the small prompt budget |
In other words, Hermes memory should feel more like a sharp operator notebook than a knowledge dump. If a fact belongs in every future session, memory is the right place. If it only mattered to one troubleshooting burst, it belongs in session history instead—where Hermes can still find it when asked.
How to verify Hermes memory is actually working
- Add one durable preference or environment fact and finish the session.
- Start a brand-new Hermes session rather than continuing the old one.
- Ask a question that depends on the stored fact.
- If you are using Honcho or another provider, run the relevant status command and check for provider connectivity before judging the result.
- If performance is erratic, temporarily switch to local memory and compare behavior.
The new-session step is the key. A surprising number of “memory is broken” reports are really “memory was updated after the prompt snapshot was already frozen.” That is a design choice, not a bug.
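Before you start the new session, you can separate "never written" from "written but not yet in the prompt" by checking the files on disk. The helper below is a small sketch with a hypothetical function name; it assumes the default `~/.hermes/memories/` location mentioned earlier and only reads the two files.

```python
from pathlib import Path

DEFAULT_DIR = Path.home() / ".hermes" / "memories"


def fact_on_disk(fact: str, memory_dir: Path = DEFAULT_DIR) -> list[str]:
    """Return the memory file names that contain `fact` (case-insensitive)."""
    found = []
    for name in ("MEMORY.md", "USER.md"):
        path = memory_dir / name
        if path.exists() and fact.lower() in path.read_text(encoding="utf-8").lower():
            found.append(name)
    return found
```

If the list comes back empty, the write itself failed; if it names a file, the entry is saved and should appear in the prompt once a new session begins.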
The minimum memory setup I would actually recommend
Beginner path
Use local built-in memory first. Learn the frozen snapshot behavior and keep entries short. Let session search handle the overflow.
Operator path
Lean on session search for recall instead of stuffing more into built-in memory. Start watching the ~/.hermes/skills/ folder to see what the agent is learning on its own.
Advanced path
Turn on Honcho or another provider only after you want deeper cross-session modeling and you are ready to watch startup behavior. Combine with skill memory for the full self-improving loop.
I would resist the temptation to sell Hermes memory as “it never forgets”—even though the session archive technically does retain everything. The real pitch is stronger than that. Hermes gives you a layered, inspectable, self-improving memory system where the agent gets better at your specific work over time. That is a more durable operator story than vague AI nostalgia, and it is the reason Hermes has gained over 47,000 GitHub stars in less than two months.
Two YouTube videos worth watching before you over-romanticize the memory layer
https://www.youtube.com/watch?v=6M2tItdARew
https://www.youtube.com/watch?v=Ac48sE6FzQ8
The first is a short “never forgets” framing that helps you see the promise. The second is specifically about the v0.7.0 memory update and is more useful if you want the plugin-system context.
FAQ
What is the difference between Hermes built-in memory and session search?
Does Hermes update memory live while I chat?
Do I need Honcho to get memory in Hermes?
Why does Hermes memory sometimes feel too small?
What should I do if Honcho makes Hermes slow to start?
Switching memory.provider to local resolves the lag.
How is Hermes memory different from OpenClaw memory?
Related guides and source links
- Hermes Agent Setup Guide for Beginners
- Hermes Agent Skills Guide: How the Skill System Works
- How OpenClaw Memory Actually Works
- OpenClaw Setup Guide for Beginners
- Hermes memory docs
- Honcho provider docs
- Memory providers overview
- GitHub issue #5563 on replay waste and memory pressure
- GitHub issue #5726 on Honcho startup delays

