A community developer (Claudio Drews) shipped Memory OS today, an MIT-licensed 6-layer memory stack that sits on top of NousResearch's Hermes Agent framework. Each layer serves a distinct memory purpose: workspace files (MEMORY.md, USER.md, and CREATIVE.md, injected into the system prompt), session history (SQLite + FTS5 full-text search), structured facts (SQLite + HRR + trust scoring with feedback loops), Fabric (LLM-powered session extraction, forked from Icarus Plugin), a vector database (Qdrant with 4096-dimensional cosine vectors plus BM25 sparse search), and an auto-curated LLM Wiki that gets re-ingested into Qdrant. The stack itself runs locally via Docker (Qdrant + Redis + ARQ Worker + Python 3.11+); LLM calls still go to whatever provider Hermes supports (OpenRouter, OpenAI, Anthropic, Ollama).
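To make the session-history layer concrete, here is a minimal sketch of SQLite + FTS5 full-text search using only Python's standard library. The table name, column names, and the search_sessions helper are hypothetical and not taken from the Memory OS repo; the sketch also assumes your Python build ships SQLite with FTS5 compiled in, which most do.

```python
import sqlite3

# Hypothetical schema for illustration only; Memory OS may store sessions differently.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE session_messages "
    "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
)
conn.executemany(
    "INSERT INTO session_messages (session_id, role, content) VALUES (?, ?, ?)",
    [
        ("s1", "user", "Remind me which vector database we picked for the prototype"),
        ("s1", "assistant", "We picked Qdrant and run it locally in Docker"),
        ("s2", "user", "Draft a haiku about autumn rain"),
    ],
)


def search_sessions(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session_id, content) rows, best match first (FTS5 ranks by BM25)."""
    rows = conn.execute(
        "SELECT session_id, content FROM session_messages "
        "WHERE session_messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


hits = search_sessions("qdrant")
print(hits)
```

The default FTS5 tokenizer is case-insensitive, so "qdrant" matches "Qdrant". Raw user input containing punctuation may need quoting before it is passed to MATCH, because FTS5 has its own query syntax.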

The architecture is a cascading retrieval system. On pre_llm_call, four sources (Fabric, Qdrant, Sessions, Facts) are queried in a fallback order: hybrid search first, then dense vectors, then lexical search, with SQLite as the last resort. Per-session deduplication stops the same context from appearing twice in a window, and a relevance threshold filters results before anything reaches the model. On post_llm_call and on_session_end, a capture step writes back into the layers. Forgetting is explicit: a weekly decay scanner ages out stale entries, and semantic dedup merges near-identical memories when cosine similarity exceeds 0.92. Trust scoring (Layer 3) maintains a feedback loop that adjusts source credibility over time. What is missing matters: there are no published benchmarks on recall quality, latency, or token savings, and no formal write-conflict resolution for multi-agent consistency. The repo is at github.com/ClaudioDrews/memory-os; it is brand new, with few commits and a single developer. Treat it as early-stage scaffolding for the layered idea, not as a production-hardened component yet.
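The retrieval cascade and the dedup rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's code: the Hit type, the CascadingRetriever class, the 0.5 relevance threshold, the "fall through when a stage yields nothing usable" behavior, and the stub stages are all hypothetical. Only the 0.92 cosine cutoff comes from the description above.

```python
import math
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Hit:
    memory_id: str
    text: str
    score: float  # assumed normalized to [0, 1]; higher means more relevant


Retriever = Callable[[str], list[Hit]]


@dataclass
class CascadingRetriever:
    stages: list[Retriever]  # ordered, e.g. hybrid, dense, lexical, SQLite fallback
    threshold: float = 0.5   # hypothetical relevance cutoff
    seen: set[str] = field(default_factory=set)  # per-session dedup state

    def retrieve(self, query: str, limit: int = 3) -> list[Hit]:
        """Return hits from the first stage that yields anything new and relevant."""
        for stage in self.stages:
            fresh = [
                h for h in stage(query)
                if h.score >= self.threshold and h.memory_id not in self.seen
            ]
            if fresh:
                fresh.sort(key=lambda h: h.score, reverse=True)
                picked = fresh[:limit]
                self.seen.update(h.memory_id for h in picked)
                return picked
        return []


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_near_duplicates(vectors: dict[str, list[float]], cutoff: float = 0.92) -> dict[str, str]:
    """Map each memory id to the earlier id it merges into, or to itself if kept."""
    kept: list[str] = []
    merged_into: dict[str, str] = {}
    for mem_id, vec in vectors.items():
        match = next((k for k in kept if cosine(vec, vectors[k]) > cutoff), None)
        if match is None:
            kept.append(mem_id)
        merged_into[mem_id] = match or mem_id
    return merged_into


# Stub stages standing in for real backends (hypothetical data).
def hybrid(query: str) -> list[Hit]:
    return [Hit("m1", "User prefers dark mode", 0.31)]  # below threshold


def dense(query: str) -> list[Hit]:
    return [Hit("m2", "Project uses Qdrant", 0.82), Hit("m3", "Deploy target is Docker", 0.64)]


def lexical(query: str) -> list[Hit]:
    return [Hit("m4", "Redis backs the ARQ worker queue", 0.70)]


retriever = CascadingRetriever(stages=[hybrid, dense, lexical])
first = retriever.retrieve("what stack are we on?")
second = retriever.retrieve("what stack are we on?")  # m2 and m3 are already in the window
merged = merge_near_duplicates({"a": [1.0, 0.0], "b": [0.99, 0.05], "c": [0.0, 1.0]})
```

In this toy run the first call falls through the hybrid stage and returns the two dense hits; the second call skips them as already seen and lands on the lexical stage. A real implementation might instead query all stages and fuse the scores, and the right threshold depends on how each backend scales its scores.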

Two ecosystem threads are worth noting. First, the architecture is a community-built, explicit articulation of what most agent builders end up assembling ad hoc: a layered memory pipeline where different purposes (system context, session recall, fact storage, semantic search, structured extraction, knowledge accumulation) get distinct mechanisms instead of being collapsed into one "vector DB and pray" pattern. That layering is the design lesson worth carrying even if you never touch Memory OS directly. Second, the positioning: Memory OS explicitly contrasts itself with cloud services like mem0, Zep, and Letta by being fully local. That fits the broader wave of "I want my agent memory on my disk, not in someone else's database," which is a legitimate stance for privacy-sensitive or air-gapped builds. The trade-off is operational: a Docker stack with Qdrant + Redis + ARQ Worker is more infrastructure than calling mem0 via an API. For builders deciding between cloud-managed and self-hosted memory, this puts a real option on the local side of the ledger, though one that is bound to the Hermes framework.

Monday morning, if you are already building on Hermes Agent: Memory OS is worth trying as a drop-in upgrade from Hermes's native memory, especially if your agent benefits from semantic recall across long histories. Run your own evaluation on recall quality and token use before betting on it; no published benchmarks means you are the benchmark. If you are not on Hermes, the 6-layer decomposition is the takeaway, not the code: workspace + sessions + facts + extraction + vectors + curated wiki maps to distinct purposes you probably already need, separately, in whatever stack you are on. If you are shipping a privacy-sensitive or air-gapped agent and do not want a cloud memory provider, this is a fresh option to evaluate against established alternatives. A single developer and only a few commits mean you should lean on it carefully; the value right now is the architecture as documentation, not the code as a battle-tested dependency.
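Since you are the benchmark, a small harness is enough to get started. The sketch below computes recall@k over hand-labeled queries and a crude token estimate for the injected context. Everything in it is an assumption for illustration: the recall_at_k and approx_tokens helpers, the fake retriever, and the memory ids are hypothetical, and the whitespace token count is only a stand-in for your provider's real tokenizer.

```python
from typing import Callable


def recall_at_k(
    retrieve: Callable[[str], list[str]],
    cases: list[tuple[str, set[str]]],
    k: int = 5,
) -> float:
    """Average, over queries, of the share of expected memory ids found in the top k."""
    if not cases:
        return 0.0
    total = 0.0
    for query, expected in cases:
        top_k = set(retrieve(query)[:k])
        total += len(top_k & expected) / len(expected)  # expected must be non-empty
    return total / len(cases)


def approx_tokens(text: str) -> int:
    """Crude whitespace-based estimate; swap in a real tokenizer for actual numbers."""
    return len(text.split())


# Hypothetical retriever: replace with a call into your memory layer that returns ranked ids.
FAKE_RESULTS = {
    "which database did we pick?": ["m2", "m1"],
    "where do we deploy?": ["m3"],
}


def fake_retrieve(query: str) -> list[str]:
    return FAKE_RESULTS.get(query, [])


# Hand-labeled cases: each query paired with the memory ids a good retriever should surface.
cases = [
    ("which database did we pick?", {"m2"}),
    ("where do we deploy?", {"m3", "m9"}),
]

score = recall_at_k(fake_retrieve, cases, k=5)
context_cost = approx_tokens("Project uses Qdrant. Deploy target is Docker.")
print(f"recall@5={score:.2f} approx_context_tokens={context_cost}")
```

Running the same labeled cases against Hermes's native memory and against Memory OS gives a like-for-like comparison; a few dozen queries drawn from your own agent's history will tell you more than any generic benchmark.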