Voltropy's new paper doesn't compete with RLM. It builds on it, proves it works, and takes it further than we did.
On February 14, 2026, Voltropy PBC published "LCM: Lossless Context Management" -- a paper describing how they built an agent called Volt that beats Claude Code on the OOLONG benchmark at every single context length from 32K to 1 million tokens.
The win margin is not subtle. At 256K tokens, Volt scores 10 points higher. At 512K tokens, the gap widens to 12.6 points. Average score across all lengths: Volt 74.8, Claude Code 70.3.
But the more important result is not the numbers. It is the architecture. Volt is built on a foundation of recursive context management -- the same core idea behind RLMs. The paper's authors, Clint Ehrlich and Theodore Blackman, are explicit about this: LCM and RLM are "complementary points along a design spectrum," not rivals.
This is what vindication looks like.
LCM is a context management system designed to give agents infinite memory without infinite cost. The core idea is simple: instead of jamming every message into the active context window, you maintain a hierarchical summary DAG with an immutable message store underneath it.
When the context fills up, LCM does not truncate arbitrarily. It uses a three-level escalation strategy:
Level 1: Normal summarization. The model writes a summary of older messages, keeping recent ones verbatim. The summary replaces the originals in the active context. The originals stay in the immutable store.
Level 2: Aggressive summarization. If the context is still too large, LCM re-summarizes the summaries, compressing harder. The full chain is preserved in the store.
Level 3: Deterministic truncation. If even aggressive summarization does not fit, LCM falls back to rule-based truncation -- keep the system prompt, keep the most recent messages, drop the middle. This guarantees convergence.
The result: every message ever sent to the agent is preserved verbatim and can be retrieved on demand. The active context stays bounded. And short tasks pay zero overhead -- if your conversation fits in one window, LCM does nothing.
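The escalation mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the class and method names are hypothetical, `summarize` stands in for an LLM summarization call, and character count stands in for token count.

```python
# Sketch of LCM's three-level escalation (hypothetical names; not the
# paper's actual code). `summarize` is a placeholder for an LLM call.
from dataclasses import dataclass, field

def summarize(msgs) -> str:
    # Placeholder for an LLM summarization call.
    return "[summary of %d messages]" % len(msgs)

@dataclass
class ContextManager:
    limit: int                                   # max size of active context
    store: list = field(default_factory=list)    # immutable message store
    active: list = field(default_factory=list)   # messages in the window

    def append(self, msg: str) -> None:
        self.store.append(msg)        # every message is preserved verbatim
        self.active.append(msg)
        if self._size() > self.limit:
            self._escalate()

    def _size(self) -> int:
        return sum(len(m) for m in self.active)  # crude token proxy

    def _escalate(self) -> None:
        # Level 1: summarize older messages, keep the recent ones verbatim.
        # The originals stay untouched in self.store.
        recent = self.active[-2:]
        self.active = [summarize(self.active[:-2])] + recent
        if self._size() <= self.limit:
            return
        # Level 2: re-summarize the summaries, compressing harder.
        self.active = [summarize(self.active[:-1]), self.active[-1]]
        if self._size() <= self.limit:
            return
        # Level 3: deterministic truncation -- keep only the most recent
        # message (a real implementation would also pin the system prompt).
        # No LLM call involved, so convergence is guaranteed.
        self.active = self.active[-1:]
```

Note that the store only ever grows, while the active context is the only thing the escalation touches -- that separation is what makes the summarization lossless at the system level even though each summary is lossy.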
The paper uses a comparison that gets at the real philosophical difference between LCM and RLM: GOTO versus structured programming.
RLM is GOTO. It gives the model maximum flexibility. The model can write its own loops, branch arbitrarily, and recurse however it wants. This is powerful, but it requires the model to be good at control flow. If the model writes a bad loop, the recursion fails.
LCM is structured control flow. It provides deterministic primitives -- LLM-Map and Agentic-Map -- that replace model-written loops with operator-level recursion. The model does not write the recursion; it defines the operation to apply, and the system handles the iteration.
LLM-Map takes a list and a prompt template. For each item, it invokes the LLM with the template filled in. No recursion, no loops, no control flow bugs. Just: here is a list, here is what to do with each element, go.
Agentic-Map is the same idea, but each iteration is a full agent invocation instead of a single LLM call. This is for tasks where each list item requires multi-step reasoning -- like processing a directory of files where each file might need clarification, retries, or sub-tasks.
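The two operators have the same shape, and that shape is the point: the system owns the loop, so there is no model-written control flow to get wrong. A minimal sketch, assuming hypothetical `call_llm` and `run_agent` stand-ins for a single model call and a full agent run (the operator names come from the paper; the signatures are illustrative):

```python
# Illustrative signatures for LLM-Map and Agentic-Map. Termination is
# guaranteed by the finite input list -- the model never writes the loop.
from typing import Callable, Iterable

def llm_map(items: Iterable, template: str,
            call_llm: Callable[[str], str]) -> list:
    """One single-shot LLM call per item, template filled per element."""
    return [call_llm(template.format(item=item)) for item in items]

def agentic_map(items: Iterable, task: str,
                run_agent: Callable[[str], str]) -> list:
    """Same shape, but each element gets a full agent invocation that may
    take multiple internal steps (retries, sub-tasks) before returning."""
    return [run_agent(f"{task}\n\nInput: {item}") for item in items]
```

A caller would use `llm_map(filenames, "Summarize the file {item}", call_llm)` for per-item transforms, and reach for `agentic_map` only when each element needs multi-step work.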
The trade-off is explicit. RLM gives you a Turing-complete REPL. LCM gives you safe, guaranteed-termination operators. RLM is more flexible. LCM is more reliable.
The instinct when another team publishes results that beat yours is to treat it as competition. That instinct is wrong here.
LCM does not replace RLM. It extends it. The hierarchical summary DAG, the immutable message store, the recursive invocation primitives -- all of these are implementations of the same insight that drives RLM: you manage context by treating it as an external environment the model can programmatically explore, not a fixed input you shove through the neural network.
The MIT team that built RLM optimized for flexibility. The Voltropy team optimized for reliability. Both approaches share the fundamental architecture: recursive invocation, bounded context per call, external memory that the model reads and writes programmatically.
The paper even says it outright in the conclusion: "These two approaches need not be mutually exclusive -- just as GOTO remains available in modern programming languages for the rare cases where it is the right tool." LCM and RLM can coexist. They solve overlapping but distinct problems.
Beyond the headline numbers, five design decisions in LCM stand out.
Zero-cost continuity. If your task fits in one context window, LCM does nothing. No summarization overhead, no extra invocations, no performance hit. This is critical for adoption -- you do not pay for features you are not using.
Lossless retrievability. Every message is preserved verbatim in the immutable store. Summaries are lossy, but the originals are always available. If the model needs to go back and re-read something in full, it can. This is the answer to the "what if the summary drops something important" concern.
Guaranteed convergence. The three-level escalation ensures that the active context always fits, no matter what. Even if summarization fails completely, deterministic truncation catches it. This means LCM can handle adversarial inputs that are deliberately designed to break summarization.
Scope-reduction invariant. Agentic-Map enforces a rule: each recursive call must operate on a strict subset of the parent's scope. This prevents infinite recursion -- the paper gives a formal proof that every Agentic-Map invocation terminates. That is a strong claim, and it matters for production deployments where runaway recursion could burn budget or lock up a service.
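Why the strict-subset rule forces termination can be shown with a toy recursion (this is an illustration of the invariant, not the paper's enforcement mechanism; the function and the splitting strategy are hypothetical):

```python
# Toy recursion obeying the scope-reduction invariant: every child scope
# is a strict subset of its parent's, so depth is bounded by |scope|.
def process(scope: frozenset, depth: int = 0) -> int:
    # Base case: a scope of one item is handled directly.
    if len(scope) <= 1:
        return depth
    # Recursive case: split the scope into two halves.
    items = sorted(scope)
    mid = len(items) // 2
    left, right = frozenset(items[:mid]), frozenset(items[mid:])
    # The invariant: each child operates on a STRICT subset of the parent.
    assert left < scope and right < scope
    return max(process(left, depth + 1), process(right, depth + 1))
```

Because the scope strictly shrinks at every level and cannot shrink forever, the recursion must bottom out -- no budget cap or timeout is needed to guarantee it.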
Exploration summaries for large files. When the model needs to work with a file larger than the context window, LCM does not just truncate it. It reads the file in chunks, writes a summary of each chunk, and presents the summaries to the model along with the ability to request full chunks on demand. This is the RLM pattern applied to file handling instead of conversational memory.
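The exploration-summary pattern for oversized files can be sketched as follows. The helper names are hypothetical and `summarize` again stands in for an LLM call; the structure -- summaries up front, verbatim chunks fetchable on demand -- is the point.

```python
# Sketch of exploration summaries for a file too large for the window
# (illustrative names). The model works from the summaries and requests
# full chunks only when it needs them.
def build_exploration_index(text: str, chunk_size: int):
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    summaries = [summarize(i, c) for i, c in enumerate(chunks)]

    def fetch(chunk_id: int) -> str:
        return chunks[chunk_id]          # the full chunk, verbatim

    return summaries, fetch

def summarize(i: int, chunk: str) -> str:
    # Placeholder for an LLM summarization call.
    return f"[chunk {i}: {len(chunk)} chars]"
```

The same two-tier shape -- lossy index over a lossless store -- appears in the conversational memory design above, which is why the paper frames this as one pattern applied to two substrates.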
Volt, the agent built on LCM, is a fork of OpenCode (itself a Claude Code clone). The architecture is Claude Opus 4.6 running through the LCM context manager. It does not use a bigger model. It does not have a bigger context window. It just manages context better.
On OOLONG, a benchmark designed to test long-context coding tasks, Volt outperforms Claude Code at every tested length: 32K, 64K, 128K, 256K, 512K, and 1M tokens. The performance gap widens as context grows -- exactly what you would expect if the problem is context management, not model capacity.
At 256K tokens, Volt scores 84.8 versus Claude Code's 74.8. At 512K, it is 87.4 versus 74.8. These are not minor improvements. They are categorical differences in capability.
The takeaway: better context management beats bigger context windows. This is the core thesis of RLM, now empirically validated by an independent team on a widely-used benchmark.
The RLM paradigm is no longer speculative. It is proven. A second team built a production-grade system on the same principles, beat the state-of-the-art agent from Anthropic, and published their results.
The design space is wide open. MIT's RLM optimized for expressiveness -- give the model a REPL and let it write programs. Voltropy's LCM optimized for safety and reliability -- give the model structured operators with guaranteed termination. There is room for both, and probably room for a dozen other points along the spectrum.
The common thread is this: recursive context management is the right abstraction for long-context tasks. Not bigger windows. Not better attention mechanisms. Recursion.
The context window race was never going to solve the hard problems. LCM proves it. RLM predicted it. The era of treating LLMs as stateless functions that process their entire input in one forward pass is ending. The era of treating them as agents that recursively explore external state is beginning.
That is not a competitor to RLM. That is the paradigm winning.