Work in progress. These docs are minimal and will evolve.

Recursive Language Models (RLMs) are a task-agnostic inference paradigm that lets language models handle near-infinite-length contexts by programmatically examining, decomposing, and recursively calling themselves over their input. RLMs replace the canonical `llm.completion(prompt, model)` call with an `rlm.completion(prompt, model)` call, offloading the context into a variable in a REPL environment that the LM can interact with and launch sub-LM calls inside of.
## Installation
We use `uv`, but any virtual environment works.
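A typical setup from a local clone (a sketch; the distribution name is not given above, so this installs in editable mode):

```shell
uv venv
source .venv/bin/activate
uv pip install -e .  # editable install from a clone of this repo
```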
## Quick Start
### OpenAI

### Anthropic

### Portkey
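Across providers, usage boils down to the same call. An illustrative sketch (the import path, model identifiers, and environment-variable names are assumptions; only `rlm.completion(prompt, model)` is documented above):

```python
# Illustrative sketch only; import path and model names are assumptions.
import rlm

# OpenAI: expects OPENAI_API_KEY in the environment
answer = rlm.completion(prompt=long_context + "\n\nQuestion: ...", model="gpt-4o")

# Anthropic: expects ANTHROPIC_API_KEY in the environment
answer = rlm.completion(prompt=long_context + "\n\nQuestion: ...", model="claude-sonnet-4")

# Portkey: routes to any provider through a Portkey gateway key
answer = rlm.completion(prompt=long_context + "\n\nQuestion: ...", model="...")
```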
## REPL Environments
RLMs execute LM-generated Python code in a sandboxed REPL environment. We support two types of environments: non-isolated and isolated.

### Non-isolated environments

- `local` (default): same-process execution with sandboxed builtins. Fast, but shares memory with the host.
- `docker`: containerized execution in Docker. Better isolation and reproducible environments.
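A minimal sketch of what a `local` environment looks like, assuming a plain `exec()` loop with a restricted builtins table (the real implementation is more involved):

```python
# Sketch of a "local" non-isolated REPL: same-process exec() with
# sandboxed builtins. Illustrative only, not the real implementation.
SAFE_BUILTINS = {"len": len, "min": min, "max": max, "sum": sum,
                 "range": range, "print": print}

def run_cell(code: str, state: dict) -> None:
    # `state` is reused across calls, so variables persist like a REPL session.
    # Anything not in SAFE_BUILTINS (open, __import__, ...) raises NameError.
    exec(code, {"__builtins__": SAFE_BUILTINS}, state)

# The context is offloaded as a variable the LM-written code can inspect.
state = {"context": "some very long document ..."}
run_cell("n = len(context)", state)    # peek at the context's size
run_cell("head = context[:9]", state)  # `n` is still live from the prior cell
```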
### Isolated environments
- `modal`: cloud sandboxes via Modal. Production-ready, fully isolated from the host.
### Configuration examples
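A hypothetical configuration sketch (the `environment` parameter name is an assumption; only `rlm.completion(prompt, model)` appears above):

```python
# Hypothetical; parameter names may differ in the real API.
import rlm

answer = rlm.completion(prompt, model, environment="local")   # default, in-process
answer = rlm.completion(prompt, model, environment="docker")  # containerized
answer = rlm.completion(prompt, model, environment="modal")   # Modal cloud sandbox
```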
See the environments docs for details on each environment’s architecture and configuration.
## Core Components
RLMs handle contexts indirectly by storing them in a persistent REPL environment, where an LM can view them and run code over them. The LM can also sub-query (R)LMs (i.e., with `llm_query` calls) and produce a final answer based on what it observes. This design generally requires:
- Set up a REPL environment, where state is persisted across code execution turns.
- Put the prompt (or context) into a programmatic variable.
- Allow the model to write code that peeks into and decomposes the variable, and observes any side effects.
- Encourage the model, in its code, to recurse over shorter, programmatically constructed prompts.
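The steps above can be sketched end-to-end with a stubbed `llm_query` (a toy stand-in for a real sub-LM call; the chunk size and prompt wording here are arbitrary):

```python
def llm_query(prompt: str) -> str:
    # Stub: a real RLM would send this prompt to a (possibly recursive) LM.
    return f"<summary of {len(prompt)} chars>"

def rlm_completion(context: str, chunk_size: int = 100) -> str:
    # Steps 1-2: persist the context as a variable in the environment state.
    state = {"context": context}
    # Step 3: model-written code peeks into and decomposes the variable.
    chunks = [state["context"][i:i + chunk_size]
              for i in range(0, len(state["context"]), chunk_size)]
    # Step 4: recurse over shorter, programmatically constructed prompts.
    partials = [llm_query("Summarize: " + chunk) for chunk in chunks]
    return llm_query("Combine these partial answers: " + " ".join(partials))

print(rlm_completion("First point. Extra detail. " * 20))
```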