
RLM Class Reference

Complete API documentation for the core RLM class.

Overview

The RLM class is the main entry point for Recursive Language Model completions. It wraps an LM client and execution environment to enable iterative, code-augmented reasoning.
from rlm import RLM

rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5"},
)

Constructor

RLM(
    backend: str = "openai",
    backend_kwargs: dict | None = None,
    environment: str = "local",
    environment_kwargs: dict | None = None,
    depth: int = 0,
    max_depth: int = 1,
    max_iterations: int = 30,
    custom_system_prompt: str | None = None,
    other_backends: list[str] | None = None,
    other_backend_kwargs: list[dict] | None = None,
    logger: RLMLogger | None = None,
    verbose: bool = False,
)

backend

Type: Literal["openai", "portkey", "openrouter", "vllm", "litellm", "anthropic"]
Default: "openai"
The LM provider backend to use for the root model.
# OpenAI
rlm = RLM(backend="openai", ...)

# Anthropic
rlm = RLM(backend="anthropic", ...)

# Local vLLM server
rlm = RLM(backend="vllm", ...)

backend_kwargs

Type: dict[str, Any] | None
Default: None
Configuration passed to the LM client. Required fields vary by backend:
Backend      Required               Optional
openai       model_name             api_key, base_url
anthropic    model_name             api_key
portkey      model_name, api_key    base_url
openrouter   model_name             api_key
vllm         model_name, base_url
litellm      model_name             varies by provider
backend_kwargs = {
    "api_key": "sk-...",
    "model_name": "gpt-4o",
    "base_url": "https://api.openai.com/v1",
}

environment

Type: Literal["local", "modal", "docker"]
Default: "local"
The execution environment for running generated code.
Environment   Description
local         Same-process execution with sandboxed builtins
docker        Containerized execution in Docker
modal         Cloud sandbox via Modal

environment_kwargs

Type: dict[str, Any] | None
Default: None
Configuration for the execution environment.
Local:
environment_kwargs = {
    "setup_code": "import numpy as np",
}
Docker:
environment_kwargs = {
    "image": "python:3.11-slim",
}
Modal:
environment_kwargs = {
    "app_name": "my-rlm-app",
    "timeout": 600,
    "image": modal.Image...,
}

max_depth

Type: int
Default: 1
Maximum recursion depth for nested RLM calls. Currently only depth 1 is fully supported. When depth >= max_depth, the RLM falls back to a regular LM completion.
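The fallback behavior can be sketched as follows; the helper names `plain_lm_completion` and `rlm_loop` are hypothetical stand-ins, not part of the API:

```python
def complete(prompt, depth, max_depth=1):
    # Sketch of the depth guard: once the current depth reaches
    # max_depth, fall back to a plain LM completion instead of
    # starting another REPL-driven RLM loop.
    if depth >= max_depth:
        return plain_lm_completion(prompt)
    return rlm_loop(prompt, depth)

def plain_lm_completion(prompt):
    # Stand-in for a single, non-recursive LM call.
    return f"LM answer to: {prompt}"

def rlm_loop(prompt, depth):
    # Stand-in for the REPL loop; a nested sub-call runs one level deeper.
    return complete(prompt, depth + 1)
```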

max_iterations

Type: int
Default: 30
Maximum number of REPL iterations before forcing a final answer. Each iteration consists of:
  1. LM generates response (potentially with code blocks)
  2. Code blocks are executed
  3. Results are appended to conversation history
rlm = RLM(
    ...,
    max_iterations=50,
)
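The three-step cycle above can be sketched as a loop, with `generate` and `execute` as stand-ins for the LM and the execution environment (neither is the real implementation):

```python
def repl_loop(generate, execute, max_iterations=30):
    # Hypothetical sketch of the iteration cycle described above.
    history = []
    for _ in range(max_iterations):
        response = generate(history)           # 1. LM generates a response
        history.append(("assistant", response))
        if "FINAL(" in response:               # completion signal found
            return response
        result = execute(response)             # 2. code blocks are executed
        history.append(("execution", result))  # 3. results join the history
    # Budget exhausted: ask the LM once more for a final answer.
    return generate(history + [("user", "Give a final answer now.")])
```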

custom_system_prompt

Type: str | None
Default: None
Override the default RLM system prompt. The default prompt instructs the LM on:
  • How to use the context variable
  • How to call llm_query() and llm_query_batched()
  • How to signal completion with FINAL()
custom_prompt = """You are a data analysis expert.
Use the REPL to analyze the context variable.
When done, output FINAL(your answer)."""

rlm = RLM(
    ...,
    custom_system_prompt=custom_prompt,
)

other_backends / other_backend_kwargs

Type: list[str] | None / list[dict] | None
Default: None
Register additional LM backends available for sub-calls via llm_query().
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-4o"},
    other_backends=["anthropic", "openai"],
    other_backend_kwargs=[
        {"model_name": "claude-sonnet-4-20250514"},
        {"model_name": "gpt-4o-mini"},
    ],
)

# Inside REPL, code can call:
# llm_query(prompt)  # Uses default (gpt-4o)
# llm_query(prompt, model="claude-sonnet-4-20250514")  # Uses Claude
# llm_query(prompt, model="gpt-4o-mini")  # Uses GPT-4o-mini

logger

Type: RLMLogger | None
Default: None
Logger for saving iteration trajectories to disk.
from rlm.logger import RLMLogger

logger = RLMLogger(log_dir="./logs")
rlm = RLM(..., logger=logger)

verbose

Type: bool
Default: False
Enable rich console output showing:
  • Metadata at startup
  • Each iteration’s response
  • Code execution results
  • Final answer and statistics

Methods

completion()

Main entry point for RLM completions.
def completion(
    self,
    prompt: str | dict[str, Any],
    root_prompt: str | None = None,
) -> RLMChatCompletion

Parameters

prompt
The context/input to process. Becomes the context variable in the REPL.
result = rlm.completion("Analyze this text...")

result = rlm.completion({
    "documents": [...],
    "query": "Find relevant sections",
})

result = rlm.completion(["doc1", "doc2", "doc3"])
root_prompt
Optional short prompt shown to the root LM. Useful for Q&A tasks where the question should be visible throughout.
result = rlm.completion(
    prompt=long_document,
    root_prompt="What is the main theme of this document?"
)

Returns

RLMChatCompletion dataclass:
@dataclass
class RLMChatCompletion:
    root_model: str
    prompt: str | dict
    response: str
    usage_summary: UsageSummary
    execution_time: float

Example

result = rlm.completion(
    "Calculate the factorial of 100 and return the number of digits."
)

print(result.response)
print(result.execution_time)
print(result.usage_summary.to_dict())

Response Types

RLMChatCompletion

from rlm.core.types import RLMChatCompletion

result: RLMChatCompletion = rlm.completion(...)

result.root_model
result.prompt
result.response
result.execution_time
result.usage_summary

UsageSummary

from rlm.core.types import UsageSummary

usage: UsageSummary = result.usage_summary
usage.to_dict()

Error Handling

RLM follows a “fail fast” philosophy: invalid configuration raises at construction time rather than partway through a completion.
# Missing the required base_url for the vllm backend — fails at construction.
rlm = RLM(
    backend="vllm",
    backend_kwargs={"model_name": "llama"},
)

# Unrecognized backend names also fail at construction.
rlm = RLM(backend="unknown")
If the RLM exhausts max_iterations without finding a FINAL() answer, it prompts the LM one more time to provide a final answer based on the conversation history.

Thread Safety

Each completion() call:
  1. Spawns its own LMHandler socket server
  2. Creates a fresh environment instance
  3. Cleans up both when done
This makes completion() calls independent, but the RLM instance itself should not be shared across threads without external synchronization.
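One safe pattern under that constraint is one instance per thread, sketched here with `make_rlm` as a hypothetical stand-in for `RLM(...)` construction:

```python
import threading

_local = threading.local()

def make_rlm():
    # Stand-in for RLM(backend=..., backend_kwargs=...) construction.
    return object()

def get_rlm():
    # Lazily build one instance per thread instead of sharing a
    # single RLM object across threads.
    if not hasattr(_local, "rlm"):
        _local.rlm = make_rlm()
    return _local.rlm
```

Because `completion()` calls are independent, each thread can then call `get_rlm().completion(...)` without external locking.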

Example: Full Configuration

import os
from rlm import RLM
from rlm.logger import RLMLogger

logger = RLMLogger(log_dir="./logs", file_name="analysis")

rlm = RLM(
    backend="anthropic",
    backend_kwargs={
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
        "model_name": "claude-sonnet-4-20250514",
    },
    environment="docker",
    environment_kwargs={
        "image": "python:3.11-slim",
    },
    other_backends=["openai"],
    other_backend_kwargs=[{
        "api_key": os.getenv("OPENAI_API_KEY"),
        "model_name": "gpt-4o-mini",
    }],
    max_iterations=40,
    max_depth=1,
    logger=logger,
    verbose=True,
)

result = rlm.completion(
    prompt=massive_document,
    root_prompt="Summarize the key findings"
)