
RLM Class Reference

Complete API documentation for the core RLM class.

Overview

The RLM class is the main entry point for Recursive Language Model completions. It wraps an LM client and execution environment to enable iterative, code-augmented reasoning.
from rlm import RLM

rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-5"},
)

Constructor

RLM(
    backend: str = "openai",
    backend_kwargs: dict | None = None,
    environment: str = "local",
    environment_kwargs: dict | None = None,
    depth: int = 0,
    max_depth: int = 1,
    max_iterations: int = 30,
    custom_system_prompt: str | None = None,
    other_backends: list[str] | None = None,
    other_backend_kwargs: list[dict] | None = None,
    logger: RLMLogger | None = None,
    verbose: bool = False,
)

backend

Type: Literal["openai", "portkey", "openrouter", "vllm", "litellm", "anthropic"]
Default: "openai"
The LM provider backend to use for the root model.
# OpenAI
rlm = RLM(backend="openai", ...)

# Anthropic
rlm = RLM(backend="anthropic", ...)

# Local vLLM server
rlm = RLM(backend="vllm", ...)

backend_kwargs

Type: dict[str, Any] | None
Default: None
Configuration passed to the LM client. Required fields vary by backend:
Backend      Required               Optional
openai       model_name             api_key, base_url
anthropic    model_name             api_key
portkey      model_name, api_key    base_url
openrouter   model_name             api_key
vllm         model_name, base_url
litellm      model_name             varies by provider
backend_kwargs = {
    "api_key": "sk-...",
    "model_name": "gpt-4o",
    "base_url": "https://api.openai.com/v1",
}

environment

Type: Literal["local", "modal", "docker"]
Default: "local"
The execution environment for running generated code.
Environment   Description
local         Same-process execution with sandboxed builtins
docker        Containerized execution in Docker
modal         Cloud sandbox via Modal

environment_kwargs

Type: dict[str, Any] | None
Default: None
Configuration for the execution environment.
Local:
environment_kwargs = {
    "setup_code": "import numpy as np",
}
Docker:
environment_kwargs = {
    "image": "python:3.11-slim",
}
Modal:
environment_kwargs = {
    "app_name": "my-rlm-app",
    "timeout": 600,
    "image": modal.Image...,
}

max_depth

Type: int
Default: 1
Maximum recursion depth for nested RLM calls. Currently only depth 1 is fully supported. When depth >= max_depth, the RLM falls back to a regular LM completion.
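The fallback behavior can be sketched as follows; the helper names `plain_lm_completion` and `rlm_loop` are hypothetical stand-ins, not part of the API:

```python
def complete(prompt, depth, max_depth=1):
    # Sketch of the depth guard: once the current depth reaches
    # max_depth, fall back to a plain LM completion instead of
    # starting another REPL-driven RLM loop.
    if depth >= max_depth:
        return plain_lm_completion(prompt)
    return rlm_loop(prompt, depth)

def plain_lm_completion(prompt):
    # Stand-in for a single, non-recursive LM call.
    return f"LM answer to: {prompt}"

def rlm_loop(prompt, depth):
    # Stand-in for the REPL loop; a nested sub-call runs one level deeper.
    return complete(prompt, depth + 1)
```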

max_iterations

Type: int
Default: 30
Maximum number of REPL iterations before forcing a final answer. Each iteration consists of:
  1. LM generates response (potentially with code blocks)
  2. Code blocks are executed
  3. Results are appended to conversation history
rlm = RLM(
    ...,
    max_iterations=50,
)
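The three-step cycle above can be sketched as a loop, with `generate` and `execute` as stand-ins for the LM and the execution environment (neither is the real implementation):

```python
def repl_loop(generate, execute, max_iterations=30):
    # Hypothetical sketch of the iteration cycle described above.
    history = []
    for _ in range(max_iterations):
        response = generate(history)           # 1. LM generates a response
        history.append(("assistant", response))
        if "FINAL(" in response:               # completion signal found
            return response
        result = execute(response)             # 2. code blocks are executed
        history.append(("execution", result))  # 3. results join the history
    # Budget exhausted: ask the LM once more for a final answer.
    return generate(history + [("user", "Give a final answer now.")])
```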

custom_system_prompt

Type: str | None
Default: None
Override the default RLM system prompt. The default prompt instructs the LM on:
  • How to use the context variable
  • How to call llm_query() and llm_query_batched()
  • How to signal completion with FINAL()
custom_prompt = """You are a data analysis expert.
Use the REPL to analyze the context variable.
When done, output FINAL(your answer)."""

rlm = RLM(
    ...,
    custom_system_prompt=custom_prompt,
)

other_backends / other_backend_kwargs

Type: list[str] | None / list[dict] | None
Default: None
Register additional LM backends available for sub-calls via llm_query().
rlm = RLM(
    backend="openai",
    backend_kwargs={"model_name": "gpt-4o"},
    other_backends=["anthropic", "openai"],
    other_backend_kwargs=[
        {"model_name": "claude-sonnet-4-20250514"},
        {"model_name": "gpt-4o-mini"},
    ],
)

# Inside REPL, code can call:
# llm_query(prompt)  # Uses default (gpt-4o)
# llm_query(prompt, model="claude-sonnet-4-20250514")  # Uses Claude
# llm_query(prompt, model="gpt-4o-mini")  # Uses GPT-4o-mini

logger

Type: RLMLogger | None
Default: None
Logger for saving iteration trajectories to disk.
from rlm.logger import RLMLogger

logger = RLMLogger(log_dir="./logs")
rlm = RLM(..., logger=logger)

verbose

Type: bool
Default: False
Enable rich console output showing:
  • Metadata at startup
  • Each iteration’s response
  • Code execution results
  • Final answer and statistics

Methods

completion()

Main entry point for RLM completions.
def completion(
    self,
    prompt: str | dict[str, Any],
    root_prompt: str | None = None,
) -> RLMChatCompletion

Parameters

prompt
The context/input to process. Becomes the context variable in the REPL.
result = rlm.completion("Analyze this text...")

result = rlm.completion({
    "documents": [...],
    "query": "Find relevant sections",
})

result = rlm.completion(["doc1", "doc2", "doc3"])
root_prompt
Optional short prompt shown to the root LM. Useful for Q&A tasks where the question should be visible throughout.
result = rlm.completion(
    prompt=long_document,
    root_prompt="What is the main theme of this document?"
)

Returns

RLMChatCompletion dataclass:
@dataclass
class RLMChatCompletion:
    root_model: str
    prompt: str | dict
    response: str
    usage_summary: UsageSummary
    execution_time: float

Example

result = rlm.completion(
    "Calculate the factorial of 100 and return the number of digits."
)

print(result.response)
print(result.execution_time)
print(result.usage_summary.to_dict())

Response Types

RLMChatCompletion

from rlm.core.types import RLMChatCompletion

result: RLMChatCompletion = rlm.completion(...)

result.root_model
result.prompt
result.response
result.execution_time
result.usage_summary

UsageSummary

from rlm.core.types import UsageSummary

usage: UsageSummary = result.usage_summary
usage.to_dict()

Error Handling

RLM follows a “fail fast” philosophy: invalid configuration raises at construction time rather than partway through a completion.
# Missing the required base_url for the vllm backend — fails at construction.
rlm = RLM(
    backend="vllm",
    backend_kwargs={"model_name": "llama"},
)

# Unrecognized backend names also fail at construction.
rlm = RLM(backend="unknown")
If the RLM exhausts max_iterations without finding a FINAL() answer, it prompts the LM one more time to provide a final answer based on the conversation history.

Thread Safety

Each completion() call:
  1. Spawns its own LMHandler socket server
  2. Creates a fresh environment instance
  3. Cleans up both when done
This makes completion() calls independent, but the RLM instance itself should not be shared across threads without external synchronization.
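One safe pattern under that constraint is one instance per thread, sketched here with `make_rlm` as a hypothetical stand-in for `RLM(...)` construction:

```python
import threading

_local = threading.local()

def make_rlm():
    # Stand-in for RLM(backend=..., backend_kwargs=...) construction.
    return object()

def get_rlm():
    # Lazily build one instance per thread instead of sharing a
    # single RLM object across threads.
    if not hasattr(_local, "rlm"):
        _local.rlm = make_rlm()
    return _local.rlm
```

Because `completion()` calls are independent, each thread can then call `get_rlm().completion(...)` without external locking.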

Example: Full Configuration

import os
from rlm import RLM
from rlm.logger import RLMLogger

logger = RLMLogger(log_dir="./logs", file_name="analysis")

rlm = RLM(
    backend="anthropic",
    backend_kwargs={
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
        "model_name": "claude-sonnet-4-20250514",
    },
    environment="docker",
    environment_kwargs={
        "image": "python:3.11-slim",
    },
    other_backends=["openai"],
    other_backend_kwargs=[{
        "api_key": os.getenv("OPENAI_API_KEY"),
        "model_name": "gpt-4o-mini",
    }],
    max_iterations=40,
    max_depth=1,
    logger=logger,
    verbose=True,
)

result = rlm.completion(
    prompt=massive_document,
    root_prompt="Summarize the key findings"
)