Overcome the Context Window Limitation with Amazon Bedrock AgentCore

Overcoming Context Window Limitations in Document Analysis Using Recursive Language Models



Unlocking Document Analysis with Recursive Language Models

Documents that run to millions of characters quickly exceed the context window of a language model (LM), and even the largest context windows fall short. The model either rejects the input outright or returns an incomplete answer because it never saw the relevant context. So how do you analyze documents that exceed these limits?

In this post, we’ll explore how to implement Recursive Language Models (RLMs) using Amazon Bedrock AgentCore Code Interpreter and the Strands Agents SDK. By the end, you will be equipped with knowledge to:

  1. Process documents whose length is not bounded by the model’s context window.
  2. Use the Bedrock AgentCore Code Interpreter as persistent working memory for iterative document analysis.
  3. Orchestrate sub-LLM (sub-large language model) calls from within a sandboxed Python environment to analyze specific sections of a document.

Why Context Windows Aren’t Enough

Imagine you’re analyzing financial data across two annual reports from a single company, each between 300 and 500 pages long. Add analyst reports, SEC filings, and supplementary materials, and you’re looking at millions of characters. Feeding all of this directly into a model can fail in two ways:

  1. The input exceeds the model’s context window, and the request fails with an error.
  2. The input fits, but the model struggles to attend to critical information buried in the middle of a long input. This is known as the “lost in the middle” problem.

Both failure modes show that context window size is a hard limit that prompt engineering alone cannot work around. What is needed is an approach that decouples document size from the model’s context window.
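A rough back-of-the-envelope check shows how quickly such a corpus outgrows a context window. The sketch below assumes about 3,000 characters per page, about 4 characters per token, and a 200,000-token window. All three figures are illustrative assumptions, not values from this post or from any specific model.

```python
# Rough estimate of whether a document set fits a context window.
# Every constant below is an illustrative assumption.
CHARS_PER_PAGE = 3_000            # assumed average for a dense report page
CHARS_PER_TOKEN = 4               # common rule of thumb for English text
CONTEXT_WINDOW_TOKENS = 200_000   # hypothetical model limit


def estimate_tokens(pages: int) -> int:
    """Estimate the token count of a document with the given page count."""
    return pages * CHARS_PER_PAGE // CHARS_PER_TOKEN


def fits_in_window(page_counts: list[int]) -> bool:
    """Return True if all documents together fit in the assumed window."""
    total = sum(estimate_tokens(p) for p in page_counts)
    return total <= CONTEXT_WINDOW_TOKENS


two_reports = [400, 400]  # two annual reports of about 400 pages each
print(sum(estimate_tokens(p) for p in two_reports))
print(fits_in_window(two_reports))
```

Under these assumptions the two reports alone already exceed the window, before any filings or analyst reports are added.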

RLMs: Treating Context as an Environment

Recursive Language Models, introduced by Zhang et al. (arXiv:2512.24601), rethink how a model interacts with its context. Instead of feeding the entire document into the model’s context, an RLM treats the input as an external environment that the model interacts with programmatically.

How RLMs Work

  1. Orchestration: The root LLM generates code to explore the document environment.
  2. Delegation: It delegates semantic analysis to sub-LLMs for specific chunks.
  3. Accumulation: Results are stored in a working memory, refining the analysis step-by-step.

This structure allows the root LLM to manage the analysis without ever needing the full document in its context window.
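The three roles can be illustrated with a small, self-contained sketch. It is a simplification: in an actual RLM the root LLM writes the exploration code itself, whereas here the loop is hand-written, and `stub_sub_llm` is a hypothetical stand-in for a real model call.

```python
from typing import Callable


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Orchestration: plain code slices the document; no model sees all of it."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def recursive_answer(document: str, question: str,
                     sub_llm: Callable[[str], str],
                     chunk_size: int = 2_000) -> list[str]:
    """Delegate each chunk to a sub-LLM and accumulate the non-empty findings."""
    findings: list[str] = []  # working memory that lives outside any model context
    for chunk in split_into_chunks(document, chunk_size):
        prompt = f"Question: {question}\n\nText:\n{chunk}"
        reply = sub_llm(prompt)        # delegation
        if reply:
            findings.append(reply)     # accumulation
    return findings


def stub_sub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: reports lines mentioning revenue."""
    text = prompt.split("Text:\n", 1)[1]
    hits = [line for line in text.splitlines() if "revenue" in line.lower()]
    return "; ".join(hits)


doc = "Intro\nRevenue grew 10%\n" + "filler\n" * 500 + "Revenue fell 2%\n"
print(recursive_answer(doc, "What happened to revenue?", stub_sub_llm))
```

Only the short findings list would need to reach the root model, which is why the document size no longer has to fit in its context window.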

Architecture Overview

The implementation of RLM using Amazon Bedrock AgentCore Code Interpreter involves three main components:

  1. Root LLM Agent: Built with the Strands Agents SDK, it receives the user query and decides what code to run next.
  2. Amazon Bedrock AgentCore Code Interpreter: Running in public network mode, it holds the full document as a Python variable.
  3. Sub-LLM Calls: Code written by the root LLM calls sub-LLMs directly from within the Code Interpreter, so the results stay in Python variables.

[Figure: Diagram of RLM Architecture]

The architecture relies on the persistent session state of the Code Interpreter, so intermediate results and extracted data accumulate across code executions.
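Why persistent state matters can be shown with a purely local stand-in. The `LocalSession` class below is hypothetical and is not the AgentCore API. It only mimics one property of a stateful sandbox: variables defined in one execution remain available in the next.

```python
class LocalSession:
    """Local stand-in for a stateful sandbox: variables persist between runs."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run(self, code: str) -> None:
        # exec is acceptable here only because the snippets are our own strings.
        exec(code, self.namespace)


session = LocalSession()
session.run("context = 'Q1 revenue: 10\\nQ2 revenue: 12'")
session.run("lines = context.splitlines()")                     # reuses `context`
session.run("totals = [int(l.split(': ')[1]) for l in lines]")  # reuses `lines`
session.run("summary = sum(totals)")                            # builds on earlier results
print(session.namespace["summary"])
```

Each step is small enough to be written by a model in a single turn, yet the final value depends on all earlier steps.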

Implementation Steps

To get started, ensure you meet a few prerequisites:

  • An AWS account with access to Amazon Bedrock foundation models.
  • Python 3.10 or later.
  • The AWS Command Line Interface (AWS CLI), configured with credentials.
  • IAM permissions for the Amazon Bedrock operations used in the steps below.

Step 1: Initiate a Code Interpreter Session

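import boto3

# Assumed to be defined beforehand:
#   code_interpreter_id: ID of your Code Interpreter resource
#   document: the full text to analyze, as a single string
client = boto3.client("bedrock-agentcore", region_name="us-east-1")

# Start a session; its state persists across calls until the timeout.
response = client.start_code_interpreter_session(
    codeInterpreterIdentifier=code_interpreter_id,
    name="rlm-session",
    sessionTimeoutSeconds=3600,
)
session_id = response["sessionId"]

# Write the document into the sandbox file system so that it never
# has to pass through the root LLM's context window.
client.invoke_code_interpreter(
    codeInterpreterIdentifier=code_interpreter_id,
    sessionId=session_id,
    name="writeFiles",
    arguments={"content": [{"path": "_context.txt", "text": document}]},
)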
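Step 1 starts a session but does not show how to end it. One way to make cleanup automatic is a context manager, sketched below. The client is passed in as a parameter, and `FakeClient` is a hypothetical in-memory stand-in so the sketch runs without AWS access. With a real boto3 `bedrock-agentcore` client, the same two calls would be issued against the service.

```python
from contextlib import contextmanager


@contextmanager
def code_interpreter_session(client, code_interpreter_id: str,
                             name: str = "rlm-session",
                             timeout_seconds: int = 3600):
    """Start a session, yield its ID, and always stop it afterwards."""
    response = client.start_code_interpreter_session(
        codeInterpreterIdentifier=code_interpreter_id,
        name=name,
        sessionTimeoutSeconds=timeout_seconds,
    )
    session_id = response["sessionId"]
    try:
        yield session_id
    finally:
        client.stop_code_interpreter_session(
            codeInterpreterIdentifier=code_interpreter_id,
            sessionId=session_id,
        )


class FakeClient:
    """Hypothetical in-memory stand-in so the sketch runs without AWS access."""

    def __init__(self) -> None:
        self.stopped: list[str] = []

    def start_code_interpreter_session(self, **kwargs) -> dict:
        return {"sessionId": "fake-session-1"}

    def stop_code_interpreter_session(self, **kwargs) -> None:
        self.stopped.append(kwargs["sessionId"])


fake = FakeClient()
with code_interpreter_session(fake, "my-interpreter-id") as sid:
    print("working in", sid)
print("stopped:", fake.stopped)
```

Because the stop call sits in a `finally` block, the session is released even if the analysis raises an exception partway through.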

Step 2: Define the llm_query Helper Within the Sandbox

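import json
import boto3

# Runs inside the Code Interpreter sandbox. Assumes sub_model_id holds the
# Bedrock model ID of the sub-LLM and that the sandbox has credentials
# permitted to invoke it.
bedrock_client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Load the document written in Step 1 into a persistent Python variable.
with open("_context.txt", "r") as f:
    context = f.read()

def llm_query(prompt: str) -> str:
    """Send one prompt to the sub-LLM and return its text reply."""
    response = bedrock_client.invoke_model(
        modelId=sub_model_id,
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4096,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    result = json.loads(response["body"].read())
    return result["content"][0]["text"]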
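With `context` and `llm_query` available in the sandbox, the root LLM can write code that narrows the document down before spending any sub-LLM calls. The sketch below shows one such pattern, regex filtering followed by delegation. It is an illustration of the kind of code the root LLM might generate, not code taken from the implementation. `stub_llm_query` is a hypothetical stand-in for `llm_query` so the example runs anywhere.

```python
import re
from typing import Callable


def find_sections(context: str, pattern: str, window: int = 200) -> list[str]:
    """Return a text window around each regex match, found with plain code."""
    sections = []
    for match in re.finditer(pattern, context, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(context), match.end() + window)
        sections.append(context[start:end])
    return sections


def query_sections(context: str, pattern: str, question: str,
                   llm_query: Callable[[str], str]) -> list[str]:
    """Ask the sub-LLM about each matching section only, not the whole text."""
    return [
        llm_query(f"{question}\n\nExcerpt:\n{section}")
        for section in find_sections(context, pattern)
    ]


def stub_llm_query(prompt: str) -> str:
    """Hypothetical stand-in for the sandbox's llm_query helper."""
    return f"answered from {len(prompt)} characters"


sample = "x" * 5_000 + " Total revenue was $4.2B. " + "y" * 5_000
answers = query_sections(sample, r"total revenue", "What was revenue?", stub_llm_query)
print(answers)
```

Nearby matches produce overlapping windows in this sketch, so a fuller version would merge them before querying.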

Step 3: Create a Strands Agent and Run Your Query

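from strands import Agent

# Assumed to be defined beforehand:
#   rlm_system_prompt: instructions telling the root LLM how to explore
#                      `context` and when to call llm_query
#   execute_python: a Strands tool that runs code in the Code Interpreter session
agent = Agent(
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    system_prompt=rlm_system_prompt,
    tools=[execute_python],
)

answer = agent("What are the key revenue trends across these reports?")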

Through this agent, the model iteratively writes and executes code to explore the loaded document.
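Step 3 passes an `execute_python` tool without defining it. A minimal sketch of the function such a tool could wrap is shown below. The response shape assumed here, a `stream` of events whose `result` carries a `content` list of text items, follows the pattern seen in AWS examples and should be checked against the current API reference. `FakeClient` is a hypothetical stand-in for a dry run.

```python
def run_in_sandbox(client, code_interpreter_id: str, session_id: str, code: str) -> str:
    """Execute Python code in the session and return its text output."""
    response = client.invoke_code_interpreter(
        codeInterpreterIdentifier=code_interpreter_id,
        sessionId=session_id,
        name="executeCode",
        arguments={"language": "python", "code": code},
    )
    parts = []
    for event in response["stream"]:
        # Assumed event shape: {"result": {"content": [{"type": "text", "text": ...}]}}
        for item in event.get("result", {}).get("content", []):
            if item.get("type") == "text":
                parts.append(item["text"])
    return "\n".join(parts)


class FakeClient:
    """Hypothetical stand-in that echoes the submitted code, for a dry run."""

    def invoke_code_interpreter(self, **kwargs) -> dict:
        code = kwargs["arguments"]["code"]
        event = {"result": {"content": [{"type": "text", "text": f"ran: {code}"}]}}
        return {"stream": [event]}


print(run_in_sandbox(FakeClient(), "my-interpreter-id", "fake-session", "print(len(context))"))
```

To expose this to the agent, one option is to wrap a closure over the real client and session ID in a function decorated with the Strands `@tool` decorator and name it `execute_python`.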

Evaluation of RLM Effectiveness

In our evaluations on Financial Multi-Document QA tasks, RLM reached a 100% success rate, avoiding the input limit errors that occur when documents are passed directly to the model.

Model                          Approach   Success Rate   Accuracy
Claude Haiku 4.5 + Haiku 4.5   RLM        100%           66.7%

The RLM architecture improves success rates and, by breaking complex tasks into smaller sub-queries, also improves accuracy.
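Success rate and accuracy measure different things. One plausible reading, assumed here rather than stated in the evaluation, is that success rate counts runs that finish without an error, while accuracy counts runs whose answer is correct. The records below are made-up illustrations and are not the evaluation data.

```python
from dataclasses import dataclass


@dataclass
class Run:
    completed: bool  # finished without an error such as an input limit error
    correct: bool    # answer matched the reference


def success_rate(runs: list[Run]) -> float:
    """Fraction of runs that completed without an error."""
    return sum(r.completed for r in runs) / len(runs)


def accuracy(runs: list[Run]) -> float:
    """Fraction of all runs that completed and answered correctly."""
    return sum(r.completed and r.correct for r in runs) / len(runs)


# Made-up records for illustration only.
runs = [Run(True, True), Run(True, False), Run(True, True), Run(False, False)]
print(f"success rate: {success_rate(runs):.1%}, accuracy: {accuracy(runs):.1%}")
```

Under this reading, a system can complete every run and still answer some questions incorrectly, which is why the two columns can differ.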

Conclusion

Recursive Language Models present a robust solution for processing large documents that exceed standard model context limits. By leveraging Amazon Bedrock AgentCore Code Interpreter alongside the Strands Agents SDK, RLM can effectively analyze and reason over extensive input data.

This approach is beneficial not just for financial analyses or document reviews, but across multiple domains including healthcare, legal review, and programming tasks.

If you implement this methodology in your projects, share your experiences! Your insights could enrich the conversation surrounding advanced document analysis strategies. Let’s innovate together!
