Monte Carlo RAG
How Monte Carlo search breaks the embedding ceiling
Dexrag uses Monte Carlo Tree Search (MCTS), the same algorithm behind AlphaGo, to intelligently explore documents and find the most relevant content.
The Embedding Problem
Traditional RAG systems use embeddings to find similar documents. But Google DeepMind research revealed fundamental limitations:
Mathematical Ceiling
- Fixed dimensions: a 4096-dimensional embedding space can only represent ~250M unique documents
- Semantic collapse: Similar concepts get forced into nearby vectors
- Degrading recall: Performance drops as document count increases
- LIMIT benchmark: GPT-4 embeddings achieve <20% recall on complex queries
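The recall ceiling can be seen even in a toy simulation: when queries are noisy versions of fixed-dimension document vectors, nearest-neighbor recall falls as the corpus grows. This is an illustration of the degradation effect, not the LIMIT benchmark itself; the dimensions and noise level below are arbitrary choices.

```python
import numpy as np

def top1_recall(n_docs, dim, noise=0.3, n_queries=200, seed=0):
    """Fraction of noisy queries whose nearest document vector is the true source."""
    rng = np.random.default_rng(seed)
    docs = rng.normal(size=(n_docs, dim))
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit "embeddings"
    targets = rng.integers(0, n_docs, size=n_queries)
    queries = docs[targets] + noise * rng.normal(size=(n_queries, dim))
    hits = (docs @ queries.T).argmax(axis=0) == targets  # nearest-neighbor lookup
    return hits.mean()

small = top1_recall(100, 64)     # small corpus: high recall
large = top1_recall(10_000, 64)  # same dimensions, 100x more docs: recall drops
```

With the corpus 100x larger but the dimensionality unchanged, more distractor vectors crowd the space around each query, so the true document is recovered less often.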
Static Training
- Internet-trained: models are trained on generic web text such as Reddit and Wikipedia
- Domain blind: Can't adapt to legal, medical, or technical terminology
- No learning: Performance never improves with usage
How Monte Carlo RAG Works
1. BM25 Baseline
Start with a proven text search algorithm (BM25) that doesn't require embeddings.
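BM25 is a standard lexical ranking function, so a baseline needs nothing beyond term counts. The sketch below is a minimal Okapi BM25 scorer for illustration; it is not Dexrag's implementation, and the tokenizer (lowercase whitespace split) is deliberately simplistic.

```python
import math
from collections import Counter

class BM25:
    """Minimal Okapi BM25 scorer over a small in-memory corpus."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = [doc.lower().split() for doc in docs]
        self.n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.n
        self.k1, self.b = k1, b
        # document frequency of each term
        self.df = Counter(t for d in self.docs for t in set(d))

    def idf(self, term):
        # Smoothed IDF (Lucene-style), always non-negative
        df = self.df.get(term, 0)
        return math.log((self.n - df + 0.5) / (df + 0.5) + 1)

    def score(self, query, index):
        doc = self.docs[index]
        tf = Counter(doc)
        # length normalization: long documents are penalized via b
        norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
        return sum(
            self.idf(t) * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in query.lower().split()
        )

corpus = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "quarterly revenue grew five percent",
]
bm25 = BM25(corpus)
```

A document that never mentions a query term contributes zero for that term, which is why BM25 gives exact-match precision without any trained vectors.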
2. Monte Carlo Exploration
Instead of finding "similar" documents, MCTS:
- Explores document structure intelligently
- Evaluates multiple paths through the content
- Selects the most promising areas to investigate
- Expands understanding based on what's found
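The "select the most promising areas" step is typically driven by an upper-confidence rule. The sketch below uses UCB1, the classic MCTS selection formula; whether Dexrag uses this exact rule or a variant isn't specified in this document, so treat it as an illustration of the select step.

```python
import math

def ucb1(value_sum, visits, parent_visits, c=1.4):
    """UCB1 score: exploitation (mean value) plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Pick the index of the child with the highest UCB1 score.

    children is a list of (value_sum, visits) statistics per child node.
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits),
    )
```

The exploration bonus shrinks as a node is visited more, so the search naturally shifts from broad exploration of the document tree to focused exploitation of the best paths.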
3. Document Structure Understanding
Dexrag analyzes:
- Headings and sections
- Tables and lists
- Citations and references
- Paragraph relationships
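Structure analysis means turning a flat document into a tree the search can descend. The sketch below builds a section tree from markdown-style `#` headings; Dexrag's actual parser handles more structure (tables, citations), and the heading detection here is deliberately naive.

```python
def section_tree(lines):
    """Nest markdown-style '#' headings into a section tree with body text."""
    root = {"title": "ROOT", "level": 0, "children": [], "body": []}
    stack = [root]  # stack[-1] is the section currently being filled
    for line in lines:
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            node = {"title": line.lstrip("# ").strip(), "level": level,
                    "children": [], "body": []}
            # close sections at the same or deeper level
            while stack[-1]["level"] >= level:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            stack[-1]["body"].append(line)
    return root

doc = """# Intro
some text
## Background
more text
# Methods
""".splitlines()
tree = section_tree(doc)
```

Each node in this tree is a candidate for the MCTS expand step: the search can descend into `Intro` or `Methods` first and only open subsections that look promising.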
4. Adaptive Learning
After each query, the system learns:
- Which documents users found relevant
- Common search patterns
- Domain-specific terminology
- User preferences
Performance Comparison
| Metric | Embeddings | Dexrag Day 1 | Dexrag Day 30 |
|---|---|---|---|
| LIMIT Recall | 18% | 67% | 89% |
| Legal Docs | Baseline | +15% | +47% |
| Tech Docs | Baseline | +18% | +52% |
| Cold Start | Slow | Instant | Instant |
Why It's Better
Breaks Mathematical Limits
Monte Carlo search has no fixed dimensionality, so it can scale to billions of documents without degrading performance.
Learns Your Domain
Instead of being stuck with Reddit training data, Dexrag adapts to:
- Your industry terminology
- Your document structure
- Your users' needs
Explainable Results
Every search returns an exploration tree showing:
- Which paths were explored
- Why documents were selected
- How relevance was determined
No more black box similarity scores.
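An exploration tree is straightforward to render for inspection. The sketch below prints one as indented text with per-node relevance scores; the node labels and `(label, score, children)` shape are illustrative, not Dexrag's actual output format.

```python
def render_tree(node, depth=0):
    """Render an exploration tree as indented lines, best-scoring paths first.

    Each node is a (label, score, children) tuple; the output shows which
    paths were explored and the relevance assigned at each step.
    """
    label, score, children = node
    lines = [f"{'  ' * depth}{label} (score={score:.2f})"]
    for child in sorted(children, key=lambda c: -c[1]):
        lines.extend(render_tree(child, depth + 1))
    return lines

tree = ("query: refund policy", 1.0, [
    ("Section: Billing", 0.40, []),
    ("Section: Returns", 0.90, [("Para: 30-day window", 0.95, [])]),
])
explanation = render_tree(tree)
```

Because every result carries its path and score, a user can see that `Returns` outranked `Billing` and exactly which paragraph produced the answer.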
Technical Details
Search Algorithm
```
function monteCarloSearch(query, documents):
    tree = initializeTree(documents)
    for iteration in 1..maxIterations:
        node = selectPromisingNode(tree)   # UCB-style selection
        node = expandNode(node, query)     # add unexplored children
        value = evaluateNode(node)         # score relevance against the query
        backpropagate(tree, node, value)   # update statistics up the tree
    return bestPath(tree)
```
Adaptive Learning
The system tracks:
- Click-through rate per document
- Time spent on each result
- Follow-up queries
- Explicit relevance feedback
This data updates the selection probabilities for future searches.
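One simple way to fold feedback into selection probabilities is a smoothed click-through prior per document. The sketch below uses Laplace smoothing so unseen documents start neutral; this is a plausible mechanism consistent with the description above, not Dexrag's actual update rule.

```python
class FeedbackPrior:
    """Laplace-smoothed click-through prior used to bias future node selection."""

    def __init__(self):
        self.shown = {}    # impressions per document
        self.clicked = {}  # clicks (relevance signals) per document

    def record(self, doc_id, clicked):
        self.shown[doc_id] = self.shown.get(doc_id, 0) + 1
        if clicked:
            self.clicked[doc_id] = self.clicked.get(doc_id, 0) + 1

    def prior(self, doc_id):
        # (clicks + 1) / (impressions + 2): unseen documents start at 0.5,
        # and the estimate sharpens as evidence accumulates
        return (self.clicked.get(doc_id, 0) + 1) / (self.shown.get(doc_id, 0) + 2)
```

Multiplying this prior into the node-selection score makes frequently-relevant documents cheaper to reach on later queries, which is what produces the day-1 versus day-30 gap in the table above.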
When to Use Monte Carlo RAG
Monte Carlo RAG excels at:
- Large document sets (>100K docs)
- Domain-specific content (legal, medical, technical)
- Structured documents (contracts, reports, papers)
- Repeated queries (customer support, internal wikis)
Traditional embeddings may still work for:
- Very small document sets (<100 docs)
- Generic content (news, blogs)
- One-time searches
But even then, Dexrag performs as well as or better than embeddings from query #1.