Monte Carlo RAG
How Monte Carlo search breaks the embedding ceiling
Dexrag uses Monte Carlo Tree Search (MCTS), the same algorithm behind AlphaGo, to intelligently explore documents and find the most relevant content.
The Embedding Problem
Traditional RAG systems use embeddings to find similar documents. But Google DeepMind research revealed fundamental limitations:
Mathematical Ceiling
- Fixed dimensions: a 4096-dimensional embedding space can only represent ~250M unique documents
- Semantic collapse: Similar concepts get forced into nearby vectors
- Degrading recall: Performance drops as document count increases
- LIMIT benchmark: GPT-4 embeddings achieve <20% recall on complex queries
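The recall ceiling can be seen even in a toy simulation: when queries are noisy versions of fixed-dimension document vectors, nearest-neighbor recall falls as the corpus grows. This is an illustration of the degradation effect, not the LIMIT benchmark itself; the dimensions and noise level below are arbitrary choices.

```python
import numpy as np

def top1_recall(n_docs, dim, noise=0.3, n_queries=200, seed=0):
    """Fraction of noisy queries whose nearest document vector is the true source."""
    rng = np.random.default_rng(seed)
    docs = rng.normal(size=(n_docs, dim))
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit "embeddings"
    targets = rng.integers(0, n_docs, size=n_queries)
    queries = docs[targets] + noise * rng.normal(size=(n_queries, dim))
    hits = (docs @ queries.T).argmax(axis=0) == targets  # nearest-neighbor lookup
    return hits.mean()

small = top1_recall(100, 64)     # small corpus: high recall
large = top1_recall(10_000, 64)  # same dimensions, 100x more docs: recall drops
```

With the corpus 100x larger but the dimensionality unchanged, more distractor vectors crowd the space around each query, so the true document is recovered less often.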
Static Training
- Internet-trained: models are trained on generic web text such as Reddit and Wikipedia
- Domain blind: Can't adapt to legal, medical, or technical terminology
- No learning: Performance never improves with usage
How Monte Carlo RAG Works
1. BM25 Baseline
Start with a proven text search algorithm (BM25) that doesn't require embeddings.
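BM25 is a standard lexical ranking function, so a baseline needs nothing beyond term counts. The sketch below is a minimal Okapi BM25 scorer for illustration; it is not Dexrag's implementation, and the tokenizer (lowercase whitespace split) is deliberately simplistic.

```python
import math
from collections import Counter

class BM25:
    """Minimal Okapi BM25 scorer over a small in-memory corpus."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = [doc.lower().split() for doc in docs]
        self.n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.n
        self.k1, self.b = k1, b
        # document frequency of each term
        self.df = Counter(t for d in self.docs for t in set(d))

    def idf(self, term):
        # Smoothed IDF (Lucene-style), always non-negative
        df = self.df.get(term, 0)
        return math.log((self.n - df + 0.5) / (df + 0.5) + 1)

    def score(self, query, index):
        doc = self.docs[index]
        tf = Counter(doc)
        # length normalization: long documents are penalized via b
        norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
        return sum(
            self.idf(t) * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in query.lower().split()
        )

corpus = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "quarterly revenue grew five percent",
]
bm25 = BM25(corpus)
```

A document that never mentions a query term contributes zero for that term, which is why BM25 gives exact-match precision without any trained vectors.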
2. Monte Carlo Exploration
Instead of finding "similar" documents, MCTS:
- Explores document structure intelligently
- Evaluates multiple paths through the content
- Selects the most promising areas to investigate
- Expands understanding based on what's found
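The "select the most promising areas" step is typically driven by an upper-confidence rule. The sketch below uses UCB1, the classic MCTS selection formula; whether Dexrag uses this exact rule or a variant isn't specified in this document, so treat it as an illustration of the select step.

```python
import math

def ucb1(value_sum, visits, parent_visits, c=1.4):
    """UCB1 score: exploitation (mean value) plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited children first
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Pick the index of the child with the highest UCB1 score.

    children is a list of (value_sum, visits) statistics per child node.
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits),
    )
```

The exploration bonus shrinks as a node is visited more, so the search naturally shifts from broad exploration of the document tree to focused exploitation of the best paths.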
3. Document Structure Understanding
Dexrag analyzes:
- Headings and sections
- Tables and lists
- Citations and references
- Paragraph relationships
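Structure analysis means turning a flat document into a tree the search can descend. The sketch below builds a section tree from markdown-style `#` headings; Dexrag's actual parser handles more structure (tables, citations), and the heading detection here is deliberately naive.

```python
def section_tree(lines):
    """Nest markdown-style '#' headings into a section tree with body text."""
    root = {"title": "ROOT", "level": 0, "children": [], "body": []}
    stack = [root]  # stack[-1] is the section currently being filled
    for line in lines:
        if line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            node = {"title": line.lstrip("# ").strip(), "level": level,
                    "children": [], "body": []}
            # close sections at the same or deeper level
            while stack[-1]["level"] >= level:
                stack.pop()
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            stack[-1]["body"].append(line)
    return root

doc = """# Intro
some text
## Background
more text
# Methods
""".splitlines()
tree = section_tree(doc)
```

Each node in this tree is a candidate for the MCTS expand step: the search can descend into `Intro` or `Methods` first and only open subsections that look promising.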
4. Adaptive Learning
After each query, the system learns:
- Which documents users found relevant
- Common search patterns
- Domain-specific terminology
- User preferences
Performance Comparison
| Metric | Embeddings | Dexrag Day 1 | Dexrag Day 30 |
|---|---|---|---|
| LIMIT Recall | 18% | 67% | 89% |
| Legal Docs | Baseline | +15% | +47% |
| Tech Docs | Baseline | +18% | +52% |
| Cold Start | Slow | Instant | Instant |
Why It's Better
Breaks Mathematical Limits
Monte Carlo search has no fixed dimensionality, so it can scale to billions of documents without degrading performance.
Learns Your Domain
Instead of being stuck with Reddit training data, Dexrag adapts to:
- Your industry terminology
- Your document structure
- Your users' needs
Explainable Results
Every search returns an exploration tree showing:
- Which paths were explored
- Why documents were selected
- How relevance was determined
No more black box similarity scores.
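An exploration tree is straightforward to render for inspection. The sketch below prints one as indented text with per-node relevance scores; the node labels and `(label, score, children)` shape are illustrative, not Dexrag's actual output format.

```python
def render_tree(node, depth=0):
    """Render an exploration tree as indented lines, best-scoring paths first.

    Each node is a (label, score, children) tuple; the output shows which
    paths were explored and the relevance assigned at each step.
    """
    label, score, children = node
    lines = [f"{'  ' * depth}{label} (score={score:.2f})"]
    for child in sorted(children, key=lambda c: -c[1]):
        lines.extend(render_tree(child, depth + 1))
    return lines

tree = ("query: refund policy", 1.0, [
    ("Section: Billing", 0.40, []),
    ("Section: Returns", 0.90, [("Para: 30-day window", 0.95, [])]),
])
explanation = render_tree(tree)
```

Because every result carries its path and score, a user can see that `Returns` outranked `Billing` and exactly which paragraph produced the answer.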
Technical Details
Search Algorithm
```
function monteCarloSearch(query, documents):
    tree = initializeTree(documents)
    for iteration in 1..maxIterations:
        node = selectPromisingNode(tree)   # UCB-style selection
        node = expandNode(node, query)     # add unexplored children
        value = evaluateNode(node)         # score relevance against the query
        backpropagate(tree, node, value)   # update statistics up the tree
    return bestPath(tree)
```
Adaptive Learning
The system tracks:
- Click-through rate per document
- Time spent on each result
- Follow-up queries
- Explicit relevance feedback
This data updates the selection probabilities for future searches.
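One simple way to fold feedback into selection probabilities is a smoothed click-through prior per document. The sketch below uses Laplace smoothing so unseen documents start neutral; this is a plausible mechanism consistent with the description above, not Dexrag's actual update rule.

```python
class FeedbackPrior:
    """Laplace-smoothed click-through prior used to bias future node selection."""

    def __init__(self):
        self.shown = {}    # impressions per document
        self.clicked = {}  # clicks (relevance signals) per document

    def record(self, doc_id, clicked):
        self.shown[doc_id] = self.shown.get(doc_id, 0) + 1
        if clicked:
            self.clicked[doc_id] = self.clicked.get(doc_id, 0) + 1

    def prior(self, doc_id):
        # (clicks + 1) / (impressions + 2): unseen documents start at 0.5,
        # and the estimate sharpens as evidence accumulates
        return (self.clicked.get(doc_id, 0) + 1) / (self.shown.get(doc_id, 0) + 2)
```

Multiplying this prior into the node-selection score makes frequently-relevant documents cheaper to reach on later queries, which is what produces the day-1 versus day-30 gap in the table above.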
When to Use Monte Carlo RAG
Monte Carlo RAG excels at:
- Large document sets (>100K docs)
- Domain-specific content (legal, medical, technical)
- Structured documents (contracts, reports, papers)
- Repeated queries (customer support, internal wikis)
Traditional embeddings may still work for:
- Very small document sets (<100 docs)
- Generic content (news, blogs)
- One-time searches
But even then, Dexrag performs as well as or better than embeddings from query #1.