
The RAG Flywheel

A Systematic Approach to Building Self-Improving AI Products

Practical frameworks for building RAG systems that improve through user feedback and measurement

Most RAG implementations struggle in production because teams focus on model selection and prompt engineering while overlooking the fundamentals: measurement, feedback, and systematic improvement.

This guide presents frameworks developed through real-world experience with companies like HubSpot, Zapier, and others to help you build RAG systems that become more valuable over time.

👉 If you want to learn more about RAG systems, check out our RAG Playbook course. Here is a 20% discount code for readers. 👈

RAG Playbook - 20% off for readers

Trusted by Leading Organizations

This methodology has been battle-tested by professionals at companies including HubSpot, Zapier, and other leading organizations.

The Problem: Why Most RAG Systems Fail

Real Patterns from the Field

After working with dozens of companies, we've seen the same failure pattern play out predictably:

Week 1-2: "Our RAG demo is amazing!"

Week 3-4: "Why are users getting irrelevant results?"

Week 5-6: "Let's try a different model..."

Week 7-8: "Maybe we need better prompts..."

Week 9+: "Our users have stopped using it."

Sound familiar? You're not alone. The issue isn't your technology—it's your approach.

The Solution: The RAG Improvement Flywheel

Introduction: The Product Mindset Shift

The Foundation That Changes Everything

Stop thinking like an engineer. Start thinking like a product leader. Learn why treating RAG as a product rather than a project is the #1 predictor of success.

Key concepts: The improvement flywheel • Common failure patterns • Product thinking vs implementation thinking


Chapter 1: Starting the Data Flywheel

From Zero to Evaluation in Days, Not Months

The cold-start problem kills most RAG projects. Learn the synthetic data techniques that get you from zero to measurable improvement in days.

You'll build: Synthetic evaluation datasets • Precision/recall frameworks • Leading vs lagging metrics • Experiment velocity tracking

Case study: Legal tech company improved retrieval from 63% to 87% in 2 weeks using these techniques
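To make the evaluation idea concrete, here is a minimal recall@k sketch over a synthetic dataset. The eval pairs and the canned `retrieve` function are hypothetical stand-ins for your own question-generation and retrieval pipeline, not the book's specific implementation:

```python
# Minimal sketch of recall@k over a synthetic evaluation set.
# Each item pairs a generated question with the chunk it came from;
# retrieval "succeeds" if that source chunk appears in the top k.

eval_set = [
    {"question": "What is the notice period?", "source_chunk": "doc1#c3"},
    {"question": "Who signs the amendment?", "source_chunk": "doc2#c1"},
    {"question": "When does the lease renew?", "source_chunk": "doc1#c7"},
]

def retrieve(question: str, k: int = 5) -> list[str]:
    """Placeholder retriever returning chunk ids; swap in your real system."""
    canned = {
        "What is the notice period?": ["doc1#c3", "doc1#c4"],
        "Who signs the amendment?": ["doc2#c9", "doc2#c1"],
        "When does the lease renew?": ["doc3#c2", "doc1#c1"],
    }
    return canned[question][:k]

def recall_at_k(eval_set, k: int = 5) -> float:
    hits = sum(
        item["source_chunk"] in retrieve(item["question"], k)
        for item in eval_set
    )
    return hits / len(eval_set)

print(f"recall@5: {recall_at_k(eval_set):.2f}")  # 2 of 3 questions hit
```

Even a toy harness like this gives you a number to move, which is the point of the flywheel: every change to chunking, embeddings, or ranking gets scored against the same set.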


Chapter 2: From Evaluation to Enhancement

Fine-Tuning That Actually Moves Business Metrics

Stop guessing which model to use. Learn how to systematically improve retrieval through fine-tuning, re-ranking, and targeted enhancements.

You'll implement: Embedding fine-tuning pipelines • Re-ranker integration (12-20% improvement) • Hard negative mining • A/B testing frameworks

Case study: E-commerce company increased revenue by $50M through systematic improvements
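The re-rank step itself is simple to sketch: over-retrieve candidates, score each (query, passage) pair, keep the top few. `score_pair` below is a toy lexical-overlap stand-in so the example runs on its own; in practice you would call a real cross-encoder model here:

```python
# Sketch of re-ranker integration: over-retrieve, score pairs, keep top k.
# score_pair is a hypothetical stand-in for a cross-encoder that scores
# (query, passage) pairs; here it's simple token overlap so this runs alone.

def score_pair(query: str, passage: str) -> float:
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q)

def rerank(query: str, candidates: list[str], k: int = 2) -> list[str]:
    return sorted(candidates, key=lambda c: score_pair(query, c), reverse=True)[:k]

candidates = [
    "Refund policy: items may be returned within 30 days.",
    "Shipping times vary by region.",
    "Refunds for returned items are issued within 5 business days.",
]
print(rerank("when are refunds issued for returned items", candidates, k=2))
```

The structure is what matters: the first-stage retriever optimizes for recall over a large index, and the re-ranker spends more compute on precision over a small candidate set.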


Chapter 3: User Experience and Feedback

5x Your Feedback Collection with One Simple Change

The secret to improvement? Getting users to tell you what's wrong. Learn the UX patterns that transform silent users into active contributors.

You'll master: High-converting feedback copy • Citation UX for trust • Implicit signal collection • Enterprise Slack integrations

Case study: Changing "How did we do?" to "Did we answer your question?" increased feedback 5x


Chapter 4: Understanding Your Users

Segmentation Strategies That Reveal Hidden Opportunities

Not all queries are equal. Learn to identify high-value user segments and build targeted solutions that delight specific audiences.

You'll discover: Query pattern analysis • User segmentation techniques • Priority matrices • Resource allocation frameworks

Case study: SaaS company found 20% of queries drove 80% of value, focused efforts accordingly
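A first segmentation pass can be as simple as bucketing logged queries by coarse intent and sorting by volume. The buckets and queries below are made up; real segmentation usually starts by clustering query embeddings rather than keyword rules:

```python
# Toy query-segmentation pass: bucket queries by coarse intent, then rank
# segments by volume to see where effort should go. Buckets are hypothetical.

from collections import Counter

queries = [
    "reset my password", "password reset link expired",
    "export report to csv", "pricing for enterprise plan",
    "change my password", "export dashboard as pdf",
]

def bucket(q: str) -> str:
    if "password" in q:
        return "account access"
    if "export" in q:
        return "exports"
    return "other"

segments = Counter(bucket(q) for q in queries)
for name, count in segments.most_common():
    print(name, count)
```

Pairing each segment's volume with a quality score (from the Chapter 1 eval harness) is what surfaces the high-volume, low-quality segments worth prioritizing.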


Chapter 5: Building Specialized Capabilities

Build Purpose-Built Retrievers That Users Love

One-size-fits-all RAG is dead. Learn to build specialized retrievers for documents, code, images, and structured data.

You'll create: Document-specific retrievers • Multi-modal search • Table/chart handlers • Domain-specific solutions

Case study: Construction blueprint search improved from 27% to 85% recall with specialized approach


Chapter 6: Unified Product Architecture

Unified Systems That Route Intelligently

Tie it all together with routing architectures that seamlessly direct queries to specialized components while maintaining a simple user experience.

You'll architect: Query routing systems • Tool selection frameworks • Performance monitoring • Continuous improvement pipelines

Case study: Enterprise system handling millions of queries with 95%+ routing accuracy
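Structurally, a router is a classifier in front of a table of specialized handlers. Production systems typically use an LLM or a trained classifier for the routing decision; the keyword rules below just keep this sketch self-contained, and all names are illustrative:

```python
# Minimal routing sketch: classify a query, dispatch to one specialized
# retriever, fall back to a default. Handlers here are stubs.

def search_code(q: str) -> str:
    return f"[code index] {q}"

def search_tables(q: str) -> str:
    return f"[table index] {q}"

def search_docs(q: str) -> str:
    return f"[document index] {q}"

ROUTES = {
    "code": (("function", "class", "stack trace"), search_code),
    "tables": (("revenue", "quarterly", "chart"), search_tables),
}

def route(query: str) -> str:
    q = query.lower()
    for _name, (keywords, handler) in ROUTES.items():
        if any(kw in q for kw in keywords):
            return handler(query)
    return search_docs(query)  # default path

print(route("Where is the retry function defined?"))
print(route("Summarize quarterly revenue"))
print(route("What does the onboarding guide say?"))
```

Because each specialized retriever stays behind a single `route` entry point, the user experience remains one search box while the system grows new capabilities underneath.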


Conclusion: Product Principles for AI Applications

The Lessons That Survive Every Technology Shift

Models change. Principles endure. Take away the core insights that will guide your AI product development for years to come.

Learn from Industry Leaders: 20+ Expert Talks

Featured Lightning Lessons

Companies like Zapier, ChromaDB, LanceDB, Glean, and Sourcegraph share their battle-tested strategies

How Zapier 4x'd Their AI Feedback - Vitor (Staff Engineer, Zapier) reveals the one-line change that transformed their feedback collection

"Jason helped us set you on the right path... emphasis on looking at your data and building a metrics-based flywheel." - Vitor, Staff Software Engineer, Zapier

The 12% RAG Boost You're Missing - Ayush (LanceDB) shows why re-rankers are the "low-hanging fruit" everyone ignores

Why Cline Ditched RAG Entirely - Nik Pash explains why leading coding agents abandoned embeddings for direct exploration

The RAG Mistakes Killing Your AI - Skylar Payne exposes the anti-patterns that 90% of teams fall into

Stop Trusting MTEB Rankings - Kelly Hong reveals why public benchmarks fail in production

Explore all 20+ talks →

For Product Leaders, Engineers, and Data Scientists

What You'll Learn

For Product Leaders

  • How to establish metrics that align with business outcomes
  • Frameworks for prioritizing AI product improvements
  • Approaches to building product roadmaps for RAG applications
  • Methods for communicating AI improvements to stakeholders

For Engineers

  • Implementation patterns that facilitate rapid iteration
  • Architectural decisions that enable continuous improvement
  • Techniques for building modular, specialized capabilities
  • Approaches to technical debt management in AI systems

For Data Scientists

  • Methods for creating synthetic evaluation datasets
  • Techniques for segmenting and analyzing user queries
  • Frameworks for measuring retrieval effectiveness
  • Approaches to continuous learning from user interactions

About the Author

Jason Liu is a machine learning engineer with experience at Facebook and Stitch Fix, and has consulted for companies like HubSpot and Zapier on RAG implementations. His background includes computer vision, recommendation systems, and retrieval applications across various domains.
