Retrieval-Augmented Generation (RAG) improves large language model (LLM) responses by retrieving relevant data from knowledge bases (often private, recent, or domain-specific) and using it to generate more accurate, grounded answers.
In this course, you’ll learn how to build RAG systems that connect LLMs to external data sources. You’ll explore core components like retrievers, vector databases, and language models, and apply key techniques at both the component and system level. Through hands-on work with real production tools, you’ll gain the skills to design, refine, and evaluate reliable RAG pipelines, and adapt to new methods as the field advances.

Across five modules, you’ll complete hands-on programming assignments that guide you through building each core part of a RAG system, from simple prototypes to production-ready components.

Through hands-on labs, you’ll:

- Build your first RAG system by writing retrieval and prompt augmentation functions and passing structured input into an LLM.
- Implement and compare retrieval methods like semantic search, BM25, and Reciprocal Rank Fusion to see how each impacts LLM responses.
- Scale your RAG system using Weaviate and a real news dataset: chunking, indexing, and retrieving documents with a vector database.
- Develop a domain-specific chatbot for a fictional clothing store that answers FAQs and provides product suggestions based on a custom dataset.
- Improve chatbot reliability by handling real-world challenges like dynamic pricing and logging user interactions for monitoring and debugging.

You’ll apply your skills using real-world data from domains like media, healthcare, and e-commerce. By the end of the course, you’ll combine everything you’ve learned to implement a fully functional, more advanced RAG system tailored to your project’s needs.
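
To give a flavor of one technique named in the labs, here is a minimal sketch of Reciprocal Rank Fusion, which merges ranked result lists (for example, one from semantic search and one from BM25) by summing 1 / (k + rank) for each document. It is not the course's own code: the function name, the document IDs, and the choice of k = 60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and documents are ordered by total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two retrievers over the same collection.
semantic_hits = ["doc_b", "doc_a", "doc_c"]
keyword_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever is still kept, just with a lower fused score.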