Elevator Pitch
I recently completed an interview process, including a coding exercise, for a popular job in tech: Generative AI Data Scientist. In this coding workshop, we will use that experience as the backdrop to step through how to build and evaluate a RAG pipeline in a fun and educational way!
Description
In today’s rapidly evolving tech landscape, Generative AI is at the forefront of innovation, creating exciting career opportunities for data scientists and engineers. This hands-on workshop offers a unique and practical approach to mastering one of the most in-demand skills in the field: building and evaluating a Retrieval Augmented Generation (RAG) pipeline.
Drawing from real-world experience, the workshop instructor will guide participants through the exact coding exercise used in their recent interview process for a Generative AI Data Scientist position at a leading tech company. This approach provides an unparalleled opportunity to:
- Gain practical, industry-relevant skills: Learn how to construct a RAG pipeline from scratch, mirroring the challenges faced in actual job interviews and real-world applications.
- Understand evaluation techniques: Master the art of assessing and fine-tuning your RAG system using ragas, a critical skill for both landing a job and excelling in the field.
- Peek behind the interview curtain: Get insider insights into what top companies are looking for in Generative AI talent, helping you prepare for your own career advancement.
- Engage in a fun, collaborative environment: Work alongside peers to solve problems, share ideas, and build your professional network.
- Bridge the gap between theory and practice: Apply your knowledge to a concrete, real-world scenario that goes beyond textbook examples.
Workshop Highlights
Throughout the workshop, participants will:
- Set up a RAG pipeline using popular open-source tools and libraries
- Learn best practices for data preparation and indexing
- Implement and fine-tune retrieval mechanisms
- Integrate retrieved information with large language models
- Develop robust evaluation metrics to assess pipeline performance
- Troubleshoot common issues and optimize system efficiency
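To give a flavor of the retrieve-then-generate flow in the list above, here is a minimal sketch in plain Python. It is not the workshop or interview code: it assumes a toy word-count similarity in place of real embeddings and a vector store such as ChromaDB, it stops at building the prompt rather than calling Gemini, and the function names and sample documents are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine(q_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(question, contexts):
    """Combine retrieved passages and the question into one LLM prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
    )


# Hypothetical sample corpus, for illustration only.
documents = [
    "ChromaDB is a vector database that stores embeddings.",
    "Ann Arbor is a city in Michigan.",
    "Retrieval Augmented Generation adds retrieved context to a prompt.",
]
question = "What does a vector database store?"
top = retrieve(question, documents, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline, the word-count vectors would be replaced by embeddings stored in a vector database, and the prompt would be sent to a large language model whose answer is then scored by an evaluation library.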
Whether you’re a seasoned data scientist looking to pivot into Generative AI, a student preparing for future job prospects, or simply an enthusiast eager to explore this cutting-edge technology, this workshop offers invaluable experience that sets it apart from traditional conference sessions.
Don’t miss this chance to enhance your skills, boost your resume, and gain a competitive edge in the job market. Join us for an engaging, practical, and potentially career-changing workshop that will equip you with the tools to build, evaluate, and showcase your own RAG pipeline – a key to unlocking exciting opportunities in the world of Generative AI.
Technologies we will use: Python, Jupyter Notebook (Colab), Google Cloud Platform, Gemini, LangChain, ChromaDB, and more!
And just in case you were wondering, yes I did get the job!
Notes
Ideally, this would span at least two session slots (i.e., 2 × 45 min = 90 min). Having run this type of workshop before, I find it hard for attendees to grasp the concepts within 45 minutes: much of the time goes to getting everyone set up, which leaves too little for diving into the code. If that isn’t possible, I can make it work in 45 minutes, but the more time we can give attendees, the better.
I have given similar code lab talks with GDG Ann Arbor, and we were able to get “credits” from Google that covered the costs of the workshop for participants. I am hoping to do the same here, since the technology we will be using (GCP, Gemini, Colab) is not free.