Productionizing AI and LLM Apps with Ray Serve


Designing an AI/ML application and selecting or training its models is really just the beginning. We need our AI-powered services to be resilient, efficient, scalable to demand, and adaptable to heterogeneous environments (for example, using GPUs or TPUs as effectively as possible). Moreover, when we build applications around online inference, we often need to integrate different services: multiple models, data sources, business logic, and more.

Ray Serve was built so that we can easily overcome all of those challenges.

In this class we'll learn to use Ray Serve to compose online inference applications meeting all of these requirements and more. We'll build services that integrate with each other while autoscaling individually, even supporting individual hardware and software requirements -- all using regular Python and often with just one new line of code.
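As a rough illustration of the "regular Python plus one new line" idea, here is a minimal sketch rather than material from the class itself. The `SentimentModel` and `Pipeline` classes and the `build_app` helper are hypothetical names, the word-counting "model" is a toy stand-in, and the Ray calls assume a recent Ray 2.x release installed with `pip install "ray[serve]"`.

```python
# Plain Python classes: nothing below needs Ray until build_app() is called.


class SentimentModel:
    """Hypothetical stand-in for a real model (a real one would load weights in __init__)."""

    POSITIVE = {"good", "great", "love"}

    def score(self, text: str) -> float:
        # Fraction of words that appear in the toy positive-word list.
        words = text.lower().split()
        return sum(w in self.POSITIVE for w in words) / max(len(words), 1)


class Pipeline:
    """Hypothetical ingress: business logic that calls the model deployment."""

    def __init__(self, model_handle):
        # Under Ray Serve this is a handle to the SentimentModel deployment.
        self.model = model_handle

    async def __call__(self, request) -> dict:
        # `request` is the incoming HTTP request (a Starlette Request under Serve).
        text = (await request.json())["text"]
        score = await self.model.score.remote(text)
        return {"text": text, "label": "positive" if score > 0 else "neutral"}


def build_app():
    """Wrap both classes as Ray Serve deployments and compose them."""
    from ray import serve  # imported lazily; assumes ray[serve] is installed

    # The "one new line": serve.deployment turns a class into a deployment.
    # Each deployment gets its own scaling and resource settings.
    model = serve.deployment(
        autoscaling_config={"min_replicas": 1, "max_replicas": 4},
        ray_actor_options={"num_cpus": 1},  # e.g. {"num_gpus": 1} on GPU nodes
    )(SentimentModel)
    ingress = serve.deployment(num_replicas=1)(Pipeline)

    # bind() wires the model deployment into the ingress constructor.
    return ingress.bind(model.bind())
```

With Ray installed, calling `serve.run(build_app())` would deploy both pieces; the replica counts and resource values above are illustrative only.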

Session Outline:

* Introduction to Ray and Ray Serve
* Running large language models (LLMs) with Ray Serve
* Building complex applications -- adding business logic and multiple models
* Ray Serve features and patterns for production deployment

Learning Objectives:

* Learn about Ray and Ray Serve
* Develop an understanding of the architectural components of Ray Serve
* Use Ray to serve machine learning models in production environments for online inference
* Combine multiple models to build complex logic, allowing for a more sophisticated machine learning pipeline

Background Knowledge:

Basic familiarity with Python and web / web-service applications


Adam Breindel is a member of the Anyscale training team and consults and teaches on large-scale data engineering and AI/machine learning. He has served as technical reviewer for numerous O'Reilly titles covering Ray, Apache Spark, and other topics. Adam's 20 years of engineering experience include numerous startups and large enterprises, with projects ranging from AI/ML systems and cluster management to web, mobile, and IoT apps. He holds a BA (Mathematics) from the University of Chicago and an MA (Classics) from Brown University. Adam's interests include hiking, literature, and complex adaptive systems.

Open Data Science




