How to Make LLMs Fit into Commodity Hardware Again: A Practical Guide


LLMs like the ones behind ChatGPT are all the rage. Using them on their own or as the key part of a RAG (Retrieval-Augmented Generation) system stretches the limits of what is possible in software development today.

Unfortunately, these models typically run in the cloud, either because vendors do not want to share them or because the hardware they require is not something you can buy in large numbers. There are, however, good reasons to run an LLM on machines you manage yourself:
- Cost of operation
- Privacy / data protection
- Latency
- Full control of availability and scaling

In this hands-on workshop we will show different approaches to making powerful LLMs fit onto affordable GPUs (like a T4) or - in special cases - even run on a CPU. We will round this off by showing you how to evaluate and compare the performance of these small LLMs.
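To get a feel for what "fitting onto a T4" means, here is a rough back-of-the-envelope sketch. It is not taken from the workshop material. It assumes that the memory for the weights is simply the parameter count times the bits per parameter, and that storing weights at lower precision is one way to shrink a model. The function names, the 2 GB headroom, and the example model sizes are illustrative assumptions; the only hard fact used is that a T4 has 16 GB of GPU memory.

```python
# Rough estimate of the memory needed for model weights only.
# Assumption: activations, KV cache, and framework overhead are not modeled;
# they are covered by a flat, hypothetical headroom value instead.

T4_MEMORY_GB = 16.0  # an NVIDIA T4 has 16 GB of GPU memory


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def fits_on_t4(n_params: float, bits_per_param: int, headroom_gb: float = 2.0) -> bool:
    """True if weights plus an assumed headroom fit into T4 memory."""
    return weight_memory_gb(n_params, bits_per_param) + headroom_gb <= T4_MEMORY_GB


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only for illustration.
    for n_params in (7e9, 13e9):
        for bits in (32, 16, 8, 4):
            size = weight_memory_gb(n_params, bits)
            verdict = "fits" if fits_on_t4(n_params, bits) else "does not fit"
            print(f"{n_params / 1e9:.0f}B params at {bits}-bit: {size:.1f} GB, {verdict}")
```

Treat the numbers as a first sanity check only: real memory use depends on context length, batch size, and the inference framework.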

We provide all examples as notebooks on Google Colab for you to follow along, so all you need is a laptop and a browser.


Christian lives in Zurich, Switzerland, and works as a consultant focusing on real-world machine learning applications. He earned his PhD in mathematics from ETH Zurich and completed a postdoctoral fellowship at the International Computer Science Institute in Berkeley. Christian has been developing and architecting IT solutions for the last 30 years. Currently he’s applying artificial intelligence to Geberit’s planning software ProPlanner.

Open Data Science



