Abstract: The boom in AI has led to an exponential rise in compute demand, with data scientists taking to the cloud to experiment and scale their model development. The desire for cost efficiency, higher performance, and stronger security has driven the search for alternative solutions to meet differing needs, including bringing intensive AI workloads back to local data centers.
For organizations that have elected to build on-premises, what are the building blocks for the compute, storage, and networking components? What considerations come into play when building a homegrown software stack? What MLOps platforms exist, and what makes a good solution for multi-tenant teams of data scientists?
Join this session to learn how NVIDIA builds AI infrastructure in-house and what we advise for organizations looking to replicate our experience.
Bio: Michael Balint is a Senior Product Manager at NVIDIA focused on cluster management, orchestration, and scheduling of NVIDIA DGX servers. Prior to joining NVIDIA, Michael was a White House Presidential Innovation Fellow, where he brought his technical expertise to projects such as VP Biden's Cancer Moonshot and Code.gov. A graduate of both Cornell and Johns Hopkins University, he has had the good fortune of applying software engineering and data science to many interesting problems throughout his career, including optimizing air traffic flows for the FAA, NLP-based summarization of makeup reviews, and repurposing geospatial anomaly detection to discover abnormal skin lesions.