Technically Speaking | Scaling AI inference with open source


Scaling AI inference with open source ft. Brian Stevens

Technically Speaking Team
Artificial intelligence

How is artificial intelligence truly being reimagined for the real world, moving beyond labs and into critical business environments? This episode of "Technically Speaking" explores the pivotal shift towards production-quality AI inference at scale and how open source is spearheading this transformation. Red Hat CTO Chris Wright is joined by Brian Stevens, Red Hat's SVP and AI CTO, who shares his unique journey and insights. They discuss the fascinating parallels between standardizing Linux decades ago and the current mission to create a common, efficient stack for AI inference. The conversation delves into the practicalities of making AI work, the evolution to GPU-focused inference with projects like vLLM, the complexities of model optimization, and why a collaborative open source ecosystem is crucial for realizing the full potential of enterprise AI.


About the show

Technically Speaking

What’s next for enterprise IT? No one has all the answers, but CTO Chris Wright knows the tech experts and industry leaders who are working on them.