

Enterprises looking to push AI infrastructure from lab experiments to production-ready solutions face familiar bottlenecks: data access, system compatibility and performance at scale.
Red Hat Inc. is tackling this challenge head-on through collaborative efforts with hardware and chip partners to optimize artificial intelligence and memory technologies in real-world enterprise environments.
Red Hat’s Stephen Watt talks with theCUBE about the company’s open-source innovation.
“I think it all starts with large language models,” said Stephen Watt (pictured), vice president and distinguished engineer, Office of the CTO, at Red Hat. “I think we had this sort of era of predictive AI, and now with generative AI, I think there’s a whole lot of … new applications … and … interesting new use cases in three different areas: training, fine-tuning and inference. Last year, we announced the InstructLab, which was democratizing fine-tuning models. With our Neural Magic acquisition, we’ve got a lot more into inference, and that’s about serving models and creating value for applications in the enterprise.”
Watt spoke with theCUBE Research’s Rob Strechay and theCUBE’s host Rebecca Knight at Red Hat Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed Red Hat’s evolving AI infrastructure strategy and open-source innovation. (* Disclosure below.)
Red Hat is expanding its AI strategy by integrating open-source tools that enhance model context and task specificity. By combining retrieval-augmented generation with fine-tuning techniques and high-performance inference frameworks such as vLLM, the company aims to ground large language models in both data and real-world operations, according to Watt.
“I would say it’s all about context,” he said. “There’s retrieval augmentation, RAG, and then RAFT, which is applying RAG with fine-tuning. We’ve got an emerging story around that with the upstream Llama Stack project, where we’ve just done a lot of work upstream to enable all of that.”
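To make the pattern concrete, here is a minimal Python sketch of the RAG flow Watt describes: retrieve relevant context, then prepend it to the model's prompt. It is illustrative only, not Red Hat or Llama Stack code; the keyword-overlap retriever is a simplified stand-in for a real vector-similarity search, and the resulting prompt would be sent to an inference server rather than printed.

```python
# Minimal RAG sketch (illustrative, not Red Hat code).
# A naive keyword-overlap retriever stands in for vector-similarity search.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query and return the top_k."""
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Ground the model by prepending retrieved context to the question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Use the context below to answer.\nContext:\n{context}\nQuestion: {query}"

documents = [
    "InstructLab streamlines fine-tuning of large language models.",
    "vLLM serves models with high-throughput, memory-efficient inference.",
    "Llama Stack standardizes APIs for building generative AI applications.",
]

question = "How are models served for inference?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)  # In practice this grounded prompt is sent to an inference endpoint.
```

RAFT extends the same idea: instead of only injecting retrieved context at inference time, the model is also fine-tuned on examples that pair questions with retrieved documents.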
As AI scales across edge, data center and cloud environments, Red Hat is leaning into its distributed systems pedigree to tame inference sprawl. The company is prioritizing engineering strategies that make institutional knowledge more accessible to AI models and exploring new architectural patterns for seamless model integration, according to Watt.
“I think there’s two specific areas going back into context again,” he said. “One is vector databases. You take those [extract, transform, load] pipelines and you chunk all your documents back into those vector databases. Once you do that, you’re able to basically take what you institutionally know within your organization [and] add it into something that’s accessible from the large language model. The second thing that’s really interesting is the evolution of service-oriented architectures … to hook those into the large language model — I think those two things together are really exciting.”
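The sketch below illustrates the two patterns Watt outlines, under stated assumptions: a toy bag-of-words embedding and an in-memory list stand in for a real embedding model and vector database, and lookup_inventory() is a hypothetical enterprise service exposed to the model as a callable tool. It is a simplified illustration, not Red Hat tooling.

```python
# Illustrative sketch: (1) chunk documents into a "vector store",
# (2) expose an internal service the model can call as a tool.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# ETL-style step: chunk institutional documents and load them into the store.
chunks = [
    "Refunds are processed within five business days.",
    "Enterprise support contracts renew annually.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

def search(query: str, top_k: int = 1) -> list[str]:
    """Return the chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda cv: cosine(q, cv[1]), reverse=True)
    return [chunk for chunk, _ in ranked][:top_k]

# Service-oriented integration: a hypothetical internal API wired in as a tool.
def lookup_inventory(sku: str) -> dict:
    """Hypothetical enterprise service the model could call at inference time."""
    return {"sku": sku, "in_stock": 42}

print(search("How long do refunds take?"))
print(lookup_inventory("RH-1234"))
```

In a production setup, the store would be a dedicated vector database fed by ETL pipelines, and tools such as lookup_inventory() would be existing enterprise services surfaced to the model through a tool-calling interface.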
Open-source leadership remains foundational to Red Hat’s approach to AI infrastructure. As innovation accelerates across model development and AI deployment, the company continues to invest in upstream communities that support transparency, trust and long-term viability for applied AI solutions, according to Watt.
“The rate of innovation, new projects, the creative destruction that’s happening … our role is to basically create a steady pipeline that businesses can use to consume software where it’s stabilized and safe,” Watt added.
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of Red Hat Summit:
(* Disclosure: Red Hat Inc. sponsored this segment of theCUBE. Neither Red Hat nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)