As enterprises continue to pilot generative AI projects, many are finding the cost of rolling out the tech in their own data centers can be prohibitive. Cloud providers like Amazon see a future in offering those AI instances over the web.
While generative artificial intelligence (genAI) models are expected to shrink to fit more defined needs and corporate budgets, a large number of service providers are still plotting their revenue course based on delivering AI cloud services.
In his annual letter to shareholders last week, Amazon CEO Andy Jassy said the company will focus less on building consumer-facing genAI applications and more on delivering AI models it can sell via web services to enterprise customers.
“Sometimes, people ask us, ‘What’s your next pillar? You have Marketplace, Prime, and AWS, what’s next?,’” Jassy wrote. “If you asked me today, I’d lead with generative AI. We’re optimistic that much of this world-changing AI will be built on top of AWS.”
Jassy’s expectations for revenue streams from AI services are not misplaced. Organizations plan to invest 10% to 15% more in AI initiatives over the next year and a half compared to calendar year 2022, according to an IDC survey of more than 2,000 IT and line-of-business decision makers.
Last fall, Amazon launched Bedrock, which delivers a variety of large language models (LLMs) via the AWS cloud that organizations can use to build genAI applications. The company also recently launched Amazon Q, a cloud-based, AI-powered coding assistant.
Amazon’s Bedrock offers AI “foundational models” from AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, and Stability AI, along with Amazon’s own LLM, all via a single API.
Amazon’s list of AI cloud clients now includes ADP, Delta Air Lines, GoDaddy, Intuit, Pfizer, and Siemens.
Currently, cloud computing leads all other methods for delivering genAI applications to enterprises; that’s because of the high cost of building out proprietary infrastructure. Amazon Web Services, Google, IBM, Microsoft and Oracle have invested billions of dollars in AI cloud offerings since OpenAI set off a firestorm of adoption with the launch of ChatGPT in November 2022.
“No one but the hyperscalers and mega large companies can afford to train and operate the very large LLMs and foundation models,” said Avivah Litan, Gartner distinguished vice president analyst. “The costs are in the hundreds of millions of dollars.”
By “large,” Litan was referring to models with hundreds of billions of parameters, as opposed to, say, those with fewer than 100 billion parameters. The costs to use LLMs supplied over cloud services, however, “are relatively manageable by enterprises and for now are also subsidized by the hyperscalers,” Litan said.
However, as enterprises continue to grow their pilots of genAI applications, the cost of cloud services can become a limiting factor. As a result, many organizations are looking to deploy smaller, on-premises LLMs aimed at handling specific tasks.
Smaller domain-specific models trained on more data will eventually challenge the dominance of today’s leading LLMs, including OpenAI’s GPT-4, Meta AI’s Llama 2, and Google’s PaLM 2. Smaller models would also be easier to train for specific use cases, according to Dan Diasio, Ernst & Young’s Global Artificial Intelligence Consulting Leader.
Through 2025, 30% of genAI projects will be abandoned after proof of concept (POC) due to poor data quality, inadequate risk controls, escalating costs, or unclear business value, according to Gartner Research. And by 2028, more than half of enterprises that have built their own LLMs from scratch will abandon their efforts due to costs, complexity and technical debt in their deployments.
Current vendor pricing models, which pass on the high cost of innovation and of developing, training and running LLMs, could also mean enterprises won’t see ROI for their AI projects, according to a recent report by Gartner. Even when pricing is subsidized by vendors hoping to gain early market share, it’s often not enough to produce a quick payback, Gartner said. Instead, organizations should take the long view on productivity gains and ROI from genAI.
Lee Sustar, a principal analyst at Forrester Research, said AI services via cloud will continue to grow as products such as Amazon Bedrock, Azure AI and Google Cloud Vertex AI lower the barrier to entry.
“Given the data gravity in the cloud, it is often the easiest place to start with training data. However, there will be a lot of use cases for smaller LLMs and AI inferencing at the edge. Also, cloud providers will continue to offer build-your-own AI platform options via Kubernetes platforms, which have been used by data scientists for years now,” Sustar said. “Some of these implementations will take place in the data center on platforms such as Red Hat OpenShift AI. Meanwhile, new GPU-oriented clouds like CoreWeave will offer a third option. This is early days, but managed AI services from cloud providers will remain central to the AI ecosystem.”
And while smaller LLMs are on the horizon, enterprises will still use major companies’ AI cloud services when they need access to very large LLMs, according to Litan. Even so, more organizations will eventually be using small LLMs that run on much smaller hardware, “even as small as a common laptop.
“And we will see the rise of services companies that support that configuration along with the privacy, security and risk management services that will be required,” Litan said. “There will be plenty of room for both models — the very large foundation model cloud service delivery and the small foundation model private cloud service delivery on your GPU/CPU of choice.”
One of Amazon’s earliest AI-cloud services was SageMaker, an integrated development environment (IDE) for developers and engineers to build, train, and deploy machine learning and AI models.
“Bedrock is off to a very strong start with tens of thousands of active customers after just a few months,” Jassy wrote. “Unlike the mass modernization of on-premises infrastructure to the cloud…, this genAI revolution will be built from the start on top of the cloud.”