Google Making Major Changes in AI Operations to Pull in Cash from Gemini

Over the last week, Google has made some under-the-radar changes, including appointing a new leader for AI development, which suggests the company is taking its AI operations in a new direction.

Google has also put a price on its Gemini API and cut off most free access to its APIs. The message is clear: the party is over for developers looking for AI freebies, and Google wants to make money from AI tools such as Gemini.

Until now, Google provided developers free access to the APIs for both its older and newer LLMs, an attempt to woo them into adopting its AI products.

Gemini is mainly known to consumers through Google’s own chatbot interface. However, many developers are building custom chatbots on top of Gemini: the APIs funnel user questions to the model and retrieve its answers, which are then delivered to users in the custom interface.
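As an illustration, here is a minimal sketch of such a custom chatbot built on Google’s google-generativeai Python SDK; the API key is a placeholder, and the input loop stands in for whatever interface a developer actually ships.

```python
# Minimal custom-chatbot sketch using Google's google-generativeai SDK
# (pip install google-generativeai). The API key is a placeholder.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # free key from AI Studio

model = genai.GenerativeModel("gemini-1.0-pro")
chat = model.start_chat(history=[])  # the SDK tracks the running conversation

while True:
    question = input("You: ")
    if not question:
        break
    reply = chat.send_message(question)  # funnel the prompt to Gemini
    print("Bot:", reply.text)            # deliver the answer in your own UI
```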

Specifically, Google is shutting off access to the API for PaLM (its older, pre-Gemini LLM) via AI Studio. It is also ending unfettered API access to the more advanced Gemini Pro, replacing it with a new paid plan that limits free usage. Basically, all API roads now end at Gemini 1.0 Pro, around which Google is consolidating its developer activities.

Other moves hint at big changes in Google’s AI plans.

This week, Google hired Logan Kilpatrick to lead the AI Studio and Gemini API operations. Kilpatrick comes from OpenAI, which he joined in December 2022 and where he led developer advocacy. At OpenAI, he “helped scale … dev platform to millions of developers,” according to his LinkedIn profile.

Kilpatrick will now do the same for Google and its Gemini AI platform.

Google has trailed OpenAI on chatbots and is also catching up on attracting developers to use its AI platforms.

Many companies have opted for OpenAI’s APIs because OpenAI was first to market. OpenAI now charges customers for its APIs and access to its large language models.

For example, OpenAI’s API is available as part of Windows PCs with Intel’s Core Ultra chips, where the API connects users to OpenAI to get answers to questions. Security companies are integrating ChatGPT into their software products, and companies like Glean are integrating OpenAI into their enterprise search offerings.

Google is attracting developers via its cloud service and AI Studio. For now, developers can get free API keys on Google’s website, which provide access to Google’s LLMs, alongside a chatbot-style interface for experimenting with them. Developers and users have until now enjoyed free access to Google’s LLMs, but that is also ending.

This week, Google delivered a double whammy that effectively shuts down free access to its APIs via AI Studio.

In an email to developers earlier this week, Google said it was shutting down developer access to its PaLM API (the pre-Gemini model) via AI Studio on August 15. Developers had free access to the PaLM API, which was used to build custom chatbots.

“You’ll be able to prompt, tune, and run inference using the Google AI PaLM API until August 15, 2024,” Google said in an email dated March 29 to developers.

PaLM API notice (Source: Google)

“We encourage testing prompts, tuning, inference, and other features with stable Gemini 1.0 Pro to avoid interruptions. You can use the same API key you used for the PaLM API to access Gemini models through Google AI SDKs,” Google said.
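In practice, the migration is small. Below is a sketch of what the change looks like in the Python SDK, assuming the legacy PaLM text endpoint and the newer GenerativeModel interface; the prompt is illustrative.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # the same key works for both APIs

# Before: the legacy PaLM text API, which shuts down on August 15, 2024
palm_reply = genai.generate_text(
    model="models/text-bison-001",  # PaLM 2 text model
    prompt="Summarize what an LLM is in one sentence.",
)
print(palm_reply.result)

# After: the same request against stable Gemini 1.0 Pro
gemini = genai.GenerativeModel("gemini-1.0-pro")
gemini_reply = gemini.generate_content("Summarize what an LLM is in one sentence.")
print(gemini_reply.text)
```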

Google also announced this week that it is restricting API access to its Gemini model in a bid to turn free users into paying customers. Free access to Gemini allowed many companies to offer chatbots based on the LLM at no cost, but Google’s changes will likely mean many of those chatbots shut down.

“Pay-as-you-go pricing for the Gemini API will be introduced,” Google said in an email to developers on Monday.

“If you use Gemini API from a project that has billing disabled, you can still use Gemini API free of charge, without the benefits available in our paid plan,” Google said in the email.

Gemini cost notice (Source: Google)

The free plan includes two requests per minute, 32,000 tokens per minute, and a maximum of 50 requests per day. One drawback: Google will use the prompts and responses to improve its products, which purportedly include its LLMs.
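Staying under the free tier’s ceiling of two requests per minute is the developer’s problem; a minimal client-side throttle might look like the sketch below (the limits are the ones quoted above, and Google still enforces them server-side).

```python
import time

# Client-side throttle for the free tier's 2-requests-per-minute limit.
# This is a sketch; Google enforces the real limits on the server.
MIN_INTERVAL = 60 / 2  # seconds between requests at 2 RPM
_last_call = 0.0

def throttled(fn, *args, **kwargs):
    """Call fn, sleeping first if needed to stay under 2 requests/minute."""
    global _last_call
    wait = MIN_INTERVAL - (time.time() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.time()
    return fn(*args, **kwargs)

# Usage (with the chat object from the earlier sketch):
# reply = throttled(chat.send_message, "Hello")
```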

The paid plan includes five requests per minute, 10 million tokens per minute, and 2,000 requests per day. The preview pricing is $7 per 1 million input tokens and $21 per 1 million output tokens. Google will not use prompts and responses from the paid plan to improve its products.
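At those preview rates, per-request costs are straightforward to estimate. Here is a small sketch using the figures above (a hypothetical helper, not part of any Google SDK).

```python
# Estimate Gemini API cost at the preview rates quoted above:
# $7 per 1M input tokens, $21 per 1M output tokens.
INPUT_USD_PER_M = 7.00
OUTPUT_USD_PER_M = 21.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_USD_PER_M

# A 2,000-token prompt that draws a 500-token answer:
print(f"${estimate_cost(2_000, 500):.4f}")  # $0.0245
```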

There is one exception: PaLM and Gemini will remain accessible to customers already paying for Vertex AI in Google Cloud. Developers on tighter budgets typically use AI Studio because they cannot afford Vertex AI.

Google’s APIs rely on hardware hosted in the company’s data centers to deliver answers to customers. Gemini runs on Google’s TPUs, which handle both training and inference.

Google has committed billions to build new data centers, most recently $1 billion for a data center in the UK.

The hundreds of billions being spent on data centers to run AI is a gamble, as the companies do not yet have proven AI revenue models. As use of LLMs grows, small revenue streams from offerings like APIs could help offset the cost of the hardware and data centers.

Other AI companies are spending billions to establish new data centers and are also looking to generate revenue from AI to foot the bills.

Bloomberg recently reported that Amazon was spending $150 billion over 15 years to establish new data centers.

OpenAI and Microsoft plan to spend $100 billion on a supercomputer called Stargate, according to The Information.

For customers unwilling to pay, Google has released the Gemma large language models, around which customers can build their own AI applications. Other open-source models, such as Mixtral, are also gaining in popularity. Meta CEO Mark Zuckerberg has hyped up an upcoming LLM called Llama 3 as a foundation that will drive down the cost of adopting AI.
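For example, Gemma’s weights can be downloaded from Hugging Face and run locally. Here is a minimal sketch assuming the transformers library and the google/gemma-2b-it checkpoint (the model is gated, so license acceptance and a Hugging Face login are required).

```python
# Minimal local-inference sketch for Gemma via Hugging Face transformers
# (pip install transformers torch). Assumes the Gemma license has been
# accepted on Hugging Face and you are logged in (huggingface-cli login).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

inputs = tokenizer("Explain what an API key is.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```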

Customers are leaning toward open-source LLMs as the cost of AI grows. These models can be downloaded and run on hardware tuned for the application, but most customers can’t afford that hardware, which in most cases means Nvidia GPUs. AI hardware is also not easily available off the shelf.


Author: Rayne Chancer