Foreign media commentators say the focus of competition in generative AI is shifting from "whose model is stronger" to "who can run the model more cheaply and quickly." As enterprises deploy AI agents at scale, inference costs keep rising, and Google is trying to make this its main battleground.
Businesses begin recalculating bills
Google CEO Sundar Pichai recently stated that many companies had nearly exhausted their annual token budgets by May. He claimed that if Google Cloud's top customers shifted 80% of their AI workloads to a combination of Gemini 3.5 Flash and other cutting-edge models, they could save over $1 billion annually.
The article states that as AI agents handle longer processes and draw on more context, businesses are becoming increasingly sensitive to their bills. Uber's COO previously mentioned that the ever-increasing costs of AI are becoming harder to justify. Investor Chamath Palihapitiya also stated in March that his company, 8090, was reducing its use of Cursor due to excessive token spending.
Google bets on infrastructure
According to foreign media, as the performance gap between models narrows, competition will increasingly focus on infrastructure and inference efficiency. Google's advantage lies in its relatively complete technology stack, including chips, data centers, cloud platforms, models, and upper-layer applications.
William Blair analysts estimated this month that Google's internal AI computing costs are about 50% lower than its competitors', and in some cases as much as 75% lower, thanks to its self-developed TPU chips and direct procurement of components from manufacturers. In contrast, OpenAI has to pay infrastructure fees to cloud service providers such as Microsoft and Oracle, which in turn still bear the cost of GPUs.
Search-era tactics reappear
The article argues that Google's current strategy resembles its early expansion strategy in the search business. In 2006, Google's search market share exceeded 40%. At the time, it attracted users not only through the quality of its search results but also by continually emphasizing response speed and lower service costs.
Instead of relying solely on expensive servers, Google built a custom system using cheaper general-purpose hardware to improve speed and reduce costs. As search volume increased, the additional data in turn improved search results, creating a positive feedback loop.
Foreign media outlets have concluded that Google does not need to lead outright in every model capability. If it can widen its price and speed advantages while keeping its models "good enough," it will have the opportunity to expand its share of the enterprise AI market.









