Key Highlights
- Alphabet introduces its eighth-generation tensor processing units with distinct chips for training (TPU 8t) and inference (TPU 8i)
- The inference-focused TPU 8i achieves 80% improvement in performance-per-dollar compared to the prior Ironwood generation
- Broadcom collaborated on development, while Google DeepMind contributed to the chip design process
- The TPU 8t training processor supports configurations up to 9,600 chips with double the interconnect bandwidth of Ironwood
- Google Cloud will make both processors accessible to customers in late 2025
Alphabet revealed a pair of custom-built AI processors on Wednesday, representing the first time the company has separated its tensor processing unit architecture into dedicated training and inference silicon.
The TPU 8t is engineered for AI model training workloads, whereas the TPU 8i focuses exclusively on inference — running trained models in real-world applications. Broadcom served as co-development partner, extending a collaboration that began more than ten years ago.
This represents a departure from previous design philosophy. Earlier TPU iterations combined training and inference capabilities within unified silicon. According to Google, the emergence of agentic AI systems — which operate autonomously in iterative cycles with minimal human oversight — necessitates purpose-built hardware.
“With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving,” said Amin Vahdat, Google’s SVP and chief technologist for AI and infrastructure.
The inference-oriented TPU 8i packs 384 megabytes of SRAM per processor — three times the capacity of Ironwood. Google claims this added on-chip memory prevents latency from stacking up when many users query a model concurrently.
Substantial Performance Gains for Inference Workloads
Compared to Ironwood, the TPU 8i delivers 80% better performance per dollar. In practical terms, organizations can serve nearly double the query volume on the same budget.
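The budget arithmetic behind that claim is straightforward. The sketch below uses normalized, hypothetical numbers (only the 80% figure comes from the announcement; the budget and baseline are illustrative):

```python
# Back-of-the-envelope check: what an 80% performance-per-dollar
# gain means for query volume at a fixed budget.
ironwood_queries_per_dollar = 1.0        # normalized baseline (hypothetical)
tpu8i_queries_per_dollar = 1.0 * 1.80    # 80% improvement over baseline

budget = 100_000                         # same hypothetical spend on both

ironwood_volume = budget * ironwood_queries_per_dollar
tpu8i_volume = budget * tpu8i_queries_per_dollar

# 1.8x the queries for the same money -- close to, but not quite, double.
print(tpu8i_volume / ironwood_volume)    # 1.8
```

The ratio is 1.8x regardless of the budget chosen, which is why the "nearly double" framing holds at any spend level.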
The chip also demonstrates up to 2x improvement in performance-per-watt through integrated power management systems that dynamically adjust energy consumption based on workload requirements.
Both processors now utilize Google’s Axion CPU as the host platform for the first time, enabling optimization at the system architecture level beyond individual chip enhancements.
For training applications, the TPU 8t superpod architecture supports deployments of 9,600 processors with 2 petabytes of high-bandwidth memory. Interconnect bandwidth doubles that of Ironwood, and Google indicates this infrastructure can compress frontier model development cycles from several months to a few weeks.
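The superpod figures imply a per-chip memory budget, which can be derived from the two numbers quoted above. This assumes "2 petabytes" is pooled HBM across the full pod in decimal units; Google did not state a per-chip capacity:

```python
# Rough per-chip HBM implied by the superpod figures.
# Assumption: 2 PB = 2,000,000 GB of pooled HBM across 9,600 chips.
total_hbm_gb = 2_000_000
chips = 9_600

hbm_per_chip_gb = total_hbm_gb / chips
print(round(hbm_per_chip_gb, 1))   # 208.3 GB per chip
```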
The training processor delivers 2.8 times the computational performance of seventh-generation Ironwood at equivalent pricing.
Early Adopters and Market Positioning
Commercial adoption is expanding. Citadel Securities developed quantitative research platforms using Google’s TPU infrastructure. The entire network of 17 U.S. Department of Energy national laboratories operates AI co-scientist applications on the processors. Anthropic has pledged to consume multiple gigawatts of Google TPU capacity.
DA Davidson analysts projected in September that the TPU division, combined with Google DeepMind operations, could represent approximately $900 billion in valuation.
Google maintains a cloud-only distribution model for TPUs — the chips are unavailable for direct purchase and accessible solely through Google Cloud infrastructure. Nvidia continues supplying GPU hardware to Google, and the company confirmed it will be among the first cloud providers to offer Nvidia's forthcoming Vera Rubin platform later in 2025.
Google DeepMind participated directly in the chip design process, utilizing the processors for training Gemini language models and powering algorithms that drive Search and YouTube functionality.
Google confirmed both the TPU 8t and TPU 8i will enter general availability for cloud customers in the second half of this year.