Google Discloses TPUv4 Details

Google’s TPUv4 excels at AI models employing embeddings owing to its sea of SparseCores that supplement its two main cores. Targeting inference, the TPUv4i has only a single larger core to reduce power.

Joseph Byrne

Flowers are a sign of spring, and Google’s TPUv4 disclosure is a sign that the company is replacing the chip with its successor. A recent paper, to be presented in June at the International Symposium on Computer Architecture (ISCA), sheds light on how the company developed the AI processor and the supercomputer based on it.

For the fourth generation, Google developed two 7 nm chips: the TPUv4i for inference and the TPUv4 for training. The big VLIW TensorCores these chips employ have more matrix units than those in the TPUv3, and the new AI accelerators add a large common memory. The main difference between the two chips is that the TPUv4 integrates two TensorCores, like the TPUv2 and TPUv3, whereas the TPUv4i implements only one to enable air cooling. In contrast to competing inference-focused accelerators that emphasize INT8 throughput, Google sees accuracy benefits from eschewing quantization and sticking with the same floating-point formats for inference as for training.

Google’s ISCA paper also discusses the TPU’s SparseCores. First included in the TPUv2, these engines have proven more useful as Google has used them to process recommendation and language models that employ embeddings (vectors representing items such as words in a text block or videos watched). They’re simpler than the main VLIW core, enabling a TPU to instantiate many of them in a sea of parallel cores. The company reports that SparseCores accelerate models that employ embeddings by 5x–7x but use only 5% of a chip’s area and power.
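
To make the embedding idea concrete, the sketch below shows the basic access pattern in plain Python: a lookup gathers one vector per item from a table, and a pooling step reduces those vectors to a single fixed-size vector. The table contents and the names `EMBEDDING_TABLE`, `embed`, and `sum_pool` are illustrative assumptions, not anything taken from Google's paper, and the code says nothing about how SparseCores implement the operation in hardware.

```python
# Hypothetical toy embedding table: one small vector per vocabulary item.
# Production tables hold far more rows and wider vectors.
EMBEDDING_TABLE = {
    "tpu": [0.5, 1.0, -0.5],
    "sparse": [1.5, 0.0, 0.5],
    "core": [-1.0, 2.0, 1.0],
}


def embed(tokens, table=EMBEDDING_TABLE):
    """Gather one vector per token (a data-dependent table read)."""
    return [table[token] for token in tokens]


def sum_pool(vectors):
    """Add the gathered vectors element-wise into one fixed-size vector."""
    return [sum(column) for column in zip(*vectors)]


pooled = sum_pool(embed(["tpu", "core"]))
print(pooled)  # [-0.5, 3.0, 0.5]
```

The work here is dominated by scattered reads followed by a small reduction rather than by dense matrix math, which is presumably why a simpler engine than the VLIW TensorCore can handle it.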

Google hasn’t yet disclosed information about the TPUv5. The recent paper alludes to it and hints that it’s a 4 nm chip first deployed in 2023, three years after the TPUv4. By contrast, the TPUv4, TPUv3, and TPUv2 each followed its predecessor by only one year. The long life of TPUv4 demonstrates it was flexible enough to adapt to the company’s evolving workload and left little room to improve performance, efficiency, and scalability.
