Tensor Processing Unit
Tensor Processing Unit 3.0
Designer	Google
Introduced	2015^[1]
Type	Neural network, machine learning

Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google for neural network machine learning, using Google's own TensorFlow software.^[2] Google began using TPUs internally in 2015, and in 2018 made them available for third-party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.

Comparison to CPUs and GPUs

Compared to a graphics processing unit, TPUs are designed for a high volume of low precision computation (e.g. as little as 8-bit precision)^[3] with more input/output operations per joule, without hardware for rasterisation/texture mapping.^[4] The TPU ASICs are mounted in a heatsink assembly, which can fit in a hard drive slot within a data center rack, according to Norman Jouppi.^[5]

Different types of processors are suited for different types of machine learning models. TPUs are well suited for CNNs, while GPUs have benefits for some fully-connected neural networks, and CPUs can have advantages for RNNs.^[6]

History

The tensor processing unit was announced in May 2016 at Google I/O, when the company said that the TPU had already been used inside its data centers for over a year.^[5]^[4] The chip has been specifically designed for Google's TensorFlow framework, a symbolic math library which is used for machine learning applications such as neural networks.^[7] However, as of 2017 Google still used CPUs and GPUs for other types of machine learning.^[5] Other vendors are also producing AI accelerator designs, aimed at the embedded and robotics markets.

Google's TPUs are proprietary. Some models are commercially available, and on February 12, 2018, The New York Times reported that Google "would allow other companies to buy access to those chips through its cloud-computing service."^[8] Google has said that TPUs were used in the AlphaGo versus Lee Sedol series of man-machine Go games,^[4] as well as in the AlphaZero system, which produced Chess, Shogi and Go playing programs from the game rules alone and went on to beat the leading programs in those games.^[9] Google has also used TPUs for Google Street View text processing and was able to find all the text in the Street View database in less than five days. In Google Photos, an individual TPU can process over 100 million photos a day.^[5] TPUs are also used in RankBrain, which Google uses to provide search results.^[10]

Google provides third parties access to TPUs through its Cloud TPU service as part of the Google Cloud Platform^[11] and through its notebook-based services Kaggle and Colaboratory.^[12]^[13]

Products

Tensor Processing Unit products^[14]^[15]^[16]
	TPUv1	TPUv2	TPUv3	TPUv4^[15]^[17]	TPUv5e^[18]	TPUv5p^[19] ^[20]	Trillium^[21]
Date introduced	2015	2017	2018	2021	2023	2023	2024
Process node	28 nm	16 nm	16 nm	7 nm	Unstated	Unstated
Die size (mm²)	331	< 625	< 700	< 400	300-350	Unstated
On-chip memory (MiB)	28	32	32	32	48	112
Clock speed (MHz)	700	700	940	1050	Unstated	1750
Memory	8 GiB DDR3	16 GiB HBM	32 GiB HBM	32 GiB HBM	16 GB HBM	95 GB HBM	32 GB ?
Memory bandwidth	34 GB/s	600 GB/s	900 GB/s	1200 GB/s	819 GB/s	2765 GB/s	~1.6 TB/s ?
TDP (W)	75	280	220	170	Not Listed	Not Listed
TOPS (Tera Operations Per Second)	23	45	123	275	197 (bf16) 393 (int8)	459 (bf16) 918 (int8)
TOPS/W	0.31	0.16	0.56	1.62	Not Listed	Not Listed
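
The TOPS/W row can be reproduced by dividing each generation's TOPS figure by its TDP. The following is a minimal Python sketch of that arithmetic, using only the values listed in the table for the generations where both figures are given; the names `SPECS` and `tops_per_watt` are illustrative and do not come from any Google tool.

```python
# Peak TOPS and TDP (W) as listed in the table above.
# Only generations with both figures listed are included.
SPECS = {
    "TPUv1": (23, 75),
    "TPUv2": (45, 280),
    "TPUv3": (123, 220),
    "TPUv4": (275, 170),
}


def tops_per_watt(tops, tdp_watts):
    """Return efficiency in TOPS/W, rounded to two decimals as in the table."""
    return round(tops / tdp_watts, 2)


EFFICIENCY = {name: tops_per_watt(*spec) for name, spec in SPECS.items()}

for name, value in EFFICIENCY.items():
    print(f"{name}: {value} TOPS/W")
```

The rounded results match the TOPS/W row for TPUv1 through TPUv4.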

First generation TPU

The first-generation TPU is an 8-bit matrix multiplication engine, driven with CISC instructions by the host processor across a PCIe 3.0 bus. It is manufactured on a 28 nm process with a die size ≤ 331 mm². The clock speed is 700 MHz and it has a thermal design power of 75 W, with a measured power draw of 28–40 W. It has 28 MiB of on-chip memory, and 4 MiB of 32-bit accumulators taking the results of a 256×256 systolic array of 8-bit multipliers.^[22] Within the TPU package is 8 GiB of dual-channel 2133 MHz DDR3 SDRAM offering 34 GB/s of bandwidth.^[16] Instructions transfer data to or from the host, perform matrix multiplications or convolutions, and apply activation functions.^[22]
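
The combination of 8-bit multipliers and 32-bit accumulators can be illustrated in software. The sketch below is not a model of the systolic array itself or of the TPU instruction set; it only shows the arithmetic contract, assuming signed 8-bit operands (an assumption made for this example) and two's-complement 32-bit accumulation. The function names are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def wrap_int32(value):
    """Wrap an arbitrary Python int to signed 32-bit two's complement."""
    return (value + 2**31) % 2**32 - 2**31


def int8_matmul(a, b):
    """Multiply int8 matrices a (m x k) and b (k x n) with 32-bit accumulation."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions do not match")
    for row in a + b:
        for x in row:
            if not INT8_MIN <= x <= INT8_MAX:
                raise ValueError(f"{x} does not fit in a signed 8-bit integer")
    cols = len(b[0])
    result = []
    for row in a:
        out_row = []
        for j in range(cols):
            acc = 0  # stands in for one 32-bit accumulator
            for k, x in enumerate(row):
                acc = wrap_int32(acc + x * b[k][j])
            out_row.append(acc)
        result.append(out_row)
    return result


# 4 MiB of 32-bit (4-byte) accumulators, grouped 256 wide to match the array.
ACCUMULATOR_COUNT = 4 * 2**20 // 4
ACCUMULATOR_ROWS = ACCUMULATOR_COUNT // 256

# Largest magnitude a single 256-term dot product of int8 values can reach.
WORST_CASE_SUM = 256 * (-128) * (-128)

print(int8_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(ACCUMULATOR_ROWS, WORST_CASE_SUM)
```

A 256-term dot product of 8-bit values reaches at most 2²² in magnitude, far below the 2³¹ limit of a 32-bit accumulator, which leaves headroom for summing partial results across many passes through the array.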

Second generation TPU

The second-generation TPU was announced in May 2017.^[23] Google stated that the first-generation TPU design was limited by memory bandwidth, and that using 16 GB of High Bandwidth Memory in the second-generation design increased bandwidth to 600 GB/s and performance to 45 teraFLOPS.^[16] The TPUs are then arranged into four-chip modules with a performance of 180 teraFLOPS.^[23] Then 64 of these modules are assembled into 256-chip pods with 11.5 petaFLOPS of performance.^[23] Notably, while the first-generation TPUs were limited to integers, the second-generation TPUs can also calculate in floating point, introducing the bfloat16 format invented by Google Brain. This makes the second-generation TPUs useful for both training and inference of machine learning models. Google has stated these second-generation TPUs will be available on the Google Compute Engine for use in TensorFlow applications.^[24]
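
The bfloat16 format keeps the sign bit and the 8 exponent bits of an IEEE 754 single-precision float and shortens the fraction to 7 bits, so a bfloat16 value corresponds to the upper 16 bits of a float32. The following Python sketch illustrates that relationship. It converts by truncation (rounding toward zero), which is a simplification chosen for this example; it does not claim to reproduce the rounding behaviour of TPU hardware. The function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x, truncating the low fraction bits."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits32 >> 16  # keep sign, 8 exponent bits, top 7 fraction bits


def bfloat16_bits_to_float(bits16):
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]


for value in (1.0, 3.14159, 1e-6, 1e6):
    bits = float32_to_bfloat16_bits(value)
    print(f"{value!r}: 0x{bits:04X} -> {bfloat16_bits_to_float(bits)!r}")
```

Because the exponent field is unchanged, bfloat16 covers the same range of magnitudes as float32, at the cost of keeping only about two to three significant decimal digits.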

Third generation TPU

The third-generation TPU was announced on May 8, 2018.^[25] Google announced that the processors themselves were twice as powerful as the second-generation TPUs, and would be deployed in pods with four times as many chips as the preceding generation.^[26]^[27] This results in an 8-fold increase in performance per pod (with up to 1,024 chips per pod) compared to the second-generation TPU deployment.
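
The pod-level figures quoted for the second and third generations follow from simple multiplication. The Python sketch below works through that arithmetic using only numbers stated in the text; the constant names are illustrative.

```python
# Second generation: figures stated in the text.
V2_CHIP_TFLOPS = 45
V2_CHIPS_PER_MODULE = 4
V2_MODULES_PER_POD = 64

V2_MODULE_TFLOPS = V2_CHIP_TFLOPS * V2_CHIPS_PER_MODULE       # 180 teraFLOPS
V2_CHIPS_PER_POD = V2_CHIPS_PER_MODULE * V2_MODULES_PER_POD   # 256 chips
V2_POD_PFLOPS = V2_CHIP_TFLOPS * V2_CHIPS_PER_POD / 1000      # 11.52, quoted as 11.5

# Third generation: twice the per-chip performance, four times the chips per pod.
V3_PER_CHIP_FACTOR = 2
V3_CHIP_COUNT_FACTOR = 4

V3_CHIPS_PER_POD = V2_CHIPS_PER_POD * V3_CHIP_COUNT_FACTOR    # 1,024 chips
V3_POD_FACTOR = V3_PER_CHIP_FACTOR * V3_CHIP_COUNT_FACTOR     # 8-fold per pod

print(V2_MODULE_TFLOPS, V2_CHIPS_PER_POD, V2_POD_PFLOPS)
print(V3_CHIPS_PER_POD, V3_POD_FACTOR)
```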

Fourth generation TPU

On May 18, 2021, Google CEO Sundar Pichai spoke about TPU v4 Tensor Processing Units during his keynote at the Google I/O virtual conference. TPU v4 improved performance by more than 2x over TPU v3 chips. Pichai said "A single v4 pod contains 4,096 v4 chips, and each pod has 10x the interconnect bandwidth per chip at scale, compared to any other networking technology."^[28] An April 2023 paper by Google claims TPU v4 is 5-87% faster than A100 at machine learning benchmarks.^[29]

There is also an "inference" version, called v4i,^[30] that does not require liquid cooling.^[31]

Fifth generation TPU

In 2021, Google revealed that the physical layout of TPU v5 was being produced by a novel application of deep reinforcement learning.^[32] Google claims that TPU v5 is nearly twice as fast as TPU v4,^[33] and based on that and the relative performance of TPU v4 over A100, some speculate that TPU v5 is as fast as or faster than H100.^[34]

Similar to the v4i being a lighter-weight version of the v4, the fifth generation has a "cost-efficient"^[35] version called v5e.^[18] In December 2023, Google announced TPU v5p, which is claimed to be competitive with the H100.^[36]

Sixth generation TPU

In May 2024, at the Google I/O conference, Google announced TPU v6 (Trillium), to become available later in 2024. Google claimed a 4.7-fold performance increase relative to TPU v5e,^[37] achieved through larger matrix multiply units and an increased clock speed. High Bandwidth Memory (HBM) capacity and bandwidth are also doubled. A pod can contain up to 256 Trillium units.^[38]

Edge TPU

In July 2018, Google announced the Edge TPU. The Edge TPU is Google's purpose-built ASIC chip designed to run machine learning (ML) models for edge computing, meaning it is much smaller and consumes far less power compared to the TPUs hosted in Google datacenters (also known as Cloud TPUs^[39]). In January 2019, Google made the Edge TPU available to developers with a line of products under the Coral brand. The Edge TPU is capable of 4 trillion operations per second with 2 W of electrical power.^[40]

The product offerings include a single-board computer (SBC), a system on module (SoM), a USB accessory, a mini PCI-e card, and an M.2 card. The SBC Coral Dev Board and Coral SoM both run Mendel Linux OS – a derivative of Debian.^[41]^[42] The USB, PCI-e, and M.2 products function as add-ons to existing computer systems, and support Debian-based Linux systems on x86-64 and ARM64 hosts (including Raspberry Pi).

The machine learning runtime used to execute models on the Edge TPU is based on TensorFlow Lite.^[43] The Edge TPU is only capable of accelerating forward-pass operations, which means it is primarily useful for performing inference (although it is possible to perform lightweight transfer learning on the Edge TPU^[44]). The Edge TPU also supports only 8-bit math, so for a network to be compatible with the Edge TPU, it must either be trained using the TensorFlow quantization-aware training technique or, since late 2019, be converted using post-training quantization.
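
To illustrate what restricting a network to 8-bit math involves, the sketch below applies a generic affine (scale and zero-point) quantization to a list of floating-point values. It is written in plain Python and is not the TensorFlow Lite converter or its API; the function names, the signed 8-bit range and the choice of a single scale per tensor are assumptions made for this example.

```python
QMIN, QMAX = -128, 127  # signed 8-bit range assumed for this sketch


def choose_params(min_val, max_val):
    """Pick a scale and zero point so [min_val, max_val] maps onto [QMIN, QMAX]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
    scale = (max_val - min_val) / (QMAX - QMIN)
    if scale == 0.0:
        scale = 1.0  # all values are zero; any positive scale works
    zero_point = round(QMIN - min_val / scale)
    zero_point = max(QMIN, min(QMAX, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to 8-bit integers, clamping to the representable range."""
    return [max(QMIN, min(QMAX, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map 8-bit integers back to approximate floats."""
    return [scale * (q - zero_point) for q in qvalues]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = choose_params(min(weights), max(weights))
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)

for w, qi, r in zip(weights, q, restored):
    print(f"{w:+.4f} -> {qi:4d} -> {r:+.4f}")
```

Each restored value differs from the original by at most about one quantization step (`scale`). Quantization-aware training lets a network adapt to this rounding error during training, whereas post-training quantization applies a mapping of this kind to a model that has already been trained.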

On November 12, 2019, Asus announced a pair of single-board computers (SBCs) featuring the Edge TPU: the Asus Tinker Edge T and Tinker Edge R, designed for IoT and edge AI. The SBCs officially support Android and Debian operating systems.^[45]^[46] Asus has also demonstrated a mini PC called Asus PN60T featuring the Edge TPU.^[47]

On January 2, 2020, Google announced the Coral Accelerator Module and Coral Dev Board Mini, to be demonstrated at CES 2020 later the same month. The Coral Accelerator Module is a multi-chip module featuring the Edge TPU, PCIe and USB interfaces for easier integration. The Coral Dev Board Mini is a smaller SBC featuring the Coral Accelerator Module and MediaTek 8167s SoC.^[48]^[49]

Pixel Neural Core

Main article: Pixel Neural Core

On October 15, 2019, Google announced the Pixel 4 smartphone, which contains an Edge TPU called the Pixel Neural Core. Google describes it as "customized to meet the requirements of key camera features in Pixel 4", using a neural network search that sacrifices some accuracy in favor of minimizing latency and power use.^[50]

Google Tensor

Main article: Google Tensor

Google followed the Pixel Neural Core by integrating an Edge TPU into a custom system-on-chip named Google Tensor, which was released in 2021 with the Pixel 6 line of smartphones.^[51] The Google Tensor SoC demonstrated "extremely large performance advantages over the competition" in machine learning-focused benchmarks; although instantaneous power consumption also was relatively high, the improved performance meant less energy was consumed due to shorter periods requiring peak performance.^[52]

Lawsuit

In 2019, Singular Computing, founded in 2009 by Joseph Bates, a visiting professor at MIT,^[53] filed suit against Google alleging patent infringement in TPU chips.^[54] By 2020, Google had successfully lowered the number of claims the court would consider to just two: claim 53 of US 8407273 filed in 2012 and claim 7 of US 9218156 filed in 2013, both of which claim a dynamic range of 10⁻⁶ to 10⁶ for floating point numbers, which the standard float16 cannot do (without resorting to subnormal numbers) as it only has five bits for the exponent. In a 2023 court filing, Singular Computing specifically called out Google's use of bfloat16, as that exceeds the dynamic range of float16.^[55] Singular claims non-standard floating point formats were non-obvious in 2009, but Google retorts that the VFLOAT^[56] format, with a configurable number of exponent bits, existed as prior art in 2002.^[57] As of January 2024, subsequent lawsuits by Singular had brought the number of patents being litigated up to eight. Towards the end of the trial later that month, Google agreed to a settlement with undisclosed terms.^[58]^[59]
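
The dynamic-range point can be checked numerically. The Python sketch below computes the smallest normal and largest finite magnitudes for an IEEE 754-style binary format from its exponent and fraction widths (5 and 10 bits for float16, 8 and 7 bits for bfloat16), then tests whether the range 10⁻⁶ to 10⁶ fits within normal numbers. The function names are illustrative, and the sketch says nothing about the legal merits of either party's position.

```python
def normal_range(exponent_bits, fraction_bits):
    """Return (smallest normal, largest finite) magnitude for an IEEE-style format."""
    bias = 2 ** (exponent_bits - 1) - 1
    smallest_normal = 2.0 ** (1 - bias)
    largest_finite = (2.0 - 2.0 ** -fraction_bits) * 2.0 ** bias
    return smallest_normal, largest_finite


def covers(fmt_range, low, high):
    """True if [low, high] lies within the format's normal range."""
    smallest_normal, largest_finite = fmt_range
    return smallest_normal <= low and high <= largest_finite


FLOAT16 = normal_range(exponent_bits=5, fraction_bits=10)
BFLOAT16 = normal_range(exponent_bits=8, fraction_bits=7)

print("float16 :", FLOAT16, covers(FLOAT16, 1e-6, 1e6))
print("bfloat16:", BFLOAT16, covers(BFLOAT16, 1e-6, 1e6))
```

With five exponent bits, float16 normal numbers span roughly 6.1 × 10⁻⁵ to 65,504, so neither end of the 10⁻⁶ to 10⁶ range is covered; the eight exponent bits of bfloat16 cover it with a wide margin.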

References
