Builder · Accelerationist · Tier 3

Noam Shazeer

VP of Engineering, Gemini, Google DeepMind (formerly Character.AI)

Co-author of the Transformer paper that underpins virtually every modern AI system, and one of the most consequential technical contributors in the history of artificial intelligence.

Credentials

Co-author of "Attention Is All You Need" (2017), the paper that introduced the Transformer architecture. 20+ years at Google working on core search, ads, and language models. Co-founded Character.AI in 2021, returned to Google in 2024 to lead Gemini development. Earlier work includes foundational contributions to Google's spell-checking and ad systems.

Why They Matter

Shazeer is arguably the most important AI engineer most business people have never heard of. The Transformer — his co-invention — is the architecture behind GPT, Claude, Gemini, and every major AI system today. When he left Google to start Character.AI and then returned in a $2.7 billion deal, it demonstrated just how valuable a single technical mind can be in this era. If you want to understand where AI is headed, watch what Shazeer builds next.

Positions

AI Timeline View

We're still in the early days. Language models will get dramatically better, and the best way to find out what they can do is to build them and let people use them.

Safety Stance

Accelerationist

Key Beliefs

The Transformer architecture is the foundation of modern AI, and scaling it further will continue to yield breakthroughs.

"Attention Is All You Need" (Vaswani, Shazeer et al.)

AI should be widely accessible — letting millions of people interact with AI characters and personalities is the best way to discover what's useful and what's not.

Character.AI founding mission and public statements

The best AI research happens when you ship products and learn from real users, not in isolated academic settings.

Career trajectory — Google search, Character.AI, back to Google Gemini

Mixture of Experts and sparse models are key to making AI more efficient — not everything needs to activate the entire network.

"Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer"

Controversial Take

Shazeer left Google because he felt the company was too cautious about deploying large language models, a frustration shared by several other prominent researchers. His decision to start Character.AI, which let users create and chat with open-ended AI personas, was seen by some as prioritising engagement over safety. The $2.7 billion licensing deal that brought him back to Google was unprecedented and controversial: effectively an acqui-hire without acquiring the company.

Track Record

How well have Noam Shazeer's predictions held up?

The Transformer architecture would replace recurrent and convolutional approaches for sequence modelling

Made: 2017

The Transformer is now the dominant architecture across NLP, computer vision, audio, protein folding, and more. One of the most impactful papers in computer science history.

Right

Conversational AI characters would become a mainstream consumer product

Made: 2022

Character.AI reached tens of millions of users, with remarkably high engagement (users spend more time per session than on most social media apps).

Right

Mixture of Experts would become essential for scaling models efficiently

Made: 2017

Mixtral and Gemini 1.5 use MoE architectures, and GPT-4 is widely reported to as well. Shazeer's 2017 MoE paper was ahead of its time.

Right

Key Quotes

Language models are going to change everything. I've believed this for 20 years and I still think we're early.

[SOURCE NEEDED]

The Transformer was really about simplifying things. We took out recurrence, took out convolutions, and just used attention. Simpler is better.

[SOURCE NEEDED]

I want to build AI that a billion people talk to every day. That's the mission.

Character.AI founding period, various interviews (2022)

Google had the technology to build ChatGPT years before OpenAI did. They just didn't ship it.

[SOURCE NEEDED]

Last updated: 2026-04-12
