Video Breakdown · Nerd · 13 April 2026

George Hotz on tinygrad, comma.ai, and Building AI from Scratch

The hacker who jailbroke the iPhone and built a self-driving car in his garage argues that AI infrastructure is too bloated, too expensive, and too corporate — and he's building the alternative with 7,000 lines of code.

George Hotz · Lex Fridman Podcast · 3h 37m

Top Claims — Verdict Check

tinygrad can match PyTorch and TensorFlow performance with 1% of the code complexity

🟡 Partially True
PyTorch is 2 million lines of code. tinygrad is under 7,000. We're not trying to do everything — we're trying to do the important things fast, on real hardware, without the bloat. [representative paraphrase]

NVIDIA's dominance in AI compute is artificial, maintained through CUDA lock-in rather than hardware superiority

🟢 Real
NVIDIA's moat is CUDA, not silicon. If you break the CUDA dependency — which tinygrad does — suddenly AMD, Intel, and even custom chips become viable. The hardware competition that should exist has been suppressed by a software monopoly. [representative paraphrase]

comma.ai's openpilot is the best open-source self-driving system and proof that small teams can outperform billion-dollar programs

🟡 Partially True
We ship autonomous driving updates every two weeks to real cars on real roads. Waymo spent $5 billion and still only operates in a few cities with HD maps. We work everywhere, with a camera and a $2,000 device. [representative paraphrase]

The AI industry is building cathedrals when it should be building bazaars — small, fast, hackable systems will win

🟡 Partially True
Everyone wants to build the $100 billion AI data center. I want to build the thing you can run on your laptop. The future of AI isn't bigger — it's smaller, faster, more accessible. [representative paraphrase]

Most AI safety concerns are corporate fear-mongering designed to create regulatory barriers to entry

🟡 Partially True
The big labs want regulation because regulation requires compliance departments, and compliance departments require being a big company. It's a moat disguised as safety. [representative paraphrase]

What's Real

The CUDA lock-in thesis is well-documented and increasingly acknowledged across the industry. NVIDIA's software ecosystem — CUDA, cuDNN, TensorRT — represents 15+ years of developer investment that makes switching to AMD or Intel GPUs a rewrite, not a recompile. This is why NVIDIA commands 90%+ margins on AI accelerators even though AMD's MI300X offers competitive hardware specs. Hotz's insight that the bottleneck is software, not silicon, is validated by the fact that every major competitor (AMD ROCm, Intel oneAPI, Google JAX) has invested billions trying to create CUDA alternatives, with limited adoption to show for it. tinygrad's approach — a minimal, hardware-agnostic tensor library — has attracted genuine interest from researchers and startups precisely because it sidesteps the CUDA dependency. The project crossed 25,000 GitHub stars by 2024, and its lazy-evaluation and JIT-compilation approach produces competitive performance on supported hardware. comma.ai's commercial traction is real: over 10,000 comma devices sold, support for 250+ car models, and an active developer community contributing to openpilot. The $2,000 price point versus Tesla's $15,000 FSD package is a genuine disruption.
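
The lazy-evaluation point is easier to see in code than in prose. Here is a minimal sketch, assuming a recent tinygrad release (the `Tensor` and `TinyJit` names have been stable lately, but the API does move between versions):

```python
# Sketch of tinygrad's lazy evaluation and JIT replay. Assumes a recent
# tinygrad release; import paths and names may differ in older versions.
from tinygrad import Tensor, TinyJit

a = Tensor.randn(256, 256)
b = Tensor.randn(256, 256)
c = (a @ b).relu().sum()  # builds a graph of ops; nothing has executed yet
print(c.numpy())          # forces realization: ops are fused and compiled for the backend

@TinyJit
def step(x: Tensor) -> Tensor:
    # after warm-up calls, TinyJit replays the captured kernels directly,
    # skipping graph construction and scheduling on later invocations
    return (x @ x.T).relu().sum().realize()

for _ in range(3):
    print(step(Tensor.randn(256, 256)).numpy())
```

This defer-then-fuse design is part of how a small codebase stays competitive: the framework sees the whole expression before generating kernels, so fusion falls out of the architecture rather than from a hand-tuned operator library.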

What's Hype

The '1% of the code, same performance' claim requires significant asterisks. tinygrad handles core tensor operations well but lacks the ecosystem that makes PyTorch dominant: distributed training across thousands of GPUs, production serving infrastructure, mobile deployment, quantization toolchains, and integration with hundreds of libraries for computer vision, NLP, audio, and scientific computing. For a research lab training a 70B-parameter model across 1,000 H100s, tinygrad is not a viable alternative today. The self-driving comparison is even more loaded: openpilot is a Level 2 driver-assistance system (the human keeps hands on the wheel and supervises at all times), while Waymo runs a Level 4 autonomous system (no human driver needed). Comparing their price points without comparing their capability levels is like comparing a bicycle's price to a car's and declaring the bicycle the winner. The AI safety critique — that it's all corporate moat-building — has some truth but paints with too broad a brush. AI systems making medical diagnoses, controlling infrastructure, or influencing elections present genuine risks that exist independently of whether Google benefits from regulation.

What They Missed

The economic model for DIY AI infrastructure doesn't scale to the problems that matter most for business adoption. A Malaysian SME doesn't need to break free from NVIDIA's CUDA lock-in — it needs a working chatbot for its customer service team. The gap between 'technically possible on minimal hardware' and 'practically deployable by a non-engineer' is where Hotz's vision breaks down for most businesses. tinygrad is a brilliant systems-programming achievement, but its audience is systems programmers — perhaps 100,000 people globally. The energy and compute-efficiency angle is also missing from Hotz's framing: smaller, more efficient AI systems aren't just a philosophical preference, they're an economic necessity in developing markets where cloud compute costs consume a larger share of revenue. That aligns with Hotz's thesis, but via a demand-side argument he doesn't make himself. The regulatory landscape in ASEAN — where Malaysia's AI governance framework is still developing — means the 'regulation as moat' debate has different stakes here than in the US or EU.

The One Thing

NVIDIA's real moat is CUDA software lock-in, not hardware superiority — and every tool that breaks that dependency (tinygrad, AMD ROCm, Apple MLX) makes AI cheaper and more accessible for everyone.

So What?

  • You don't need NVIDIA's most expensive GPUs to run useful AI — explore Apple Silicon (Mac Studio with M2 Ultra), AMD alternatives, and cloud providers like Lambda Labs that offer cheaper GPU access for inference and fine-tuning workloads
  • The 'small and hackable beats big and corporate' philosophy applies directly to your AI adoption strategy: start with the simplest possible implementation, prove value, then scale — don't start with enterprise AI platforms you don't need yet
  • comma.ai's model proves that focused, opinionated products can compete with billion-dollar incumbents — if you're building an AI product for the Malaysian market, being smaller and more focused is an advantage, not a limitation

Action Items

  1. If you're paying for cloud GPU instances, benchmark your actual workload on an Apple Mac Mini M4 Pro (RM 6,500). For inference on models up to 13B parameters, Apple Silicon's unified memory architecture delivers surprisingly competitive performance at a fraction of the cloud cost, and running locally eliminates per-query API fees entirely. A minimal benchmarking harness is sketched after this list.
  2. Read tinygrad's README and core source code (github.com/tinygrad/tinygrad) — even if you never use it, the design philosophy of 'minimal code, maximum hardware coverage' is an engineering lesson applicable to any software project. The entire codebase is readable in an afternoon.
  3. Audit your current AI compute costs and check whether you're paying an 'NVIDIA tax' — compare pricing for the same workload on NVIDIA A100s vs AMD MI250s vs Google TPUs across the major cloud providers. The price differences can be 30-50% for equivalent performance on many workloads.
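
For the first action item, you don't need a framework to benchmark — a small harness around your existing call is enough. In this sketch, run_inference is a hypothetical placeholder for whatever your workload actually invokes (a cloud API call or a local model):

```python
# Minimal latency/throughput harness (stdlib only). run_inference is a
# hypothetical placeholder: swap in your real cloud API or local model call.
import statistics
import time

def run_inference(prompt: str) -> str:
    raise NotImplementedError("replace with your actual inference call")

def benchmark(prompts: list[str], warmup: int = 2) -> None:
    for p in prompts[:warmup]:
        run_inference(p)  # warm-up runs, excluded from timing (cold starts, caches)
    timings = []
    for p in prompts:
        start = time.perf_counter()
        run_inference(p)
        timings.append(time.perf_counter() - start)
    p95 = statistics.quantiles(timings, n=20)[-1]  # 95th-percentile cut point
    print(f"median {statistics.median(timings)*1000:.0f} ms | "
          f"p95 {p95*1000:.0f} ms | {len(timings)/sum(timings):.2f} req/s")
```

Run the same prompt set against the cloud endpoint and the local machine, then divide your monthly query volume by the measured throughput to compare real costs.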

Tools Mentioned

tinygrad

Minimalist tensor framework in under 7,000 lines of code — hardware-agnostic alternative to PyTorch

comma.ai / openpilot

Open-source driver assistance system — $2,000 device that works on 250+ car models

CUDA

NVIDIA's parallel computing platform — the software lock-in that maintains their AI compute monopoly

Apple MLX

Apple's ML framework for Apple Silicon — another approach to breaking CUDA dependency for inference workloads
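
For a feel of what MLX looks like in practice, here is a minimal sketch. It requires Apple Silicon and the mlx package, and shows only the mlx.core basics:

```python
# Sketch of MLX's lazy arrays on Apple Silicon (pip install mlx). Unified
# memory means no explicit host/device transfers are needed.
import mlx.core as mx

a = mx.random.normal((256, 256))
b = mx.random.normal((256, 256))
c = (a @ b).sum()  # lazy: records the computation, runs nothing yet
mx.eval(c)         # materializes the result on the GPU
print(c.item())
```

Like tinygrad, MLX defers work until a result is demanded — the same design bet: see the whole graph first, then compile.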

Workflow Idea

Build a 'compute cost audit' for your AI workloads. List every AI task your business runs (chatbot, document processing, image generation, etc.), note the current provider and cost per unit, then research three alternatives for each: a cheaper cloud option, an open-source model on cheaper hardware, and a local deployment option. Create a 3-column comparison with monthly costs. Most businesses discover they can cut AI compute costs by 40-60% without meaningful quality loss by moving non-critical tasks off frontier API providers and onto smaller, cheaper alternatives. The audit takes half a day; the savings compound every month.
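
The audit itself is a few lines of Python. Every task name and price below is made up purely to show the shape of the comparison:

```python
# Compute-cost audit sketch. All tasks, providers, and monthly RM costs are
# hypothetical placeholders; replace them with your own measured numbers.
audit = {
    # task: (current provider, cheaper cloud, open model on cheap HW, local)
    "customer chatbot": (1200, 700, 350, 150),
    "document tagging": ( 800, 500, 200,  90),
    "image generation": ( 400, 250, 120,  60),
}

print(f"{'task':<18}{'current':>9}{'cloud':>7}{'open':>6}{'local':>7}{'saving':>8}")
for task, (current, cloud, open_alt, local) in audit.items():
    best = min(cloud, open_alt, local)  # cheapest viable alternative
    print(f"{task:<18}{current:>9}{cloud:>7}{open_alt:>6}{local:>7}"
          f"{1 - best/current:>8.0%}")
```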

Context & Connections

Agrees With

  • Clem Delangue
  • Yann LeCun

Contradicts

  • Jensen Huang
  • Dario Amodei

Further Reading

  • tinygrad GitHub repository (github.com/tinygrad/tinygrad) — the most readable ML framework source code available
  • 'The CUDA Moat' by Dylan Patel (SemiAnalysis) — detailed analysis of NVIDIA's software ecosystem advantage