Video Breakdown · Nerd · 13 April 2026

Arvind Narayanan on AI Snake Oil: How to Tell What's Real From What's Fake

Princeton computer scientist Arvind Narayanan offers the most useful framework for separating real AI capabilities from snake oil — and it turns out most of what's being sold as 'AI' in enterprise software is the latter.

Arvind Narayanan · Lecture / Podcast · 1h

Top Claims — Verdict Check

Most AI applications in high-stakes decision-making — predictive policing, recidivism prediction, hiring screening — are fundamentally unreliable and should not be trusted

🟢 Real
Predictive AI that claims to forecast human behaviour — who will commit a crime, who will default on a loan, who will succeed at a job — is not just inaccurate. It's systematically biased and unfalsifiable. It's snake oil. [representative paraphrase]

There is a critical distinction between AI that works (content generation, image recognition, translation) and AI that doesn't (predicting human behaviour, detecting emotions, forecasting social outcomes)

🟢 Real
AI is very good at pattern recognition in data-rich domains — image classification, language translation, content generation. It is terrible at predicting complex human behaviour from limited data. The problem is that both are sold as 'AI' with the same confidence. [representative paraphrase]

The peer review and evaluation standards for AI products are non-existent — vendors mark their own homework

🟢 Real
When an AI company tells you their model is 95% accurate, ask: who measured it? On whose data? With what methodology? In almost every case, the company evaluated itself, on its own benchmark, with no independent replication. [representative paraphrase]

Generative AI (ChatGPT, Claude, etc.) is genuinely capable but is being overhyped by the same forces that overhyped predictive AI

🟢 Real
Large language models are genuinely impressive and useful. But the hype machine that sold you broken predictive policing is now selling you 'AI transformation' with the same playbook. Be equally sceptical. [representative paraphrase]

Emotion detection AI — used in hiring, education, and security — has no scientific basis and should be banned outright

🟢 Real
The idea that you can detect someone's emotions from their face or voice has been debunked by psychological research for decades. But companies keep selling 'emotion AI' because there's no regulation stopping them. [representative paraphrase]

What's Real

Narayanan's framework is the most practically useful thing published in the AI discourse since 2022. The distinction between AI that works on pattern recognition (image classification: 95%+ accuracy on ImageNet; machine translation: measurably better than pre-2020 systems; code generation: GitHub Copilot measurably increases developer productivity) and AI that claims to predict human behaviour (recidivism: COMPAS shown in a Dartmouth study to be no more accurate than untrained volunteers; hiring: Amazon's scrapped resume-screening system; emotion detection: Lisa Feldman Barrett's research demolishing the universal-emotion hypothesis) gives practitioners a concrete filter. The self-evaluation problem is documented: a 2023 Stanford study found that 65% of AI vendor accuracy claims could not be independently replicated on different datasets. The emotion-detection critique is backed by the 2019 meta-analysis led by Barrett and colleagues, which found that facial expressions are not reliable indicators of internal emotional states; yet HireVue, Affectiva, and others continued selling emotion AI for hiring into 2024.

What's Hype

Narayanan's framework, while excellent for distinguishing pattern recognition from prediction, is less useful for the generative AI era. LLMs don't fit neatly into either category: they are pattern-recognition systems being used for generation, and their failure modes differ from those of predictive AI. A hiring AI that falsely rejects candidates is a different kind of wrong from a chatbot that hallucinates medical advice. The 'snake oil' label, while attention-grabbing, sometimes gets applied too broadly: some predictive systems (weather forecasting, retail demand prediction) work well enough to be commercially valuable, even if imperfect. And the absolutist framing that 'predictive AI for human behaviour doesn't work' obscures a more nuanced reality. Credit scoring models, for all their biases, do predict default risk better than chance, and the financial system's functioning depends on imperfect-but-useful prediction. The question isn't whether these systems work perfectly but whether they work better than the alternative, which is often a human making the same decision with less data and more bias.

What They Missed

The procurement and sales cycle that perpetuates AI snake oil. Enterprise AI purchases are typically made by executives who evaluate demos and pitch decks, not by technical staff who would ask Narayanan's evaluation questions. The information asymmetry between AI vendors and enterprise buyers is the core market failure — and it's not solved by academic papers that buyers don't read. Narayanan's framework needs to be translated into a five-question buyer's checklist, not a lecture. The regulatory gap is also under-explored: the EU AI Act bans emotion recognition in certain contexts (workplaces, education) starting 2025, which partially addresses one of Narayanan's concerns. But enforcement mechanisms are unclear, and non-EU markets including Malaysia have no equivalent regulations. The Malaysian government's AI Roadmap (MyDIGITAL) doesn't address AI snake oil in procurement — government agencies are as vulnerable to overhyped AI products as any enterprise buyer.

The One Thing

Before buying or building any AI system, ask one question: is this AI recognising patterns in data (probably works) or predicting complex human behaviour from limited signals (probably snake oil)?

So What?

  • Apply the Narayanan filter to every AI vendor pitch: if they claim to predict human behaviour (churn prediction, hiring success, credit risk) from limited data, demand independent evaluation on YOUR data — not their benchmark
  • The emotion detection ban in the EU AI Act is a signal — if your product uses any form of sentiment analysis or emotion detection as a decision input, reassess its scientific validity before regulators do it for you
  • Generative AI is real and useful. Predictive AI for human behaviour is largely unproven. Budget accordingly — overinvest in the former, be sceptical of the latter.

Action Items

  1. Read 'AI Snake Oil' by Narayanan and Kapoor (2024, Princeton University Press) — it's 250 pages that will permanently change how you evaluate AI products. At minimum, read the first three chapters, which establish the pattern-recognition vs prediction framework.
  2. Create a five-question AI vendor evaluation checklist for your team: (1) Who evaluated this model's accuracy? (2) On whose data? (3) What's the false positive rate? (4) Has it been independently replicated? (5) What happens when the model is wrong? Require answers before any AI procurement decision.
  3. Audit your own AI features through the Narayanan lens: categorise each as 'pattern recognition' (content generation, image classification, translation) or 'behaviour prediction' (churn, hiring, risk scoring). Treat the second category with significantly higher scepticism and testing requirements.
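The five-question checklist above can be sketched as a small procurement gate that refuses to clear a vendor until every question has an answer. This is a minimal illustration, not an existing tool; all names are assumptions.

```python
# Hypothetical procurement gate built around the five vendor-evaluation
# questions. Every class and function name here is illustrative.
from dataclasses import dataclass, field

QUESTIONS = [
    "Who evaluated this model's accuracy?",
    "On whose data?",
    "What's the false positive rate?",
    "Has it been independently replicated?",
    "What happens when the model is wrong?",
]

@dataclass
class VendorEvaluation:
    vendor: str
    answers: dict = field(default_factory=dict)  # question -> answer text

    def missing(self):
        """Return the questions the vendor has not yet answered."""
        return [q for q in QUESTIONS if not self.answers.get(q)]

    def cleared_for_procurement(self):
        """Require an answer to every question before any purchase decision."""
        return not self.missing()

# A vendor who has only answered the first question is not cleared.
ev = VendorEvaluation("Acme Churn AI")
ev.answers[QUESTIONS[0]] = "Vendor's internal team"
print(ev.cleared_for_procurement())  # False: four questions still open
print(len(ev.missing()))  # 4
```

Note that "Vendor's internal team" answering question one is itself the self-evaluation red flag the talk warns about; the gate only enforces that the questions get asked, not that the answers are good.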

Tools Mentioned

COMPAS

Recidivism prediction tool used in US courts — shown to be no more accurate than untrained volunteers. The canonical example of AI snake oil.

HireVue

AI video interview platform that used emotion detection for hiring — dropped facial analysis in 2021 after academic criticism and regulatory pressure.

Workflow Idea

Build a 'snake oil detector' into your procurement process. Every time a vendor pitches an AI product, ask three questions: (1) Is this pattern recognition or behaviour prediction? (2) Who evaluated the accuracy and was it independent? (3) Can we test it on our data before buying? Log the answers. After evaluating five vendors this way, you'll develop an instinct for separating real AI from marketing AI — and you'll save significant money on tools that don't work.
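As a minimal sketch of that logging workflow, assuming a plain CSV log (field names and functions are illustrative, not an established tool):

```python
# Hypothetical 'snake oil detector' log for AI vendor pitches.
import csv

LOG_FIELDS = ["vendor", "category", "independent_eval", "tested_on_our_data"]

def red_flag(category, independent_eval):
    """Behaviour prediction without independent evaluation is the classic
    snake-oil signature; pattern recognition gets more benefit of the doubt."""
    return category == "behaviour_prediction" and not independent_eval

def log_pitch(path, vendor, category, independent_eval, tested_on_our_data):
    """Append one pitch's three answers to the log; return True if it's a red flag."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if f.tell() == 0:  # new log file: write the header row first
            writer.writeheader()
        writer.writerow({
            "vendor": vendor,
            "category": category,
            "independent_eval": independent_eval,
            "tested_on_our_data": tested_on_our_data,
        })
    return red_flag(category, independent_eval)

# Self-benchmarked behaviour prediction is the combination to flag.
print(red_flag("behaviour_prediction", False))  # True
print(red_flag("pattern_recognition", False))   # False
```

After five or so logged pitches, the CSV itself becomes the instinct-building record the workflow describes.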

Context & Connections

Agrees With

  • gary-marcus
  • fei-fei-li

Contradicts

  • sam-altman
  • jensen-huang

Further Reading

  • AI Snake Oil by Arvind Narayanan and Sayash Kapoor (2024, Princeton University Press) — the full framework in book form
  • AI Snake Oil Substack — aisnakeoil.substack.com — ongoing analysis with specific vendor examples
  • Lisa Feldman Barrett — How Emotions Are Made (2017) — the science behind why emotion detection AI doesn't work