What Ilya Sutskever, Satya Nadella, and Sam Altman think about AI progress in 2025

Marton Trencseni - Wed 31 December 2025 - AI

Introduction

Over the past month I read (and listened to) three long-form interviews with three of the most important people in AI today: Ilya Sutskever (co-founder of Safe Superintelligence), Satya Nadella (CEO of Microsoft), and Sam Altman (CEO of OpenAI).

All three interviewees are smart, articulate, and — unsurprisingly — shaped by the positions they occupy. What I found interesting is not “who is right,” but how their mental models of AI differ: what they think is bottlenecked, where value will accrue, and how they reason about capability, economics, and risk.

In this post I’ll summarize:

  1. Sutskever’s view
  2. Nadella’s view
  3. Altman’s view
  4. How the three line up
  5. A short conclusion from the perspective of someone building products and platforms around AI

Ilya Sutskever’s view: from scaling to research

1. The “age of scaling” is over: Sutskever’s starting point is that the last five years were the “age of scaling”: bigger models, more data, more GPUs. That phase worked incredibly well — we got GPT-4-class systems out of it — but he thinks the returns have started to flatten. You can keep burning more compute, but you don’t get proportional jumps in capability anymore. In his view, the era where you can just dial up parameters and get magic is ending.

2. Generalization is the real unsolved problem: He draws a sharp distinction between benchmarks and real generalization. Models ace our evals but “generalize dramatically worse than people” once you step into the messy real world. This is the real bottleneck: we don’t know how to build systems that behave robustly in off-distribution situations, novel tasks, or long-horizon reasoning. For anybody trying to deploy AI as infrastructure, this gap between leaderboard performance and real generalization is the scary part.

3. AI looks smart, but acts dumb in the wild: Related to that, Sutskever points out the disconnect between how smart these models look and how smart they act when you put them into production. They get near-perfect scores on “hard” tests, but their economic impact is still weirdly constrained by brittleness, hallucinations, and lack of reliability. Put differently: the demos are ahead of the P&L. For him, that’s not a PR issue, it’s a research issue.

4. Pre-training is hitting a data wall: Sutskever is also blunt about the data problem: there’s only so much high-quality human-generated text, code, and interaction data on Earth. We’ve already scraped the internet, Wikipedia, GitHub, StackOverflow, etc. You can dream up synthetic data schemes, but the simple “more web + more GPUs” recipe has obvious limits. So even if you had infinite money, you couldn’t just extrapolate the last five years forward.
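
To see why "more web" runs out, a rough Chinchilla-style calculation helps. This is my own back-of-the-envelope sketch, not something Sutskever walks through: the ~20 tokens-per-parameter ratio is the standard compute-optimal heuristic from the Chinchilla paper, and the figure for usable high-quality web text is a loose, order-of-magnitude assumption on my part.

```python
# Rough, order-of-magnitude sketch of the data wall. The 20 tokens/param
# ratio is the Chinchilla compute-optimal heuristic; the estimate of
# usable high-quality public text is a loose assumption for illustration.
TOKENS_PER_PARAM = 20          # Chinchilla-style compute-optimal ratio
USABLE_WEB_TOKENS = 50e12      # assume ~50T tokens of decent public text

for params in (70e9, 400e9, 2e12):
    tokens_needed = params * TOKENS_PER_PARAM
    print(f"{params/1e9:>6.0f}B params -> {tokens_needed/1e12:>5.1f}T tokens "
          f"({tokens_needed / USABLE_WEB_TOKENS:.0%} of assumed usable web)")
# 70B -> 1.4T (3%), 400B -> 8T (16%), 2T params -> 40T tokens (80%)
```

Under these (made-up but plausible) numbers, a compute-optimal run at a couple of trillion parameters already wants most of the usable public web, which is the arithmetic behind "you couldn't just extrapolate the last five years forward."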

5. We’re re-entering a second “age of research”: Because of all this, he argues we are re-entering an age of research. The next big jumps won’t come from a 2× bigger transformer and a slightly different optimizer. They’ll come from new ideas: different architectures, new learning paradigms, new training curricula. The frontier is shifting from infra/scale engineering back to algorithmic and conceptual work.

6. Value functions and “taste”: One of his more interesting lines is about value functions. Humans have emotions, preferences, “taste” – a whole internal machinery for assigning value to outcomes. Sutskever suggests that something like this needs to exist in AI systems as well: a structured way of evaluating long-term reward, tradeoffs, and quality. Today’s RL and RLHF are crude approximations of that. If you buy his view, then a big piece of the puzzle is: what is the right “value-function thing” for a powerful AI?
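
Sutskever doesn't give a formalism, but the classic RL notion of a value function makes the contrast concrete. Here is a minimal sketch of my own (all reward sequences and numbers are invented) comparing a one-shot, RLHF-style preference score on a single response with a discounted value over a whole long-horizon trajectory:

```python
# Illustrative sketch only: the gap between a one-shot, RLHF-style
# preference score and a long-horizon, discounted value function.
# All reward sequences and numbers are made up.

def discounted_value(rewards, gamma=0.97):
    """Discounted return of a whole trajectory: V = sum_t gamma^t * r_t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Two hypothetical agent trajectories on a long task:
# one grabs small rewards immediately and never finishes,
# the other looks unimpressive for 20 steps, then completes the task.
greedy    = [0.2] * 21
slow_burn = [0.0] * 20 + [10.0]

# A myopic, per-step preference signal prefers the greedy trajectory...
print(greedy[0] > slow_burn[0])                               # True
# ...while a value function that accounts for the long horizon does not.
print(discounted_value(slow_burn), discounted_value(greedy))  # ~5.4 vs ~3.2
```

The "value-function thing" he's gesturing at is whatever plays the role of that long-horizon judgment inside a powerful system, rather than a per-response thumbs-up.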

7. RL fine-tuning is overfitting to evals: He’s also skeptical of the current RL-heavy fine-tuning regime. A lot of the industry is doing “optimize on evals until the numbers look great,” which risks overfitting to narrow evaluation environments. You end up with models trained to pass tests, not to be globally reliable agents. This is a familiar failure mode in ML; he’s basically saying we’re repeating it at the frontier.
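
The mechanics of this are the classic selection effect: if you keep picking whatever checkpoint or RL run scores best on a small, fixed eval, the winning number overstates true capability. A tiny simulation of my own (made-up parameters, and a simplification of what labs actually do) shows the inflation:

```python
# Illustration only: selecting the best of many candidates on a small,
# fixed eval set inflates the reported score relative to true capability.
import random

random.seed(0)
EVAL_SIZE = 200       # questions in the fixed eval
N_CANDIDATES = 50     # checkpoints / RL runs tuned against it
TRUE_SKILL = 0.70     # every candidate's real pass rate is the same

def eval_score(true_skill: float) -> float:
    """Measured pass rate on the small eval: true skill plus sampling noise."""
    return sum(random.random() < true_skill for _ in range(EVAL_SIZE)) / EVAL_SIZE

scores = [eval_score(TRUE_SKILL) for _ in range(N_CANDIDATES)]
print(f"true skill: {TRUE_SKILL:.2f}")
print(f"best reported eval score: {max(scores):.2f}")  # several points above 0.70
```

Actively optimizing against the eval, rather than just selecting on it, makes the gap worse, which is roughly Sutskever's complaint about the current fine-tuning regime.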

8. We probably need new architectures and learning paradigms: If scaling and naive RLHF aren’t enough, what’s next? Sutskever expects more radical changes: architectures that learn more like humans, with stronger abstraction, better transfer, more sample-efficient learning from interaction. He doesn’t specify a blueprint, but the message is: transformers + web pre-training + RLHF is not the final form.

9. SSI: a bet on research over compute: His new company, Safe Superintelligence (SSI), is his attempt to operationalize this. The vibe is “small, research-heavy, compute-rich but not compute-maximalist.” The goal isn’t to be the biggest API company, but to crack the next paradigm and safety story in a focused environment. It’s a bet that deep ideas, not just deep pockets, will decide the next decade.

10. Sober about hype, bullish on potential: Finally, Sutskever is both impressed and critical of current models. He sees the potential for really transformative systems, but he’s also comfortable saying: right now, these things are brittle, unpredictable, and oversold. For a builder, the implied stance is: be ambitious about what’s possible, but very skeptical about claims that “GPT-4.5 already is AGI.”

Satya Nadella’s view: infra, economy, and AI as cognitive amplifier

1. Fairwater and the AI datacenter race: Nadella’s entry point is very different: datacenters. He talks about Fairwater as “the most powerful AI datacenter in the world,” with hundreds of thousands of Nvidia accelerators and gigawatt-scale energy footprints. Microsoft is clearly playing to win the infra layer: power, cooling, networking, interconnects across regions. This is not abstract; it’s billions of dollars of concrete, cables and GPUs.

2. “Scaling in time” rather than once-off bets: He makes a distinction between scaling once and scaling in time. The constraint isn’t just how big a datacenter you can build now, but how you design a fleet that can absorb future chips with different power/cooling needs. Good infra decisions today must still make sense when GB300 is replaced by whatever comes next. This is the mental model of someone running a hyperscaler, not a research lab.

3. AI as a compressed Industrial Revolution: Nadella likes the analogy of a compressed Industrial Revolution. The original took ~200 years; AI could compress similar productivity gains into 20–25 years. But he’s careful: the real thing that matters is diffusion into workflows and organizations, not just model capability. He’s less fixated on “AGI” as a philosophical endpoint and more on “does this move the GDP needle.”
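
Nadella doesn't attach growth rates to the analogy, but the compounding arithmetic behind "compression" is easy to check. With an illustrative 1% annual productivity growth over 200 years (my number, not his), matching the same cumulative gain in 25 years requires roughly 8% per year:

```python
# Back-of-the-envelope only: what compressing ~200 years into ~25 implies
# for annual productivity growth. The 1% baseline is an invented
# illustration, not a figure Nadella uses.
slow_rate, slow_years = 0.01, 200
fast_years = 25

total_gain = (1 + slow_rate) ** slow_years      # ~7.3x over two centuries
fast_rate = total_gain ** (1 / fast_years) - 1  # rate needed to match it in 25 years

print(f"cumulative gain: {total_gain:.1f}x")
print(f"required annual growth over {fast_years} years: {fast_rate:.1%}")  # ~8.3%
```

That order-of-magnitude jump in sustained growth is why he keeps returning to diffusion: model capability alone doesn't produce it unless workflows actually change.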

4. Copilot and the economics of coding agents: He talks a lot about Copilot and the coding-agent market: going from sub-$500M to multi-billion in run-rate in about a year. At the same time competition is exploding (Cursor, Claude Code). Microsoft’s play is to deeply integrate Copilot across GitHub and the dev stack, while accepting that there will be many winners. The interesting bit is his comfort with shared upside: as long as the total market grows, Microsoft wins via its portfolio.

5. Messy business models: subscriptions + consumption: On business models, Nadella expects multiple meters: subscriptions, consumption, ads, transactions, device margins. Compute-heavy workloads make pure SaaS economics hard; you can’t ignore GPU costs. So pricing will be a patchwork of flat fees plus “pay per heavy usage.” This is already visible in the way Copilot and Azure OpenAI are priced.
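
Why can't this just stay flat-fee SaaS? Because inference cost scales with usage. A toy margin calculation (every price and cost below is invented for illustration, not Microsoft's actual pricing) shows how a flat per-seat fee breaks down for heavy users, which is exactly what pushes vendors toward hybrid subscription-plus-consumption meters:

```python
# Toy numbers only: why compute-heavy AI products drift toward
# subscription + consumption pricing. Every figure here is invented.
FLAT_FEE = 20.00                # $/seat/month
GPU_COST_PER_1M_TOKENS = 2.00   # blended inference cost, $/1M tokens

def flat_fee_margin(tokens_used: int) -> float:
    """Margin on a flat-fee seat, given that month's token consumption."""
    return FLAT_FEE - GPU_COST_PER_1M_TOKENS * tokens_used / 1_000_000

print(flat_fee_margin(2_000_000))    # light user:  $16 margin
print(flat_fee_margin(15_000_000))   # heavy user: -$10, the flat fee loses money

def hybrid_price(tokens_used: int, included: int = 5_000_000,
                 overage_per_1m: float = 3.00) -> float:
    """Hybrid meter: the flat fee covers an included quota, overage is metered."""
    extra = max(0, tokens_used - included)
    return FLAT_FEE + overage_per_1m * extra / 1_000_000

print(hybrid_price(15_000_000))      # $50: heavy usage is actually paid for
```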

6. Portfolio positioning across the stack: Microsoft’s strategy is to be present everywhere: chips (via partners), Azure, GitHub, Office, Windows, agents. Nadella’s pitch is that this breadth is an advantage: no matter which business model or segment wins, Microsoft has a horse in the race. This is almost the opposite of Sutskever’s SSI, which is narrow, research-first, and not obviously a horizontal platform play.

7. Change management is the real bottleneck: He’s very explicit that change management is the hard part. You don’t get productivity gains just by handing people Copilot. You need to rewire workflows, incentives, and processes. So in Nadella’s world, the limiting factor is less “can we build powerful models?” and more “can enterprises actually change how they work?” That’s a CEO’s view of the bottleneck.

8. Exponential infra demand: Nadella is fully bought into exponential compute demand: hundreds of billions in capex across hyperscalers, huge demand for GPUs, power, and bandwidth. He doesn’t question whether we’ll find enough work for all these GPUs; his assumption is that useful AI workloads will expand to fill whatever infra we build.

9. AI as cognitive amplifier, not mystical AGI: Conceptually, he frames AI as a cognitive amplifier, not a mystical entity. It’s another step in a long line: PCs, internet, cloud, now AI. The point is to augment humans and workflows, not to hand everything over to machines. This is more “serious enterprise software” than “sci-fi AGI narrative.”

10. A broad, pragmatic AI strategy: Overall, Nadella’s view is pragmatic and portfolio-driven. Invest massively into infra, ship AI into every existing product, price it across multiple dimensions, and help enterprises with the messy adoption curve. There’s very little talk about new architectures or “age of research,” and a lot about ops, capex and economics.

Sam Altman’s view: scaling, agents, and accidental persuasion

1. Productivity and ruthless prioritization: Altman starts from organizational scaling. OpenAI is inundated with opportunities; his solution is ruthless prioritization and deep delegation. The interesting point is that as OpenAI got more “famous,” friction decreased: people want to work with them, deals close faster, doors open. That compounds their ability to move quickly, which he sees as a core part of the advantage.

2. Hiring across hardware and AI: He talks about hiring for both hardware and AI research. Hardware is capital-intensive with long cycle times, so he scrutinizes those hires more, but culturally he still looks for fast, effective builders. The subtext: OpenAI is no longer “just a lab,” it’s a stack that goes from chips to models to products.

3. Today’s tools are “fake work”: Altman is openly dismissive of email and partially dismissive of Slack: they generate a lot of fake work. His mental model is that we’ll replace today’s productivity suite with AI agents that do most coordination and execution, only escalating edge cases to humans. We’re not there yet — even OpenAI doesn’t use such an agent internally — but that’s clearly his vision of where knowledge work is going.

4. GPT-6 as a scientific collaborator: On models, he sees GPT-5 as showing early glimmers of scientific reasoning, and expects GPT-6 to be a step-change similar to the jump from GPT-3 to 4, but focused on real science. The idea is that future models won’t just summarize papers, they’ll propose experiments, generate hypotheses, and help run entire research programs. This is still very much a scaling + refinement story, not a rejection of the scaling era.

5. AI-run organizations and the AI CEO: Altman is surprisingly direct about AI-run organizations. He half-jokes that it would be a failure if OpenAI isn’t the first major company with an AI CEO. He treats “AI-run divisions” in a small number of years as a serious scenario, not sci-fi. That’s a very different attitude from Nadella’s “cognitive amplifier” framing.

6. AI fluency as a hiring filter: He uses AI fluency as a talent filter. When he interviews people, he pays attention to whether they’ve thought deeply about how AI will reshape their work in 3–5 years. People who just see AI as “a nice tool” are less compelling than those who reimagine their whole domain around it. This is a good practical heuristic for hiring, independent of OpenAI.

7. Government as insurer of last resort: On risk and government, Altman expects something like other strategic industries: at large enough scale, the state becomes the insurer of last resort. He doesn’t want heavy-handed control, but he’s realistic that if AI becomes systemically important, governments will be deeply involved. His preference is partnership and co-design of policy rather than top-down constraints.

8. Monetization and trust in recommendations: On monetization, he’s very focused on trust. If the model starts taking bribes — showing worse hotels or products because they pay — users will stop trusting it, and the whole thing collapses. So whatever commerce model OpenAI adopts, it has to preserve perceived objectivity. He also expects margins to be competed away in many industries once AI commoditizes discovery and decision-making.

9. Limits and meaning of AI art: On art, he’s agnostic: models might reach or exceed human technical skill, but he notes that people care that something is human-made. The value of human art is partly in the story and identity behind it, not just the pixels or words. Even if AI gets very good at creativity, humans will still care about human artifacts.

10. Accidental persuasion and “normie” adoption: Finally, Altman is more worried about accidental persuasion than deliberate manipulation. As models get more persuasive by accident, they might nudge opinions and behavior in ways we barely understand. He’s less worried about “normies” learning to use AI — adoption is happening anyway — and more about how these systems will quietly shape discourse and choices.

Comparing Sutskever, Nadella, and Altman

Putting the three side by side, a few contrasts:

Scaling vs research. Sutskever is the most explicit that the age of naive scaling is over and we’re entering an age of research: new architectures, new learning paradigms, better value functions. Altman, in contrast, still seems to believe in another few big rounds of scaling (GPT-5 to 6) with major payoffs, even if he acknowledges new work is needed on safety and agents. Nadella almost doesn’t talk about architectures at all; for him, the key is infra scaling and organizational adoption, not model internals.

Where they see the bottleneck. For Sutskever, the bottleneck is generalization and robustness: models don’t behave well enough in the wild. For Nadella, it’s change management: enterprises can’t adapt workflows fast enough to realize productivity gains. For Altman, it’s a mix: there’s still headroom in scaling plus a looming societal bottleneck around persuasion, governance, and norms.

How narrow or broad the bet is. SSI is a narrow, research-concentrated bet. Microsoft is doing the opposite: a broad portfolio spanning infra → tools → apps → devices. OpenAI sits between them: it’s narrower than Microsoft but much more of a platform/product company than SSI, with a strong focus on agents and end-user experiences.

View of AI’s role in organizations. Nadella: AI as a cognitive amplifier embedded everywhere, humans still clearly in charge. Altman: AI as a potential organizational operator, maybe even an AI CEO, in the not-too-distant future. Sutskever: mostly focused on capability and safety; you get less concrete talk about who is CEO and more about whether the system is aligned and general.

Attitude to hype vs limitations. All three are bullish, but their emphasis differs. Sutskever talks a lot about limitations and brittleness. Nadella talks about economic upside and adoption. Altman talks about future capability jumps (GPT-6) and societal risk (persuasion, governance). If you want a reality check on current models, read Sutskever. If you want a macro/infra view, read Nadella. If you want a “frontier lab CEO” view, read Altman.

Conclusion

If you’re building products or platforms around AI, what should you take away from these three?

  • From Sutskever: don’t assume the current stack (transformers + web pre-training + RLHF) will carry us all the way. Expect new architectures and learning paradigms, and don’t trust benchmarks too much. Design systems that are robust to the fact that models don’t generalize like humans yet.

  • From Nadella: AI is an economic and infra story as much as a research story. The biggest wins may come from unglamorous work: rewiring workflows, doing change management, and plumbing AI through boring enterprise systems. If you’re in a large org, this is probably your main playground.

  • From Altman: be ready for agents and automation to eat more of the coordination layer of work, not just the content layer. Also, take seriously the idea that AI systems will participate directly in decision-making and governance — and that trust and persuasion are as important as raw capability.

Scaling isn’t dead yet, research is becoming more important again, infra spend is exploding, and nobody really knows how AI-run or AI-amplified organizations will settle. That’s uncomfortable if you’re looking for a clean playbook, but it’s also exactly what makes this an interesting time to be building things. My own take: we’re still in a transitional era, but executives need to start thinking now about how to incorporate AI agents into their plans!