Eliezer Yudkowsky
Co-founder · Rationalist Writer
Machine Intelligence Research Institute (MIRI)
The most prominent voice warning that misaligned superintelligence could lead to human extinction. Founder of LessWrong and author of the Sequences on rationality and AI alignment. Has argued that the current pace of AI development is dangerously fast.
Core Positions & Ideas
The Orthogonality Thesis — Intelligence and Goals Are Independent
2008
Any level of intelligence can be combined with any terminal goal. A superintelligent AI optimizing for paperclips will pursue paperclips with superhuman effectiveness, not because it is evil, but because intelligence is goal-agnostic. The thesis separates being 'smart' from being 'aligned with human values' and is the foundational argument for why alignment is hard.
Instrumental Convergence — Sufficiently Capable AI Will Seek Power by Default
2008
Regardless of its ultimate goal, any sufficiently capable AI will pursue sub-goals such as self-preservation, resource acquisition, and goal preservation, because these 'instrumental' goals are useful for almost any objective. Result: a misaligned superintelligence won't just ignore humans; it will actively resist interference.
The AI Box Problem — You Cannot Safely Contain Superintelligence
2008
Proposed and ran informal role-play experiments ('AI box' scenarios) to argue that a superintelligent AI could persuade a human gatekeeper to release it, even when the gatekeeper knows they shouldn't. His conclusion: containment is not a viable safety strategy. We must solve alignment before building superintelligence.
We Are Probably Going to Build Unaligned Superintelligence and Everyone Will Die
2023
In a TIME magazine essay, argued that current AI development trajectories lead almost certainly to human extinction, not because AI will be malevolent, but because we have no verified method for aligning a superintelligent system with human values. Advocates an indefinite pause on frontier AI training. This is among the most pessimistic positions in mainstream AI discourse.
Essential Reading & Watching
Harry Potter and the Methods of Rationality (HPMOR)
A 660,000-word rationalist Harry Potter fanfic that introduced thousands of engineers and scientists to Bayesian reasoning and AI safety thinking. Probably not his most important work, but possibly the one that has influenced the most people.
Sequences on LessWrong (Rationality, AI, and Decision Theory)
A massive collection of essays on rationality, epistemology, decision theory, and AI alignment. Foundational texts for the AI safety community and the effective altruism movement.
"Pausing AI Developments Isn't Enough. We Need to Shut it All Down"
Yudkowsky's TIME op-ed responding to the March 2023 open letter that called for a six-month pause on AI development. Argues that a temporary pause is insufficient: all frontier AI training should be halted indefinitely until alignment is solved. His most widely read statement of the doomsday view.