Eliezer Yudkowsky


Eliezer Shlomo Yudkowsky is an American artificial intelligence researcher and writer best known for popularizing the idea of friendly artificial intelligence. He is a co-founder and research fellow at the Machine Intelligence Research Institute, a private research nonprofit based in Berkeley, California. His work on the prospect of a runaway intelligence explosion was an influence on Nick Bostrom's . An autodidact, Yudkowsky did not attend high school or college.

Work in artificial intelligence safety

Goal learning and incentives in software systems

Yudkowsky's views on the safety challenges posed by future generations of AI systems are discussed in the undergraduate textbook in AI, Stuart Russell and Peter Norvig's . Noting the difficulty of formally specifying general-purpose goals by hand, Russell and Norvig cite Yudkowsky's proposal that autonomous and adaptive systems be designed to learn correct behavior over time:
In response to the instrumental convergence concern, where autonomous decision-making systems with poorly designed goals would have default incentives to mistreat humans, Yudkowsky and other MIRI researchers have recommended that work be done to specify software agents that converge on safe default behaviors even when their goals are misspecified.

Capabilities forecasting

In the intelligence explosion scenario hypothesized by I. J. Good, recursively self-improving AI systems quickly transition from subhuman general intelligence to superintelligent. Nick Bostrom's 2014 book sketches out Good's argument in detail, while citing writing by Yudkowsky on the risk that anthropomorphizing advanced AI systems will cause people to misunderstand the nature of an intelligence explosion. "AI might make an apparently sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of 'village idiot' and 'Einstein' as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general."
In on artificial intelligence, Stuart Russell and Peter Norvig raise the objection that there are known limits to intelligent problem-solving from computational complexity theory; if there are strong limits on how efficiently algorithms can solve various computer science tasks, then intelligence explosion may not be possible.

Rationality writing

Between 2006 and 2009, Yudkowsky and Robin Hanson were the principal contributors to Overcoming Bias, a cognitive and social science blog sponsored by the Future of Humanity Institute of Oxford University. In February 2009, Yudkowsky founded LessWrong, a "community blog devoted to refining the art of human rationality". Overcoming Bias has since functioned as Hanson's personal blog.
Over 300 blogposts by Yudkowsky on philosophy and science were released as an ebook entitled Rationality: From AI to Zombies by the Machine Intelligence Research Institute in 2015. MIRI has also published Inadequate Equilibria, Yudkowsky's 2017 ebook on the subject of societal inefficiencies.
Yudkowsky has also written several works of fiction. His fanfiction novel, Harry Potter and the Methods of Rationality, uses plot elements from J.K. Rowling's Harry Potter series to illustrate topics in science. The New Yorker described Harry Potter and the Methods of Rationality as a retelling of Rowling's original "in an attempt to explain Harry's wizardry through the scientific method".

Academic publications

*