It’s a great story—it just might not be true. Sutskever insists he bought those first GPUs online. But such myth-making is commonplace in this buzzy business. Sutskever himself is more humble: “I thought, like, if I could make even an ounce of real progress, I would consider that a success,” he says. “The real-world impact felt so far away because computers were so puny back then.”
After the success of AlexNet, Google came knocking. It acquired Hinton’s spin-off company DNNresearch and hired Sutskever. At Google, Sutskever showed that deep learning’s powers of pattern recognition could be applied to sequences of data, such as words and sentences, as well as images. “Ilya has always been interested in language,” says Sutskever’s former colleague Jeff Dean, who is now Google’s chief scientist. “We’ve had great discussions over the years. Ilya has a strong intuitive sense about where things might go.”
But Sutskever didn’t remain at Google for long. In 2015, he was recruited to become a cofounder of OpenAI. Backed by $1 billion (from Altman, Elon Musk, Peter Thiel, Microsoft, Y Combinator, and others) plus a massive dose of Silicon Valley swagger, the new company set its sights from the start on developing AGI, a prospect that few took seriously at the time.
With Sutskever on board, the brains behind the bucks, the swagger was understandable. Up until then, he had been on a roll, getting more and more out of neural networks. His reputation preceded him, making him a major catch, says Dalton Caldwell, managing director of investments at Y Combinator.
“I remember Sam [Altman] referring to Ilya as one of the most respected researchers in the world,” says Caldwell. “He thought that Ilya would be able to attract a lot of top AI talent. He even mentioned that Yoshua Bengio, one of the world’s top AI experts, believed that it would be unlikely to find a better candidate than Ilya to be OpenAI’s lead scientist.”
And yet at first OpenAI floundered. “There was a period of time when we were starting OpenAI when I wasn’t exactly sure how the progress would continue,” says Sutskever. “But I had one very explicit belief, which is: one doesn’t bet against deep learning. Somehow, every time you run into an obstacle, within six months or a year researchers find a way around it.”
His faith paid off. The first of OpenAI’s GPT large language models (the name stands for “generative pretrained transformer”) appeared in 2018. Then came GPT-2 and GPT-3. Then DALL-E, the striking text-to-image model. Nobody was building anything as good. With each release, OpenAI raised the bar for what was thought possible.
Managing expectations
Last November, OpenAI released a free-to-use chatbot that repackaged some of its existing tech. It reset the agenda of the entire industry.