Reporting from San Francisco: OpenAI rolled out voice interaction as a premium feature for its ChatGPT mobile apps in September, making it possible for users to speak with the AI assistant, with ChatGPT responding in one of its own voices.
It’s not perfect in my experience, but when it works seamlessly, it feels like a glimpse of the future, or perhaps a scene from the movie Her.
So what are the prospects for integrating ChatGPT voice interactions into other types of hardware, including dedicated devices like smart speakers?
I asked that question during a press briefing with OpenAI CEO Sam Altman and CTO Mira Murati today at the company’s first developer day in San Francisco, and the ensuing exchange led to an interesting teaser from Altman about the possibility of OpenAI offering a device of its own.
“We’d like to figure out something amazing,” Altman said in response to a follow-up question from Ina Fried of Axios. “If there’s something amazing to do, we’ll do it. We don’t know what that is yet. I do believe that every time a new … technology of this magnitude comes along, there’s supposed to be an amazing new computing device.”
Implicit in this is the competitive threat that such a move would pose to Amazon’s Alexa-powered Echo devices and Google’s Nest smart speakers, among others. Amazon is trying to get ahead in this area with its upcoming Alexa feature, “Let’s Chat,” which the company previewed in September.
OpenAI already partners with a company that happens to be a device maker, Microsoft. Working together on a Surface device with native ChatGPT functionality might make a lot of sense, in theory.
Microsoft CEO Satya Nadella made a surprise appearance with Altman on stage at the event today, making it clear that both companies are as committed as ever to their partnership.
None of those broader industry implications came up during the media briefing on this topic.
Murati explained that OpenAI is taking a slow and cautious approach to voice technology, and particularly to voice generation, partnering on lower-risk implementations such as Spotify’s use of the company’s technology to translate podcasts into different languages.
Altman spoke generally about OpenAI’s interest in the intersection of AI and voice.
“It’s very clear to us that people love voice as a way to interact with this,” he said, “and as we think about the magic AI can do in the future, one thing I feel confident about is that voice is going to be a big part of how people use AI.”