Calendar apps are essential for productivity, but it is hard to differentiate enough to sustain growth from core usage alone. Y Combinator-backed Superpowered, an AI-powered notetaker for meetings that doesn’t involve recording bots, hit this roadblock and is now pivoting to become Vapi, an API provider that lets anyone easily create a natural-sounding, voice-based AI assistant.
Superpowered was founded in 2020 by Jordan Dearsley and Nikhil Gupta. But after three years of working on it, Dearsley said the team wanted to work on a more challenging product. The company is not shutting down the original product: the startup said Superpowered is profitable, and it is in the process of bringing someone in to run it. Y Combinator said in June that more than 10,000 people were using the product weekly, but the company didn’t provide any updated numbers.
To date, Superpowered/Vapi has raised $2.1 million in seed money from investors including Kleiner Perkins and Abstract Ventures.
Pivot to Vapi
The company offers Vapi as an API that lets developers create a bot using just prompts, which it then puts behind a phone number. Additionally, it offers an SDK integration so developers can embed the bot in websites and mobile apps.
Dearsley told TechCrunch over email that the idea to build Vapi stemmed from a personal problem. He had moved to San Francisco and started missing his friends and family, who were in a different time zone. He built an AI bot attached to a phone number so he would have someone on the other end to talk to while sorting out his thoughts.
“I liked it, but I was continually frustrated with how unnatural it was. It wasn’t like talking to a person. The voice sounded off, there would be long delays before it responded, and it would interrupt me while I was speaking,” he said.
“So I kept working on it and going for my walks with it. Eventually, we got fascinated with this conversation problem. It’s really hard to make something feel human. Voice assistants today are clunky and turn-based, we want to build something that feels human.”
Technically, Vapi is currently stringing together a number of third-party APIs to build a robust voice conversation platform. For instance, it uses Twilio for telephony, Deepgram for transcription, Daily for audio streaming, OpenAI for responses, and PlayHT for text-to-speech.
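To make that chain concrete, here is a minimal Python sketch of a single conversational turn passing through three of those stages in sequence. It is an illustration only, not Vapi's code: the `Stage` type and the `transcribe`, `generate_reply`, `synthesize`, and `handle_turn` functions are hypothetical stand-ins that operate on plain strings, and none of them calls any real vendor API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str                   # role in the pipeline, e.g. "transcription"
    run: Callable[[str], str]   # hypothetical stand-in for a vendor call


def transcribe(audio: str) -> str:
    # Stand-in for speech-to-text; the "audio" here is already text.
    return audio.strip().lower()


def generate_reply(transcript: str) -> str:
    # Stand-in for a language model producing a response.
    return f"you said: {transcript}"


def synthesize(reply: str) -> str:
    # Stand-in for text-to-speech; wraps the reply to mark it as "audio".
    return f"<speech>{reply}</speech>"


# Assumed stage order for one turn: listen, think, speak.
PIPELINE: List[Stage] = [
    Stage("transcription", transcribe),
    Stage("response", generate_reply),
    Stage("text-to-speech", synthesize),
]


def handle_turn(audio: str, pipeline: List[Stage] = PIPELINE) -> str:
    """Pass one caller utterance through every stage in order."""
    payload = audio
    for stage in pipeline:
        payload = stage.run(payload)
    return payload


if __name__ == "__main__":
    print(handle_turn("  Hello There "))
```

Because each stage waits on the one before it, every hop in a chain like this adds to the delay the caller hears before the bot responds.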
ScaleConvo, a startup in the YC winter batch for 2024, is already using Vapi to launch conversational bots for sales teams and property management companies. However, Vapi didn’t disclose its other clients. The company is opening up its API with Vapi Phone and Vapi Web products today.
Challenges for Vapi
One of the biggest challenges the startup has is to reduce latency, according to Magnus Revan, an ex-Gartner analyst and chief product officer at multimodal conversation startup Openstream.ai.
“OpenAI models need between 2-10 seconds to generate an answer – while on the phone the gold standard is to have 700ms between the user finishing talking and then the ‘bot’ starting to talk. And getting to sub 1-second latency with capable models (high parameter count open-source models like LLaMA2 70B) is really hard,” Revan said.
Currently, Vapi has a latency of 1.2-2 seconds depending on various factors. Dearsley expects to bring down latency to under one second in the next month thanks to Vapi’s own work and OpenAI’s improvements.
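For a sense of scale, the figures quoted above can be turned into a quick back-of-the-envelope calculation. The Python sketch below uses only the numbers in this article (1.2 to 2 seconds today, the one-second near-term goal, and the 700ms gold standard Revan cites); the constant and function names are made up for illustration.

```python
# Figures quoted in the article, in milliseconds.
CURRENT_LATENCY_MS = (1200, 2000)   # Vapi's stated range today
NEAR_TERM_TARGET_MS = 1000          # "under one second"
GOLD_STANDARD_MS = 700              # phone gold standard cited by Revan


def required_cut_ms(current_ms: int, target_ms: int) -> int:
    """Milliseconds that must be removed to reach the target (never negative)."""
    return max(0, current_ms - target_ms)


def cuts_for_range(current_range, target_ms):
    """Required cut at the best and worst ends of the current range."""
    low, high = current_range
    return required_cut_ms(low, target_ms), required_cut_ms(high, target_ms)


if __name__ == "__main__":
    print(cuts_for_range(CURRENT_LATENCY_MS, NEAR_TERM_TARGET_MS))  # (200, 1000)
    print(cuts_for_range(CURRENT_LATENCY_MS, GOLD_STANDARD_MS))     # (500, 1300)
```

In other words, reaching the one-second mark means trimming at least 200ms in the best case and a full second in the worst, while matching the 700ms standard would require cutting between 500ms and 1.3 seconds.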
Mohamed Musbah, an angel investor in Vapi, also said that the startup’s solution will improve with overall advances in the underlying APIs.
“As OpenAI and others improve their models, Vapi’s platform will become more powerful, equipped with better knowledge bases, code execution capabilities, and larger context windows. Vapi’s focus on solving the greatest friction areas in voice communication will be its edge as user demand grows for voice assistants,” he said.
However, this puts the onus on the improvement of other solutions rather than on Vapi itself. Dearsley acknowledged that reliance on other APIs reduces Vapi’s defensibility if big companies start moving into the area, but the team said it has an edge in having built infrastructure to handle thousands of calls simultaneously. Dearsley added that, with Vapi’s web and phone APIs now launching publicly, the team will also look to build its own models for audio-to-audio solutions.