While autonomous driving has long relied on machine learning to plan routes and detect objects, some companies and researchers are now betting that generative AI — models that take in data about their surroundings and generate predictions — will help bring autonomy to the next stage. Wayve, a Waabi competitor, released a comparable model last year that is trained on video collected by its vehicles.

Waabi’s model works in a similar way to image or video generators like OpenAI’s DALL-E and Sora. It takes point clouds of lidar data, which map the car’s surroundings in 3D, and breaks them into chunks, similar to how image generators break photos into pixels. Based on its training data, Copilot4D then predicts how all of those lidar points will move. Doing this continuously allows it to generate predictions 5 to 10 seconds into the future.
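To make that chunk-and-predict loop concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration — the grid resolution, the sensor ranges, and the trivial “everything stays put” predictor standing in for the learned model — so it shows the general shape of the idea (discretize each lidar sweep into chunks, then roll a next-sweep predictor forward autoregressively), not Waabi’s actual implementation.

```python
import numpy as np

# Hypothetical illustration only: discretize a lidar point cloud into a
# coarse occupancy grid (the "chunks," loosely analogous to pixels), then
# roll a next-sweep predictor forward to forecast future sweeps.

GRID = (64, 64, 8)                 # x, y, z cells (assumed resolution)
RANGE = np.array([[-32.0, 32.0],   # metres covered in x
                  [-32.0, 32.0],   # ... in y
                  [-2.0, 6.0]])    # ... in z

def voxelize(points: np.ndarray) -> np.ndarray:
    """Map an (N, 3) point cloud to a binary occupancy grid."""
    span = RANGE[:, 1] - RANGE[:, 0]
    idx = np.floor((points - RANGE[:, 0]) / span * np.array(GRID)).astype(int)
    keep = np.all((idx >= 0) & (idx < np.array(GRID)), axis=1)
    grid = np.zeros(GRID, dtype=np.uint8)
    grid[tuple(idx[keep].T)] = 1
    return grid

def predict_next(history: list[np.ndarray]) -> np.ndarray:
    """Placeholder for a learned next-sweep model; this baseline just
    persists the most recent grid, which a real model would improve on."""
    return history[-1].copy()

def rollout(history: list[np.ndarray], steps: int) -> list[np.ndarray]:
    """Autoregressive forecast: each predicted sweep is fed back in as
    context for the next one, as the article describes."""
    future = []
    for _ in range(steps):
        nxt = predict_next(history)
        future.append(nxt)
        history = history + [nxt]
    return future

# At an assumed 10 Hz sweep rate, 50-100 steps corresponds to the
# 5-to-10-second horizon mentioned above.
past = [voxelize(np.random.uniform(-30, 30, size=(5000, 3))) for _ in range(4)]
forecast = rollout(past, steps=50)
```

The autoregressive loop is the key structural point: short per-step predictions, fed back as input, compound into a multi-second forecast.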

[Image: A diptych view of the same scene as captured by camera and by lidar.]

Waabi is one of a handful of autonomous driving companies, including competitors Wayve and Ghost, that describe their approach as “AI-first.” To Urtasun, that means designing a system that learns from data rather than one that must be taught reactions to specific situations. The cohort is betting that their methods might require fewer hours of road-testing self-driving cars, a charged topic following an October 2023 accident in which a Cruise robotaxi dragged a pedestrian in San Francisco.

Waabi differs from its competitors in building a generative model for lidar rather than for cameras.

“If you want to be a Level 4 player, lidar is a must,” says Urtasun, referring to the automation level at which the car can drive safely without a human’s attention. Cameras do a good job of showing what the car is seeing, but they’re not as adept at measuring distances or understanding the geometry of the car’s surroundings, she says.
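The distance advantage comes from how lidar works: each laser return directly encodes range via time of flight, whereas a camera has to infer depth from appearance. A back-of-the-envelope illustration in Python — generic physics, not any particular sensor’s interface:

```python
C = 299_792_458.0  # speed of light, m/s

def range_from_time_of_flight(round_trip_seconds: float) -> float:
    """A lidar return's distance falls straight out of physics:
    the pulse travels out and back, so halve the round trip."""
    return C * round_trip_seconds / 2.0

# A pulse returning after ~200 nanoseconds implies an object ~30 m away.
print(range_from_time_of_flight(200e-9))  # ≈ 29.98
```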

Though Waabi’s model can generate videos showing what a car will see through its lidar sensors, those videos will not be used as training data in the driving simulator the company uses to build and test its driving model. That’s to ensure that any hallucinations arising from Copilot4D are not taught to the driving model through the simulator.

The underlying technology is not new, says Bernard Adam Lange, a PhD student at Stanford who has built and researched similar models, but it’s the first time he has seen a generative lidar model leave the confines of a research lab and be scaled up for commercial use. A model like this could help the “brain” of any autonomous vehicle reason more quickly and accurately, he says.

“It is the scale that is transformative,” he says. “The hope is that these models can be utilized in downstream tasks” like detecting objects and predicting where people or things might move next.
