OpenAI is showing off Sora, its first text-to-video generative AI model. Sora can turn a simple text prompt or image into a minute of high-res video. It can also “extend” existing videos or insert frames into them. However, OpenAI is still deciding whether to offer Sora as a product.
This isn’t the first text-to-video AI, but it may be the most impressive. Generative videos from Google and Meta are low-resolution, choppy, and often nightmarish. By contrast, the Sora model produces 1080p video at a smooth frame rate, and its output may be mistaken for real video.
Early examples of Sora’s output are available on the OpenAI website. Based on these examples, we can see that the AI has a decent grasp of human body proportions, photorealistic lighting, and creative cinematography. Sora is also good at drawing realistic animals, and it can imitate the imperfections of old film.
Of course, Sora’s output is far from perfect. All of its subjects have a strange weightless quality, and if you look closely, you’ll find some of the tell-tale quirks of AI image generation. OpenAI admits that Sora can be hit-or-miss and provides some “bad” examples of the AI’s output, including a video of a man running backward on a treadmill.
According to OpenAI, the Sora model has a “deep understanding of language” and can “express vibrant emotions” in its output. Even so, Sora doesn’t require a long or complicated prompt. Some of the examples provided by OpenAI are based on open-ended, single-sentence prompts. In that respect, it’s not all that different from ChatGPT’s image generation feature.
Unfortunately, OpenAI hasn’t shown off Sora’s image-to-video capabilities. We’re also curious about the AI’s video-extending and frame-insertion features—if these features are effective, Sora could be a useful tool for video editing or restoration.
We also know very little about Sora’s training data. OpenAI says that it used approximately 10,000 hours of “high-quality” video, but that’s all. More information may be revealed in OpenAI’s Sora white paper, which will be published by the end of the day on February 15th.
In any case, Sora needs to overcome several hurdles before it becomes a real product. OpenAI is consulting “policymakers, educators, and artists” to better understand the public’s concerns. It’s also working with experts who can measure Sora’s potential for “misinformation, hateful content, and bias.” If OpenAI decides to take Sora public, the AI’s output will be accompanied by C2PA metadata for easier identification.
Source: OpenAI