Google is counting on its very own GPT-4 competitor, Gemini, so much that it staged parts of a recent demo video. In an opinion piece, Bloomberg says Google admits that its video titled “Hands-on with Gemini: Interacting with multimodal AI” was not only edited to speed up the outputs (which was declared in the video description), but that the implied voice interaction between the human user and the AI never actually took place.
Instead, the actual demo was made by “using still image frames from the footage, and prompting via text,” rather than having Gemini respond to (or even predict) a drawing or a change of objects on the table in real time. This is far less impressive than the video leads us to believe, and worse yet, the lack of a disclaimer about the actual input method calls Gemini’s readiness into question.
It comes as no surprise that Google denies any wrongdoing here. It referred The Verge to an X post written by Gemini’s co-lead, Oriol Vinyals, which says “all the user prompts and outputs in the video are real,” and that his team made the video “to encourage developers.” Given the attention the industry and authorities are paying to AI lately, perhaps the tech giant should be more careful about its presentations in this field.
Really happy to see the interest around our “Hands-on with Gemini” video. In our developer blog yesterday, we broke down how Gemini was used to create it. https://t.co/50gjMkaVc0
We gave Gemini sequences of different modalities — image and text in this case — and had it answer… pic.twitter.com/Beba5M5dHP
— Oriol Vinyals (@OriolVinyalsML) December 7, 2023