It was not a great day for Google on the heels of the release of the company’s new large language model, Gemini.
One of the most impressive showcases of the new model was a multimodal demonstration using Gemini Ultra.
In the video, Gemini Ultra is shown various objects which it recognizes in real-time.
At first it appears quite impressive, until you read the notes, which say, “For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.”
Many people, including Bloomberg Opinion Columnist Parmy Olson, took issue with this because Google was purporting to show Ultra’s responsiveness.
Google DeepMind’s Principal Scientist Oriol Vinyals countered saying, “The video illustrates what the multimodal user experiences built with Gemini could look like. We made it to inspire developers.”
“Really happy to see the interest around our ‘Hands-on with Gemini’ video. In our developer blog yesterday, we broke down how Gemini was used to create it. https://t.co/50gjMkaVc0 We gave Gemini sequences of different modalities — image and text in this case — and had it respond…” — Oriol Vinyals (@OriolVinyalsML) December 7, 2023
This is not the first time that Google has exaggerated its demos.
In 2018, Google CEO Sundar Pichai showcased a new AI assistant called Duplex.
The demonstration showed Pichai using the tool to call a salon and book an appointment, but it was later revealed that the audio had been recorded before the demo, causing the company some embarrassment.
And while exaggerated marketing claims might seem like splitting hairs, some of Gemini’s other recent responses were harder to explain away.
When a writer from TechCrunch repeated the prompt, the response was again wrong, but with a different wrong answer.
Others on social media reported similar inconsistencies, leading many to question whether Gemini is in the same ballpark as ChatGPT.