Alphabet Inc.’s Google unveiled a new version of its powerful Gemini artificial intelligence model that it says can process more text and video than competitors’ products.
Gemini 1.5 brings several improvements. Gemini 1.5 Pro, which will power many of Google’s services, outperforms Gemini 1.0 Pro on 87% of the benchmarks Google uses, putting it roughly on par with the high-end Gemini 1.0 Ultra. The new model is built on the increasingly popular “Mixture of Experts” (MoE) approach, in which only part of the overall model is activated for a given request rather than the whole network. This should make the model faster for the user and more efficient for Google to run.
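To make the "only part of the model runs" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It illustrates the general MoE technique only; Google has not published how Gemini 1.5 routes requests, so the function names (`moe_forward`, `make_expert`), the toy experts, and the gate weights are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    # Hypothetical "expert": a trivial scaling function standing in for a sub-network.
    return lambda x: [scale * v for v in x]

def moe_forward(x, gate_weights, experts, top_k=2):
    # Gate: one score per expert (dot product of the input with that expert's gate vector).
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Route: keep only the top_k highest-scoring experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Mix: weight the chosen experts' outputs by a softmax over their gate scores.
    weights = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * v for o, v in zip(out, y)]
    return out, chosen

# Toy setup (assumed values): four experts, of which only two run per input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
output, used = moe_forward([1.0, 2.0], gate_weights, experts, top_k=2)
```

In this toy run only two of the four experts are evaluated for the input, which is where the speed and efficiency gains of the approach would come from in a real model.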
But there’s one new thing about Gemini 1.5 that everyone at Google, starting with CEO Sundar Pichai, is especially excited about. The new version of the neural network has a huge context window, which means it can handle much larger queries and take in much more information at once. The window size is 1 million tokens, far larger than the 128,000 tokens for OpenAI’s GPT-4 and the 32,000 for the current Gemini Pro. “It’s about 10 or 11 hours of video, tens of thousands of lines of code,” Pichai noted. He added that Google researchers are testing a context window of 10 million tokens, enough to fit, for example, the entire Game of Thrones series in a single request.
The company said developers can explore Gemini 1.5 Pro using Google AI Studio, while select cloud customers can access the AI model in a private preview on its Vertex AI enterprise platform.