Gemini 2.5 Pro tops rankings with Deep Think and native audio features

By the Mix Vale Newsroom

May 20, 2025

Google has unveiled significant updates to its Gemini 2.5 model series. The Pro version remains a favorite among developers for coding tasks, while the Flash model gains efficiency and performance improvements. New features, such as the experimental Deep Think mode and native audio output, aim to expand the models’ capabilities. Together, the updates strengthen Google’s position in the AI market and give businesses and programmers a broader set of tools.

Developers now have access to resources that streamline the creation of interactive web applications. Gemini 2.5 Pro leads key rankings, including WebDev Arena and LMArena, and performs strongly in education-focused evaluations. New safeguards also provide greater protection against threats such as indirect prompt injection.

The updates include:

Deep Think mode for solving complex math and coding problems.
Native audio output for more natural dialogues.
Flash model improvements, reducing token usage by 20-30%.
Support for open-source tools via the Model Context Protocol (MCP).

Google plans to roll out these features widely in the coming weeks, focusing on meeting the needs of developers and enterprises.

Elevated performance of Gemini 2.5 Pro

Gemini 2.5 Pro has been updated to deliver stronger performance in web development applications. With an Elo score of 1415, it leads WebDev Arena, a leaderboard that ranks models on coding tasks. Its 1-million-token context window lets it process large volumes of data in a single request, which helps in complex projects.
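For readers unfamiliar with Elo-style ratings, the gap between two scores maps to an expected head-to-head win rate. The sketch below uses the classic Elo formula; the 1350 rival rating is a hypothetical figure chosen for illustration, and WebDev Arena's actual rating method may differ in its details.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# 1415 is the score cited in the article; 1350 is a hypothetical rival rating.
gemini_rating = 1415
rival_rating = 1350

win_rate = elo_expected_score(gemini_rating, rival_rating)
print(f"Expected win rate vs. a {rival_rating}-rated model: {win_rate:.1%}")
```

Under this model, a 65-point lead corresponds to winning roughly 59% of pairwise comparisons, so a leaderboard lead of this size signals a consistent preference rather than a sweep.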

Educators praise the model for its integration with LearnLM, Google’s family of models fine-tuned for learning. In comparative tests, pedagogy experts preferred Gemini 2.5 Pro, which was rated highest across five core principles of learning science. The model adapts responses to diverse educational scenarios, providing clear and structured explanations.

Its leadership extends to LMArena, a ranking evaluating human preferences across multiple dimensions. Enhanced video comprehension and long-context capabilities further make it a versatile tool for multimodal applications.

Deep Think redefines problem-solving

The experimental Deep Think mode, introduced in Gemini 2.5 Pro, leverages advanced reasoning techniques to tackle complex challenges. On the 2025 USAMO, a benchmark of advanced mathematics, the model achieved impressive scores. Deep Think also delivered top results on LiveCodeBench, which focuses on competitive coding, and scored 84% on MMMU, a test of multimodal reasoning.

Google is conducting additional safety evaluations before releasing Deep Think for general use. Currently, the feature is available only to trusted testers via the Gemini API, allowing detailed feedback. The mode is expected to enhance the model’s ability to evaluate multiple hypotheses, ensuring greater accuracy in technical tasks.

Upgrades to Gemini 2.5 Flash

Designed for efficiency and low cost, Gemini 2.5 Flash received updates that make it faster and more economical. In Google’s evaluations, the model now uses 20-30% fewer tokens while maintaining high performance in reasoning, multimodality, and coding benchmarks.
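To see what a 20-30% token reduction means in practice, the arithmetic can be sketched as follows. The baseline of one million tokens per day is a hypothetical workload, not a published figure.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Tokens used after applying a fractional reduction (0.2 means 20%)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


# Hypothetical workload: 1,000,000 tokens per day before the update.
baseline = 1_000_000
low = tokens_after_reduction(baseline, 0.20)   # 20% fewer tokens
high = tokens_after_reduction(baseline, 0.30)  # 30% fewer tokens

print(f"Estimated daily usage after the update: {high:,} to {low:,} tokens")
```

Because usage is typically billed per token, a reduction in this range would translate into a roughly proportional drop in cost for the same workload, assuming prices stay unchanged.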

Available for preview in Google AI Studio and the Gemini app, Flash will be released for large-scale production in early June. Enterprises using Vertex AI will also access the updated version, optimized for corporate applications.

Key improvements include:

Faster processing for low-latency tasks.
Enhanced support for long contexts.
Improved performance in multimodal benchmarks.
Significant reduction in computational resource usage.

Native audio transforms interactions

The introduction of native audio output in Gemini 2.5 Pro and Flash marks a leap in conversational experiences. The Live API now offers a preview of dialogue that combines audio-visual input with native audio output, enabling more natural interactions. Users can customize tone, accent, and speaking style, for example by requesting a dramatic voice when telling a story.

The Affective Dialogue feature detects emotions in the user’s voice and adjusts responses accordingly. Proactive Audio filters background conversations, ensuring the model responds only when appropriate. Text-to-speech functionality supports 24 languages with seamless transitions and can simulate multiple speakers in a single dialogue.

Strengthened security against threats

Protection against cyber threats has been bolstered in Gemini 2.5. New safeguards reduce the risk of indirect prompt injection, an attack in which malicious instructions are embedded in data the model retrieves. According to Google, its tests show that Gemini 2.5 is the company’s most secure model family to date.

Enterprises using Vertex AI benefit from these enhancements, particularly in applications involving external tools. The updates ensure greater reliability in corporate settings where data security is paramount.

Expanded computer use

Project Mariner’s computer use capabilities are being brought into the Gemini API and Vertex AI. Companies like Automation Anywhere and UiPath are exploring these features to develop automated solutions. The functionality allows models to operate software interfaces more directly, broadening their applications in automation and task management.

Google plans to release these capabilities to developers this summer, encouraging large-scale experimentation. Integration with third-party tools has also been simplified, streamlining the creation of automated workflows.

Thought summaries for developers

Gemini 2.5 Pro and Flash now offer thought summaries, a feature that organizes the model’s reasoning in clear formats. Available in the Gemini API and Vertex AI, it includes headers, key details, and information on model actions, such as tool usage.

Developers report that summaries simplify debugging interactions with the model. The clear structure helps identify errors and optimize workflows, especially in complex projects.
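As a rough illustration of how thought summaries might be separated from the final answer while debugging, the sketch below walks a response dictionary and splits its text parts. The response shape and the `thought` flag are assumptions about the API's JSON format, and the sample data is hand-written rather than real model output.

```python
from typing import Any


def split_thoughts(response: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Separate thought-summary text from answer text in a response dict."""
    thoughts: list[str] = []
    answers: list[str] = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            text = part.get("text")
            if text is None:
                continue  # skip non-text parts such as tool calls
            # Assumption: summary parts carry a truthy "thought" flag.
            (thoughts if part.get("thought") else answers).append(text)
    return thoughts, answers


# Hand-written sample; field names are assumed, not confirmed by the article.
sample_response = {
    "candidates": [{
        "content": {"parts": [
            {"text": "Plan: check the loop bounds, then the index math.",
             "thought": True},
            {"text": "The bug is an off-by-one error in the loop."},
        ]}
    }]
}

thoughts, answers = split_thoughts(sample_response)
print("Thought summary:", thoughts)
print("Answer:", answers)
```

Logging the two lists separately makes it easier to see whether an error came from the model's plan or from the final response.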

Thinking budgets enhance control

Thinking budgets, initially launched in Gemini 2.5 Flash, have been extended to the Pro model. This feature lets developers cap the number of tokens the model spends on reasoning before it responds, trading response quality against latency and cost.

In some applications, thinking capabilities can be disabled entirely, reducing costs for simple tasks. The feature will be available for stable production use in the coming weeks, addressing the needs of enterprises seeking efficiency.
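A minimal sketch of how a thinking budget might be attached to a request is shown below. It only builds the JSON body and sends nothing; the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are assumed to follow the Gemini REST API's JSON shape, the prompts and budget values are hypothetical, and a budget of 0 stands in for thinking being disabled.

```python
import json


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body with a thinking budget.

    Field names are an assumption about the API's JSON shape.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        },
    }


# Simple task: no thinking tokens, to keep latency and cost low.
simple = build_request("What is 2 + 2?", thinking_budget=0)

# Harder task: allow up to 8192 thinking tokens (a hypothetical value).
hard = build_request("Find the bug in this sorting routine.", thinking_budget=8192)

print(json.dumps(simple, indent=2))
print(json.dumps(hard, indent=2))
```

Choosing the budget per request, rather than once per application, lets a single service answer simple queries cheaply while reserving longer reasoning for harder ones.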

Support for open-source tools

Integration with the Model Context Protocol (MCP) has been added to the Gemini API, facilitating the use of open-source tools. Native SDK support enables developers to connect Gemini to MCP servers and other hosted solutions, simplifying agent-based application development.
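For context, MCP clients and servers exchange JSON-RPC 2.0 messages. The sketch below builds two such messages by hand to show their general shape; it does not use the Gemini SDK's MCP support, and the `get_weather` tool and its arguments are hypothetical.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # simple source of unique request ids


def mcp_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = mcp_request("tools/list")

# Invoke one tool by name; the tool and its arguments are hypothetical.
call_tool = mcp_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Lisbon"}},
)

print(list_tools)
print(call_tool)
```

In practice an SDK would handle this framing, along with the connection to the server, so application code usually deals only with tool names and arguments.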

Google is exploring ways to expand support for external tools, focusing on improving interoperability. The initiative reflects the company’s commitment to meeting the developer community’s needs.

Broader availability for users

Gemini 2.5 Flash is now accessible to all users through the Gemini app. In early June, the updated version will become generally available in Google AI Studio for developers and in Vertex AI for enterprises. Gemini 2.5 Pro is expected to follow, with general availability planned for the coming weeks.

The expansion ensures more users can explore new features, from native audio to Deep Think. Google continues to gather feedback to refine the models before the full launch.

Feedback drives innovation

The developer community plays a central role in Gemini 2.5 updates. Google maintains open channels for suggestions, which directly influence new feature development. The company also invests in fundamental research to expand model capabilities, focusing on efficiency and performance.

Teams behind Gemini collaborate with safety and education experts, ensuring updates meet high standards. New features are in development, with announcements expected in the coming months.