Google releases Gemini update with leap in logic and creation of autonomous visual systems

    Categories: News (EN)
Gemini

Gemini - Mehaniq/shutterstock.com

The Google DeepMind division made official this Thursday (19) the arrival of a new iteration for its main family of artificial intelligence models. The update, called Gemini 3.1 Pro, was developed with a priority focus on expanding complex reasoning capabilities, promising to overcome the limitations of previous versions in tasks that require data synthesis and advanced logic.

The launch occurs at a strategic moment for the technology sector, where simple text generation is giving way to the demand for agents capable of executing complete workflows. The new tool is now available in preview phase for developers and advanced plan subscribers, introducing significant improvements in multimodal processing that ranges from programming codes to video and audio interpretation.

Gemini Inteligência Artificial – Ju Jae-young/ Shutterstock.com

Experts point out that the difference in this version lies in its optimized architecture to solve new problems, moving away from the exclusive dependence on patterns memorized during training. The technology was designed to serve both end users, through the company’s proprietary application, and corporate environments that require robust automation via API.

Internal validation tests demonstrated that the model can maintain coherence in long chains of thought, an essential characteristic for the development of functional autonomous agents. The immediate availability aims to accelerate the integration of these capabilities into third-party products and the company’s cloud platforms.

Performance jump in logic tests

The most significant metric presented during the announcement refers to performance on the ARC-AGI-2 benchmark, a rigorous test designed to evaluate an AI’s ability to solve previously unseen logical patterns. The Gemini 3.1 Pro recorded a score of 77.1% in this regard, a result that represents more than double the performance obtained by its predecessor, the

In addition to the evolution in abstract logic, the model was subjected to direct comparative evaluations with other cutting-edge technologies available on the market. In the test known as “Humanity’s Last Exam”, the new version achieved 44.4% success, surpassing competing solutions developed by

This consistency in logical reasoning allows the tool to be applied in situations where simple information retrieval is not enough. The focus of the update is to ensure the system can navigate multifaceted problems without losing context or hallucinating responses, raising the bar for reliability for professional and academic use.

Autonomy in navigation and virtual agents

The ability to operate as an autonomous agent has been greatly expanded in this update, with impressive results in benchmarks that simulate real professional activities. In the APEX-Agents test, which measures efficiency in long-horizon tasks, the model reached the 33.5% mark, indicating a superior aptitude for managing objectives that require multiple steps to be completed.

Another highlight was the performance in BrowseComp, an assessment focused on agentic internet search combined with the use of programming tools such as Python. Gemini 3.1 Pro achieved 85.9% effectiveness, demonstrating the ability to search, filter and extract relevant information from the web autonomously, integrating this data directly into workflows.

To illustrate the improved capabilities of the new system, the company highlighted three fundamental pillars that support the operation of agents in this version:

  • Ability to maintain focus on complex objectives throughout extensive executions, without deviation from purpose.
  • Smooth integration between web search and code execution for real-time data validation.
  • Prioritization of workflows that require the coordinated use of multiple digital tools simultaneously.

Creating visual systems and coding

The model’s versatility extends to the synthesis of complex visual systems from simple text commands. Durante technical demonstrations, artificial intelligence was able to generate animations in SVG format that are scalable and lightweight, offering an efficient alternative to traditional video formats for web interfaces and mobile applications.

One of the practical examples shown involved setting up a real-time telemetry dashboard. The model processed public APIs and raw data to build, from scratch, a functional interface that visualizes the orbit of Estação Espacial Internacional. The process involved everything from interpreting the input data to coding the final graphical interface.

In the field of creative interpretation, the system transformed classic literary descriptions into modern digital products. By processing excerpts from “The Morro of the Ventos Uivantes”, the AI ​​captured the narrative atmosphere of the book and designed a contemporary portfolio website, translating abstract and artistic concepts into executable code and functional design.

The tool also demonstrated competence in creating interactive experiences in three dimensions. Foi presented a simulation where a flock of virtual birds responded dynamically to the tracking of the user’s hands, proving the model’s ability to integrate computer vision with complex animation logic.

Details about corporate access and integration

The distribution of Gemini 3.1 Pro follows a staggered model, prioritizing developers and corporate customers at this first stage. The version is accessible through platforms such as AI Studio and Vertex AI, allowing companies to test the technology in their own environments and adapt their products to use the new reasoning engine.

For individual users, access was released to subscribers of the Google AI Pro and Ultra plans, which have increased usage limits. The tool was also integrated with NotebookLM, enhancing document synthesis and insight generation functions for paying users who use the platform for research and studies.

A relevant technical point is the maintenance of the 1 million token context window, a feature inherited from previous generations of series 3.