Gemini 3.1 Pro is Google’s latest AI model for complex problem-solving, and it is being positioned as a major upgrade in core reasoning rather than just another routine update.
Powerful new benchmark performance
Google says Gemini 3.1 Pro “represents a step forward in core reasoning,” building on the Gemini 3 family but tuned specifically for tasks where simple answers are not enough. A key highlight is its verified score of 77.1% on the ARC-AGI-2 benchmark, which tests whether an AI system can solve entirely new logic patterns rather than rely on memorised examples. The company says this is more than double the reasoning performance of Gemini 3 Pro, a marked jump in how the model handles unfamiliar, multi-step problems.
Google frames this as a shift towards AI that can support advanced workflows, from breaking down complex topics with clear, often visual explanations to pulling disparate data together into a single, coherent view. It is also pitched as a tool for creative work, helping users bring more ambitious projects to life rather than just producing quick, surface-level responses.
Where Gemini 3.1 Pro is available
Gemini 3.1 Pro is rolling out across Google’s ecosystem in a preview phase. Developers can access the model through the Gemini API in Google AI Studio, the Gemini CLI, Android Studio and Google’s agentic development platform, Antigravity. Enterprise customers can use it via Vertex AI and Gemini Enterprise, targeting use cases that demand deeper reasoning over large and varied datasets.
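For developers curious what access through the Gemini API might look like, the following is a minimal sketch using only the Python standard library and the API's public REST `generateContent` endpoint. The model identifier `gemini-3.1-pro-preview` is an assumption made for illustration and is not confirmed by Google's announcement as reported here; the helper names `build_request` and `ask` are likewise hypothetical. Check Google AI Studio for the actual model ID before using it.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"

# ASSUMPTION: hypothetical preview model ID, used for illustration only.
MODEL_ID = "gemini-3.1-pro-preview"


def build_request(prompt, api_key, model=MODEL_ID):
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key issued in Google AI Studio
        },
        method="POST",
    )


def ask(prompt, api_key, model=MODEL_ID):
    """Send the prompt and return the text of the first candidate reply."""
    with urllib.request.urlopen(build_request(prompt, api_key, model)) as resp:
        reply = json.load(resp)
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `ask("Explain the ARC-AGI-2 benchmark in two sentences.", api_key)` with a valid key would send a single request; keeping `build_request` separate makes it possible to inspect the URL and payload without any network traffic.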
For everyday users, Gemini 3.1 Pro is arriving in the Gemini app and in NotebookLM. In some markets, access and higher usage limits are tied to Google’s AI Pro and Ultra subscription tiers. Google stresses that this is still a preview release, intended to validate recent updates and gather feedback before wider general availability.
What this upgrade means
The launch of Gemini 3.1 Pro underlines Google’s push to make AI more dependable on hard, unfamiliar problems rather than just fluent at everyday chat. With its improved ARC-AGI-2 score and broader rollout across consumer, developer and enterprise products, the model is being framed as a new baseline for serious, reasoning-heavy tasks, from complex business workflows to demanding research and creative projects.