Google DeepMind's Gemini Ultra 2 Achieves Human-Level Reasoning

Google DeepMind has released Gemini Ultra 2, claiming it is the first AI model to reach human-level performance on complex reasoning benchmarks.

Benchmark Results

Gemini Ultra 2 scored 92% on the ARC-AGI benchmark, 7 percentage points above the human baseline of 85%. It also achieved record scores on mathematical reasoning, code generation, and scientific analysis tasks.

The model uses a novel architecture combining transformer networks with symbolic reasoning modules.

Skepticism Remains

Some AI researchers caution that benchmark performance doesn't equate to general intelligence, noting the model still struggles with certain common-sense reasoning tasks.

ARC-AGI score: 92% vs human 85%
Available through Google Cloud API
Powers updated Bard and Search experiences


