Gemini 3's performance on benchmarks

Hey there!

Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

> massive gains in performance across benchmarks; now almost unanimously considered the best model

> ranking #1 on many aggregate benchmarks (LMArena, Artificial Analysis, ARC-AGI-2, etc.)

> announced usage scale:

  • AI Overviews used by 2B people per month

  • Gemini app at ~650M monthly active users

  • most Google Cloud customers (≈70%) now using Google’s AI tools

  • 13M+ devs have built with Gemini

> highest scores on SimpleQA, MMMU, & Video-MMMU, which are core to real-world reliability: factual accuracy, multimodal reasoning, and video understanding

> Deep Think mode gives big jumps on the hardest reasoning tasks: 93.8% on GPQA, 45.1% on ARC-AGI-2, 41% on HLE

> live today across Gemini API, Vertex AI, Workspace & Android

> announced Antigravity = Google’s agent platform (planning, tools, memory, full task automation)

> Cursor: founded in 2022, already scaled to 300 employees

> 3x valuation & 2x ARR in 6 months

> funded by Accel, Thrive Capital, a16z, DST, Nvidia, Google, Coatue, etc.

> fastest ARR scale vs. competitors + most-funded AI-native coding startup ($3.3B total)

> over $1B ARR, compared with OpenAI's & Anthropic's 2025 targets of $20B & $9B, respectively

> fastest company to $1B ARR across AI coding, general dev tools & even huge tech firms

> as of May, the majority of revenue came from subscriptions, but enterprise revenue has grown 100x YTD in 2025 (Business Wire)

> 84% of devs using AI coding tools (51% daily) this year, per Stack Overflow, with Cursor ranking as the 3rd most popular tool (Business Insider)

> Cursor reached $1B ARR within 2 yrs of launch - still not as fast as ChatGPT (<1 yr)