Grok 4 tops Artificial Analysis intelligence index

Hey there!

Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

> Grok beats other frontier models for the first time

> tested using Grok 4 base; Grok 4 Heavy (multi-agent version with enhanced performance) is expected to surpass other models by a large margin

> also leads the coding index (LiveCodeBench & SciCode) & the math index (AIME24 & MATH-500)

> Grok 4 Heavy benchmark performance includes:

  • GPQA Diamond: 88.9%

  • AIME25: 100% (saturated the benchmark)

  • USAMO25: 61.9%

  • Humanity’s Last Exam: 44.4%

> Grok 4 (Thinking) scored 16.2% on ARC-AGI 2

> base model available through a $30/month subscription & Grok 4 Heavy through a new $300/month plan

> first profit at ¥1.15T or ~$7.7B as of March this year

> ¥517B profit for January-March: that single quarter accounted for nearly half of the total annual profit

> using profitable period to fund huge AI bets:

  • $40B OpenAI commitment

  • $6.5B Ampere acquisition

  • leading $500B Stargate data center project

> still lost ~$14B in net asset value (so the profit is likely on paper)
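For anyone who wants to sanity-check the profit figures above, here is a minimal sketch of the arithmetic. It only uses the numbers quoted in the bullets; the variable names are made up for illustration, and the exchange rate is simply what the two quoted annual figures imply, not an official rate.

```python
# Figures as quoted in the bullets above (amounts in billions).
annual_profit_jpy_bn = 1150.0   # ¥1.15T annual profit
quarter_profit_jpy_bn = 517.0   # ¥517B for January-March
annual_profit_usd_bn = 7.7      # ~$7.7B, the dollar figure quoted alongside

# Share of the annual profit earned in the single January-March quarter.
quarter_share = quarter_profit_jpy_bn / annual_profit_jpy_bn

# Exchange rate implied by the yen and dollar versions of the annual figure.
implied_jpy_per_usd = annual_profit_jpy_bn / annual_profit_usd_bn

print(f"Quarter share of annual profit: {quarter_share:.1%}")   # 45.0%
print(f"Implied exchange rate: {implied_jpy_per_usd:.0f} JPY per USD")  # 149
```

So "nearly half" works out to roughly 45% of the year's profit landing in one quarter.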

> 10K → 250K in less than 2 years

> currently more than 1,500 vehicles, to reach 3,500 by 2026

> overtook Lyft's market share in its service zones by June and reached 20% of Uber's, despite being 30-40% more expensive
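To put the growth numbers above in perspective, here is a rough sketch of what they imply. The bullets don't say what unit the 10K → 250K figure counts, so the code treats it as a plain count, and it assumes the full 24 months were used, which makes the monthly rate a lower bound since the source says "less than 2 years". All names are illustrative.

```python
# Figures as quoted in the bullets above; the unit of the first pair is not stated.
start_count = 10_000
end_count = 250_000
months = 24  # assumption: the full two years, so the monthly rate is a floor

# Overall multiple and the compound monthly growth rate needed to reach it.
growth_multiple = end_count / start_count
min_monthly_growth = growth_multiple ** (1 / months) - 1

# Planned fleet expansion: 1,500 vehicles today to 3,500 by 2026.
fleet_multiple = 3_500 / 1_500

print(f"Overall growth: {growth_multiple:.0f}x")
print(f"Minimum compound monthly growth: {min_monthly_growth:.1%}")
print(f"Planned fleet expansion: {fleet_multiple:.2f}x")
```

In other words, a 25x jump in under two years needs at least about 14% compound growth every month, while the fleet itself is only planned to grow a bit over 2x.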