The Pulse by 42neurons
Kimi K2 Thinking’s benchmark performance

November 11, 2025

Hey there!

Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

> Kimi K2 Thinking: first open model to match GPT-5 & Claude 4.5 on core reasoning + coding tasks

> ranks #2 on Artificial Analysis Intelligence Index, just behind GPT-5 (high)

> beats GPT-5 & Claude 4.5 on Humanity’s Last Exam (51 vs. 42 for the best rival) + BrowseComp (60.2 vs. 54.9 for the best rival) in its strongest agentic showing yet

> 71.3 on SWE-Bench (coding) + 84.6 on MMLU Pro = deep reasoning + broad knowledge coverage

> reportedly trained for just $4.6M vs. hundreds of millions for closed models

> inference runs ~6× cheaper than Claude & ~10× cheaper than GPT-5 while matching the reasoning range

> MoE (~32B active params) architecture makes it light, fast, & scalable for enterprise deployment too
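
For readers who want to sanity-check the numbers, here is a minimal sketch that turns the benchmark and cost bullets above into margins. It uses only the figures quoted in this post; the variable names, and the choice to normalize Kimi's inference cost to 1 unit, are illustrative assumptions rather than anything published by the vendors.

```python
# Scores as quoted in the bullets above; "best_rival" is the top score
# among GPT-5 and Claude 4.5 as given there.
scores = {
    "Humanity's Last Exam": {"kimi": 51.0, "best_rival": 42.0},
    "BrowseComp": {"kimi": 60.2, "best_rival": 54.9},
}


def lead(kimi, best_rival):
    """Return (lead in points, lead as a percent of the rival's score)."""
    points = kimi - best_rival
    return round(points, 1), round(100 * points / best_rival, 1)


leads = {name: lead(**s) for name, s in scores.items()}

# Cost multiples from the bullets: ~6x cheaper than Claude, ~10x cheaper
# than GPT-5. Kimi is normalized to 1 unit (an assumption for illustration).
relative_cost = {"kimi": 1.0, "claude": 6.0, "gpt5": 10.0}
kimi_share_of_gpt5_cost = relative_cost["kimi"] / relative_cost["gpt5"]
kimi_share_of_claude_cost = relative_cost["kimi"] / relative_cost["claude"]

for name, (points, percent) in leads.items():
    print(f"{name}: +{points} pts ({percent}% above the best rival)")
print(f"Kimi cost vs GPT-5: {kimi_share_of_gpt5_cost:.0%}")
print(f"Kimi cost vs Claude: {kimi_share_of_claude_cost:.0%}")
```

Read this way, the leads are roughly 9 points on Humanity’s Last Exam and about 5 points on BrowseComp, at around a tenth of GPT-5's quoted inference cost.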

> OpenAI's plans over the next 7 years:

11+ GW in datacenter expansion
$310B+ in cloud usage (additive to capex)
$1T+ in circulation ($1.4T in commitments overall, through 8 years, according to Altman)

> doubling down on compute expansion for AI scaling: building a multi-GW spine across regions + layering multi-cloud & specialized cloud lanes on top

> "infinite money glitch": investments feed back as revenue, creating a loop that grows valuations & works only if customer reach converts to paid usage

> vendor financing is normal; fine if demand covers cash flows

> slowly making real Altman's eventual hope of adding 1 GW (~750K homes' worth of power) per week
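
To put the gigawatt figures above in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted in this post (1 GW ≈ 750K homes, 11+ GW planned, 1 GW per week as the eventual hope); treating "11+" as exactly 11 and using 52 weeks per year are simplifying assumptions.

```python
HOMES_PER_GW = 750_000   # from the post: 1 GW ~ 750K homes
PLANNED_GW = 11          # "11+ GW" treated as a floor (assumption)
TARGET_GW_PER_WEEK = 1   # Altman's eventual hope, as quoted above
WEEKS_PER_YEAR = 52

# 1 GW = 1,000,000 kW, so the homes figure implies an average draw per home.
implied_kw_per_home = 1_000_000 / HOMES_PER_GW

# The planned expansion expressed in "home equivalents".
planned_home_equivalents = PLANNED_GW * HOMES_PER_GW

# What the weekly target would add up to over a year, and how long the
# planned 11 GW would take at that pace.
gw_per_year_at_target = TARGET_GW_PER_WEEK * WEEKS_PER_YEAR
weeks_for_planned_gw = PLANNED_GW / TARGET_GW_PER_WEEK

print(f"Implied average draw per home: {implied_kw_per_home:.2f} kW")
print(f"11 GW in home equivalents: {planned_home_equivalents:,}")
print(f"1 GW/week for a year: {gw_per_year_at_target} GW")
print(f"Weeks to add 11 GW at that pace: {weeks_for_planned_gw:.0f}")
```

In other words, the whole multi-year 11 GW plan equals about 11 weeks of build-out at the pace Altman is ultimately hoping for.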

> CoreWeave's Q3 2025 revenue hit $1.36B, up ~133% YoY, beating estimates (~$1.28B)

> net loss narrowed to $0.22/share vs. $1.82/share in Q3 2024; adjusted loss just $0.08/share (vs. an estimated loss of $0.36/share)

> lowered 2025 revenue guidance to $5.05–$5.15B (from $5.15–$5.35B) due to a data center delay; shares dropped ~6%

> 2025 capex forecast at $12–14B, expected to double in 2026

> signed $14B deal with Meta + $6.5B with OpenAI, pushing revenue backlog past $55B

> CoreWeave, Lambda, Nscale, Nebius are defining a new cloud category: sovereign AI infra built around GPU supply

> signal: record top-line growth + deferred profits = early hyperscaler playbook for neo-clouds
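
The earnings bullets above pack a few calculations into one line each, so here is a small sketch that unpacks them. It works only from the figures quoted in this post (all in $B); the prior-year revenue is back-computed from the ~133% growth rate rather than taken from a filing, so treat it as an approximation.

```python
revenue_q3_2025 = 1.36      # $B, as quoted above
yoy_growth = 1.33           # ~133% YoY, as quoted above
consensus_estimate = 1.28   # $B, as quoted above

# Back out the implied year-ago quarter from the growth rate (approximate).
implied_revenue_q3_2024 = revenue_q3_2025 / (1 + yoy_growth)

# Size of the revenue beat relative to the estimate.
beat_pct = 100 * (revenue_q3_2025 - consensus_estimate) / consensus_estimate

# Full-year guidance ranges ($B): compare midpoints before and after the cut.
old_guidance = (5.15, 5.35)
new_guidance = (5.05, 5.15)
old_mid = sum(old_guidance) / 2
new_mid = sum(new_guidance) / 2
midpoint_cut = old_mid - new_mid
midpoint_cut_pct = 100 * midpoint_cut / old_mid

print(f"Implied Q3 2024 revenue: ~${implied_revenue_q3_2024:.2f}B")
print(f"Beat vs estimate: {beat_pct:.2f}%")
print(f"Guidance midpoint: ${old_mid:.2f}B -> ${new_mid:.2f}B "
      f"(-${midpoint_cut:.2f}B, -{midpoint_cut_pct:.1f}%)")
```

So a roughly 6% revenue beat came alongside a guidance midpoint cut of just under 3%, which helps explain why shares fell despite the record quarter.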