GPT-5.4's performance on benchmarks
Hey there!
Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

> GPT-5.4 is OpenAI's first model with native computer use - can operate keyboard, mouse & applications directly on the user's behalf
> combines reasoning, coding & professional work (spreadsheets, docs, presentations); individual claims 33% less likely to be false vs GPT-5.2
> GPT-5.4 Thinking lets users tweak or redirect mid-response - no need to start over; rolling out to Plus, Team & Pro
> available now across ChatGPT, Codex, and API; GPT-5.4 Pro for max performance on Enterprise & Edu

> Anthropic now leads US business AI chat spend; Claude Enterprise, Team & Max all surging since mid-2025
> Claude Code writes ~4% of GitHub commits today (projected 20%+ by end-2026); Accenture is deploying it to 30,000 professionals & 8 of the 10 largest US companies are already on Claude
> however, Trump ordered all federal agencies off Anthropic after the company refused a DoD autonomous-weapons contract
> DoD officially named Anthropic a "supply chain risk", demanding that all defense partners stop using Anthropic’s models
> the designation threatens Claude's removal from MS Copilot & puts $60B+ in investor capital at risk; Anthropic plans to sue in response
> feud became marketing windfall for Claude: #1 App Store & Play Store, signups 4×, downloads +220% overnight
> meanwhile, ChatGPT uninstalls rose 300% on the day OpenAI signed the Pentagon deal instead
> ceiling remains real: Claude holds ~4% of daily mobile users vs ChatGPT's ~42%, free users don't pay & no ads are planned

> according to SemiAnalysis, Anthropic's quarterly ARR additions already exceed OpenAI's, with full crossover projected mid-2026
> OpenAI's economics are tightening: margins fell from 40% to 33% (vs. its own 46% target), inference costs quadrupled, annual burn rises $8B → $25B → $57B across 2025-27
> OpenAI is projected to turn cash-flow positive only in 2030 & is currently raising $100B+ at a $730B valuation; Anthropic is projected to get there in 2028, constrained by compute supply, not demand
> Pentagon feud is the wildcard: could simultaneously hit Nvidia supply, Microsoft distribution & federal revenue just as Anthropic closes the gap - $60B in capital riding on the outcome