Price vs performance of non-reasoning LLMs on LiveBench

April 22, 2025

Hey there!

Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

> LiveBench: 17 tasks testing AI across reasoning, coding, math, & comprehension

> red line: models on the Pareto front, where no other model is both cheaper & better

> Google models dominate: best performance for the price compared to pricier OpenAI models

> benchmark currently dominated by reasoning models, with o3 scoring highest
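
For anyone curious how a Pareto front like the red line can be computed, here is a minimal sketch. It applies the definition above directly: a model stays on the front if no other model is both cheaper and better. The `Model` type, the `pareto_front` helper, and all model names, prices, and scores are made up for illustration; they are not LiveBench data or the method used to draw the chart.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float  # hypothetical unit, e.g. USD per 1M tokens
    score: float  # hypothetical benchmark score; higher is better


def pareto_front(models):
    """Return the models that no other model beats on both price and score."""
    front = [
        m for m in models
        # m is dominated if some other model is strictly cheaper AND strictly better
        if not any(o.price < m.price and o.score > m.score for o in models)
    ]
    # Sort by price so the front can be drawn as a line from left to right
    return sorted(front, key=lambda m: m.price)


# Made-up example data, NOT real benchmark results
models = [
    Model("A", 0.10, 40.0),
    Model("B", 0.50, 55.0),
    Model("C", 1.00, 50.0),   # dominated by B (cheaper and better)
    Model("D", 5.00, 60.0),
    Model("E", 10.00, 58.0),  # dominated by D (cheaper and better)
]

front_names = [m.name for m in pareto_front(models)]
print(front_names)  # ['A', 'B', 'D']
```

The pairwise check is quadratic in the number of models, which is fine for a leaderboard of a few dozen entries; a sort-and-sweep version would be the usual choice for larger sets.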

> almost 1 in 4 US tech jobs in Jan were AI-related

> in Jan, AI-related jobs: 1.3% of all job posts; tech jobs: 5.4% of all openings

> since ChatGPT launch (Q4 2022) to Q4 2024:

> many industries want tech staff who can build or use AI: finance, consulting, retail, pharma, etc.

> 36% of all IT job posts in Jan were also AI-related

> AI chatbot downloads up 119% YoY in Q4 2024; AI Art Generators up 21% YoY

> ChatGPT dominated the market: ~40% of global GenAI app consumer spend & 23% of downloads in 2024

> 16 GenAI apps earned $10M+ in IAP revenue in 2024; 25 apps reached >10M downloads

> other popular categories: photo editing/enhancement, beauty editors, video editing & generation, etc.