- The Pulse by 42neurons
OpenAI's deep research: performance on expert-level tasks
Hey there!
Welcome back to The Pulse, where we dive into interesting AI stories and trends backed by data, all presented through simple visuals.

Deep research: ChatGPT’s new agentic capability that can do impressively comprehensive online research. According to OpenAI:
> its pass rates correlate more with a task's economic value than with the time humans take to complete it
> tasks hard for the AI ≠ tasks time-consuming for humans

> only DeepSeek R1 shows comparable performance to OAI’s models
> o3-mini surpasses all models except the full o1, followed by DeepSeek R1
> reasoning models consistently superior across benchmarks

> study with 20 pro comedians (who already use AI) in a workshop
> using ChatGPT and Bard in August ’23
> general sentiments:
- AI was overall not very helpful
- useful & faster than humans for first draft and structure
- content was bland, generic & unfunny
- jokes were of poor quality