“Here’s the full paper, which has a lot of details missing from the
summary linked above:
https://metr.org/Early_2025_AI_Experienced_OS_Devs_Study.pdf
My personal theory is that getting a significant productivity boost
from LLM assistance and AI tools has a much steeper learning curve than
most people expect.
This study had 16 participants, with a mix of previous exposure to AI
tools - 56% of them had never used Cursor before, and the study was
mainly about Cursor.
They then had those 16 participants work on issues (about 15 each),
where each issue was randomly assigned a “you can use AI” vs. “you
can’t use AI” rule.
So each developer worked on a mix of AI-tasks and no-AI-tasks during
the study.
A quarter of the participants saw increased performance, 3/4 saw
reduced performance.
One of the top performers for AI was also someone with the most
previous Cursor experience. The paper acknowledges that here:
‘However, we see positive speedup for the one developer who has more
than 50 hours of Cursor experience, so it’s plausible that there is a
high skill ceiling for using Cursor, such that developers with
significant experience see positive speedup.’
My intuition here is that this study mainly demonstrated that the
learning curve on AI-assisted development is high enough that asking
developers to bake it into their existing workflows reduces their
performance while they climb that learning curve.”
LLM-driven tools are incredibly powerful (anyone denying that is
huffing copium), but the field hasn’t stabilized enough for devs to
really grok the tools and how they should fit into their workflow.
A new model seems to be released every month, each with a slightly
different prompting style and behavior that subtly affects its
performance. Before you have a chance to really understand what type of
prompting and context gets the best results from a model, a new
one has already been released and you have to start over.
As with any tool, it takes time before the speed increases kick in. I
can say from personal experience that when I first switched to Vim,
my ability to type and edit text took a significant hit. I probably
didn’t exceed my previous speed for weeks. The same is true of
Git. I spent hours and hours reading the Git book and struggling
through simple commands before the tool became a
productivity boost rather than a hindrance when trying to write
code.
Saying that a tool takes time to master and use effectively is in no
way a cop-out. It would be surprising if that weren’t the case. I would
be willing to bet that the majority of power-user tools have a steep
initial learning curve.
Changes
fix(links): add direct link to willisons metr comment