Jonathan's Blog

System Prompts

This is the most comprehensive database of system prompts that I’ve seen so far. The most interesting thing to me is the huge disparity in the quality of the prompts. On the one hand you have Anthropic writing out novels for Claude with massive amounts of instructions, details, and examples, while on the other hand you have Google, whose inability to write a decent system prompt goes a long way towards explaining why I’ve seen such bad performance in their Gemini app.

Writing good system prompts seems to go a long way towards improving model performance, especially with tool usage, but it feels like a neglected part of model development and deployment. “Prompt engineer” may still be a meme job title for now, but engineering a good prompt is still important. I wonder whether Anthropic simply has better tooling for testing multiple iterations and versions of prompts and quantifying the improvements.


Changes