Summary: AI transforms creativity, lifting every writer’s potential, as measured in a new study of people writing short fictional stories with and without AI story ideas. The research found that AI boosted imagination the most for writers with the lowest inherent creativity scores, thus narrowing the gap between humans with high and low creativity.
Think AI is just for data and numbers? Think again! New research shows AI can supercharge creativity, helping writers craft amazing stories. One more nail in the coffin for doomers claiming that creativity is the sole preserve of humans.
Short fiction writing was the domain for new research into AI’s impact on creativity. (Midjourney)
A new study confirms what we already know about AI from much previous research:
AI makes users more creative.
AI narrows skill gaps between high-performing and low-performing users — this doesn’t create identical performance, just a smaller difference than before.
AI is a forklift for the mind: it helps us lift cognitive burdens, resulting in more creative work. In a physical warehouse, forklifts allow weaker workers to handle heavy boxes, putting them on a more equal footing with strong workers. Similarly, AI helped the less creative writers the most in the new research study. (Leonardo)
I am very pleased to report on such research replication. One study can always be wrong or misleading, but when more and more studies reach roughly the same conclusion, we can start to believe it. This is even more true when the various research projects are conducted by different scientists at different institutions, using different methods, and working in different domains.
The new research is by Anil R. Doshi and Oliver P. Hauser from the University of Exeter Business School in the UK. (I have nothing against business school professors, as proven by my reporting on their research, but I continue my lament that there’s so little credible work by UX researchers on the impact of AI.) Their paper is titled “Generative AI enhances individual creativity but reduces the collective diversity of novel content.”
Experimental Conditions
The domain was writing very short fictional stories spanning only 8 sentences. As an example, here’s a 7-sentence story I asked Claude 3.5 Sonnet to write about Jakob’s Law:
“Olivia's hands trembled as she stared at her computer screen, the harsh glow illuminating the tears streaking down her face. As the newly hired UX designer for a failing e-commerce startup, she knew her job—and the livelihood of her colleagues—hinged on her ability to turn things around. Desperate and overwhelmed, she stumbled upon Jakob's Law during a frantic late-night research binge, and suddenly, everything clicked into place. With renewed determination, Olivia poured her heart and soul into redesigning the website, drawing inspiration from familiar e-commerce giants while battling skepticism from her superiors. On the relaunch day, she held her breath as the first sales trickled in, then gasped as they snowballed into an avalanche of orders. As revenue doubled within months, Olivia's colleagues lifted her onto their shoulders, chanting her name, while she wept tears of joy and relief. Her understanding of Jakob's Law hadn't just saved the company, it had given her a family and a home.”
(I prompted for an 8-sentence story to replicate the study, but Claude gave me 7. AI can’t count. But since these sentences are rather long and convoluted, the total story ended up being longer than most of the sample stories reproduced in the paper. Anyway, it gives an idea of the amount of fictional action that can be included in the type of short-short stories that were the target of the present research.)
The researchers had roughly 100 people in each of 3 experimental conditions:
Write a short-short story based on only a theme provided by the experimenters, such as “an adventure in the jungle,” but no help from AI. (This was the “Human-only” condition.)
Write the story based on the theme plus one story idea from AI, which would provide a 3-sentence description fleshing out the assigned theme. (“Human+1AI” condition.)
Write the story based on the theme plus up to 5 story ideas from AI. The average writer requested 2.55 AI ideas. (“Human+5AI” condition, though note that most writers requested fewer than the 5 story ideas offered.)
These roughly 300 stories were then scored by independent judges, who did not know what experimental condition the author had been assigned to.
Results: Creativity Up 5.4% to 8.1% with AI
The judges scored the stories by writers with one AI idea 5.4% higher for creativity than the stories by human-only writers (p=0.02). The stories from the Human+5AI condition scored 8.1% higher for creativity than the human-only stories (p<0.001).
Writing improved when authors were inspired by AI story ideas. (Leonardo)
Giving human writers access to story ideas from AI thus improved their writing's creativity, and more AI-provided ideas helped more than a single AI idea. This confirms my old advice to always ask AI for multiple ideas.
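For readers who want to see the arithmetic behind a statement like "5.4% higher," here is a minimal sketch of a relative lift calculation. The function name and the two mean scores are made up for illustration; they are not the study's actual means, which this article does not report.

```python
def relative_lift_percent(baseline_mean, treated_mean):
    """Percent by which treated_mean exceeds baseline_mean."""
    return 100.0 * (treated_mean - baseline_mean) / baseline_mean


# Hypothetical mean creativity scores from the judges (NOT from the paper).
human_only_mean = 5.00
human_plus_ai_mean = 5.27

lift = relative_lift_percent(human_only_mean, human_plus_ai_mean)
print(f"Creativity lift: {lift:.1f}%")
```

Note that this is a relative difference between group averages, so it says nothing about how much any single writer improved.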
Before writing their stories, the test participants were given a standard creativity test: to list 10 words that are as different as possible. Unsurprisingly, writers with higher creativity scores produced more creative stories. However, similar to previous research, the writers with the lowest inherent creativity received a much bigger boost to their stories’ creativity than writers with high inherent creativity. AI narrows skill gaps, also in creativity.
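The article does not say how the 10-word test was scored. One plausible approach, sketched below purely as an assumption, is to represent each word as a vector and average the semantic distance between every pair of words, so that a list of unrelated words scores higher than a list of related ones. The tiny 3-dimensional vectors are toy stand-ins for real word embeddings, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 minus the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def divergence_score(word_vectors):
    """Mean pairwise cosine distance across all word pairs, scaled by 100."""
    pairs = list(combinations(word_vectors, 2))
    total = sum(cosine_distance(u, v) for u, v in pairs)
    return 100.0 * total / len(pairs)


# Toy vectors (assumed, not real embeddings): three closely related words
# versus three words pointing in completely different directions.
related_words = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
unrelated_words = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(divergence_score(related_words) < divergence_score(unrelated_words))
```

Under this assumed scoring, a participant who lists words from very different semantic areas gets a higher score than one who lists near-synonyms.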
AI narrows the gap in human performance so that people who would have trailed far behind without AI almost catch up with the leaders. (Midjourney)
AI Use Possibly Increases Groupwide Similarity
The study authors make much hay from their last finding, which I deem to be less credible: that even though AI story ideas increased the creativity of each individual writer, they reduced the overall variability in the story pool.
The scientists scored the similarity of the stories (ironically, by using OpenAI’s embedding API to convert each prose string into a many-dimensional vector in semantic space; they don’t say whether they used the 1,536-dimensional model or the 3,072-dimensional model).
Compared to the stories from the Human-only condition, stories written by Human+1AI were 10.7% more similar, and stories written by Human+5AI were 8.9% more similar.
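To make the similarity measure concrete, here is a minimal sketch of how embedding vectors can be turned into a "percent more similar" figure. It assumes that similarity means the average cosine similarity between every pair of stories within a condition; the paper's exact procedure is not described in this article, so treat that as an assumption. The 2-dimensional vectors are toy stand-ins for real story embeddings, and all names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between embedding vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over all pairs of stories in one group."""
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


def percent_more_similar(baseline_group, treated_group):
    """Relative increase in mean similarity of treated_group over baseline_group."""
    base = mean_pairwise_similarity(baseline_group)
    treated = mean_pairwise_similarity(treated_group)
    return 100.0 * (treated - base) / base


# Toy embeddings (assumed): the human-only stories spread out widely,
# while the AI-assisted stories cluster closer together.
human_only_stories = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
ai_assisted_stories = [[1.0, 0.0], [0.8, 0.6], [0.6, 0.8]]

print(percent_more_similar(human_only_stories, ai_assisted_stories) > 0)
```

In a real analysis, each story's vector would come from an embedding model rather than being typed in by hand, and each group would contain roughly 100 stories rather than 3.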
I don’t think this finding has profound practical implications. First, it’s very rare in business to have the same problem solved by several different people. Thus, it won’t matter how much variability there is in the solutions produced by multiple people since almost all problems are only solved once. All that matters is whether the one solution is better (which was the case with AI assistance, according to the judges’ scores of the stories in this study). The main exception might be in brainstorming scenarios where you start with a phase of having all the participants work alone, before they combine their ideas.
Even the brainstorming scenario will not be as bad in practice as in this study. The study participants did not issue their own prompts to the AI to generate story ideas. Instead, the software used to administer the study always issued the same prompt to the AI. Issuing the same prompt hundreds of times will absolutely result in many different ideas, but those ideas will be far less diverse than the ideas hundreds of different prompts would produce. In the normal case, in real work, each employee would be prompting the AI independently. This again means that a group of employees would be issuing multiple different prompts, even if they were attempting to solve the same business problem. Prompt diversity is almost guaranteed to reduce outcome similarity.
If we stick to the domain of writing fiction, prompt variability is even more ensured. Writers are a notoriously independent crowd, so they would all be using their own prompts.
If everybody follows the lead of the same AI prompt, you will get less varied outcomes. For broader creativity, let each person generate prompts independently. (Midjourney)