Summary: Six UX design patterns can help users overcome the AI articulation barrier: Style Galleries, Prompt Rewrite, Targeted Prompt Rewrite, Related Prompts, Prompt Builders, and Parametrization.
AI differs from all previous user interface paradigms by abandoning step-by-step commands in favor of intent-based outcome specification. Users no longer tell the computer how to proceed. They tell the AI what they want it to produce. Vibe coding and vibe design are advanced forms of this new way of controlling computers.
Articulating your needs in writing can be challenging, even for those with high literacy levels. For example, consider the head of a department in a big company who wants to automate some tedious procedures. He or she goes to the IT department and says, “I want such and such, and here are the specs.” What are the chances that IT delivers software that actually meets the department's needs? Close to nil, according to decades of experience with enterprise software development. Humans can’t accurately state their needs in a specification document. The same goes for prompts.
Most discussions of AI usability have centered on the “Delivery Gap” between the user’s expressed intent and the results delivered by the AI. Hallucinations, where the AI simply makes up information, are the most notable example of this gap, though fortunately they have declined with larger AI models.
This article addresses a different gap: the “Articulation Barrier” between what the user truly needs and the extent to which he or she can articulate this need and make it explicit to the AI. After all, AI is still not a mind reader, so it can only deliver an outcome that aims to satisfy the user’s stated intent, not the user’s unstated needs or desires.

Two problems arise in getting what you want from AI: First, the user must overcome the articulation barrier and accurately inform the AI of her intent. (Here, the desire for a discus thrower statue.) Second, the AI must overcome the delivery gap between this intent and the actual outcome. (Reve)
Since the beginning of the modern AI era, I have referred to this first problem as the articulation barrier in using AI. The very fact that users have to articulate their needs in prose gives an advantage to the 5% of the population with extraordinarily strong literacy skills. The remaining 95% of the population have always struggled with elegant writing, but this skill gap was not a serious handicap, since most people don’t aspire to have their prose published in literary magazines.
About half of the adult population in advanced countries like the United States and Germany is classified as “low-literacy” users. While most can read simple texts, they struggle with writing complex, descriptive prose. Even among those with higher literacy levels, writing new descriptive prose is more challenging than reading and understanding existing prose.
This creates what I term “low-articulation users” — people who may have clear intentions but struggle to express them in writing. The early emergence of prompt engineers who specialized in writing the necessary text to make AI produce desired outcomes is empirical evidence of this barrier. When prompt engineering is seen as a specialized job, it suggests that many business professionals can’t articulate their needs sufficiently well to use current AI user interfaces successfully for anything beyond the most straightforward problems.
Changing from written prompts to spoken interactions probably alleviates the articulation barrier somewhat, particularly for older users. The back-and-forth inherent in an “advanced voice mode” conversation with AI makes it easier for less-literate users to eventually express their intent, as the AI can piece together what they want from elements of multiple utterances. We know from the real world that even people with modest intellects are usually capable of expressing their desires through conversation, even if it takes them several half-baked attempts to get where they want to go.
However, even in spoken conversation, it’s hard for users to articulate exactly what they need. To make this easier, a new set of UI design patterns has been introduced recently in many AI tools: Prompt Augmentation.
Definition: Prompt augmentation features enable users to refine or modify their inputs, allowing the AI to better capture their intent. Users are no longer limited to the prompt text (written or spoken) they can produce from their brains, but get a richer prompt with AI’s help. This helps overcome the articulation barrier in using AI.
6 Design Patterns for Prompt Augmentation
Prompt augmentation is particularly helpful in multimodal AI systems, or when generating media forms that are inherently difficult to describe in words, such as images, video, music, and voices. At the very least, these media forms are difficult to describe for the broader group of creators enabled by AI. Full-time media professionals have long evolved specialized vocabularies within their respective fields: see, for example, a list of terminology to direct camera movements in film cinematography. While professionals know these terms, you can’t expect a domain expert who simply wants to make a few avatar clips for YouTube to know the difference between a “boom shot” and “pullback dolly shot.”
Image-generator prompts must capture not only a detailed description of the subject but also abstract concepts related to the desired visual aesthetics. Users may be able to visualize images in their minds but lack the vocabulary to write the required prompt.
For this reason, image generation platforms were among the first to implement robust prompt augmentation features in the form of Style Galleries, as shown by the following examples. Note that a style gallery doesn’t show the result of the user’s prompt. Rather, it visualizes different creative directions and offers a vocabulary (and possibly categorization) for users to understand these directions.

This is one of the more elaborate versions of a style gallery prompt augmentation feature. Most image tools will only show the user a single image. (Runway)

A more common layout for a style gallery prompt augmentation feature, from Freepik. Note how the very separation of the visual effects into three categories (color, camera, lighting) helps users understand their options.
Even though Style Galleries are most compelling for visual content formats, they can be used for any media form. For example, a text generation AI could provide a list of short snippets in each of multiple writing styles, serving as inspiration for users to expand their stylistic repertoire.
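To make the mechanics concrete, here is a minimal sketch of the UI-side logic behind a style gallery, assuming each thumbnail maps to a bundle of expert-written descriptors that get appended to the user’s plain-language prompt. The gallery contents and the names “STYLE_GALLERY” and “apply_style” are hypothetical illustrations, not any vendor’s actual implementation.

```python
# Minimal sketch of style-gallery logic: clicking a thumbnail appends
# expert-written style descriptors to the user's plain-language prompt.
# All names and descriptors are hypothetical; real tools vary in how
# they merge a selected style with the prompt.

STYLE_GALLERY = {
    "Film noir":  "black-and-white, hard shadows, low-key lighting, 1940s mood",
    "Watercolor": "soft washes of color, visible paper texture, loose brushwork",
    "Cyberpunk":  "neon-lit night city, rain-slick streets, high contrast",
}

def apply_style(user_prompt: str, style_name: str) -> str:
    """Combine the user's prompt with the descriptors behind a gallery tile."""
    descriptors = STYLE_GALLERY[style_name]
    return f"{user_prompt}, in the style of: {descriptors}"

print(apply_style("a robot looking puzzled", "Film noir"))
# -> a robot looking puzzled, in the style of: black-and-white, hard shadows, ...
```

The point is that the user only supplies the subject; the hard-to-articulate aesthetic vocabulary comes from the gallery.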
The most basic form of prompt augmentation is the Prompt Rewrite. In this design pattern, the AI takes a prompt as provided by the user and enhances it to be richer and more detailed. Since AI can’t read the user’s mind, this rewrite may not be exactly what the user wanted. But often the result will indeed be richer and more interesting than the simplistic version described by the user’s initial prompt.
Even when the user is dissatisfied with the AI’s rewrite, the user can subsequently reissue this rewritten prompt after making the necessary modifications to more closely approximate the goal. It is much easier to edit existing text than to create text from scratch.
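As a minimal sketch of how such a rewrite feature might be wired up, the user’s short prompt can be wrapped in a meta-prompt that asks a text model to enrich it. The “call_llm” function is a hypothetical placeholder, not any product’s real API.

```python
# Minimal sketch of the prompt-rewrite pattern, assuming a generic text
# model behind a placeholder function.

def call_llm(meta_prompt: str) -> str:
    """Placeholder for a real model call; returns a canned demo answer."""
    return "a highly detailed 3D render of a futuristic robot, ..."

def rewrite_prompt(user_prompt: str) -> str:
    """Ask the model to enrich the user's prompt while keeping its intent."""
    meta_prompt = (
        "Rewrite the following image prompt to be richer and more detailed. "
        "Add subject details, lighting, and style, but preserve the core "
        "intent.\n\n"
        f"Prompt: {user_prompt}"
    )
    return call_llm(meta_prompt)

# The rewritten prompt goes into an *editable* text field, since editing
# existing text is much easier than writing text from scratch.
print(rewrite_prompt("A robot looking puzzled"))
```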
As an example, here is what I got from Leonardo’s “prompt enhance” feature after entering the prompt “A robot looking puzzled because it doesn't understand the user's request.” Leonardo modified this to “a highly detailed, realistically lit 3D render of a futuristic robot, its metallic body gleaming with a mesmerizing chrome finish, sporting vibrant neon blue and orange accents, its expressive digital eyes wide with confusion, wrinkles creasing its synthetic forehead as it tilts its head to the side, its mechanical limbs frozen in mid-gesture, as if paused in a state of puzzlement, set against a contrasting backdrop of deep, electric blue, evoking a sense of excitement and dynamism, with an exaggerated, almost cartoonish style, reminiscent of Pixar's CGI mastery, brought to life with the photorealistic capabilities of Unreal Engine, Octane Engine, and VRay.”

A puzzled robot drawn based on a prompt rewritten by AI. (Leonardo)
Let’s say I didn’t like the chrome finish, or the orange neon, or the electric blue background. Maybe I want the scene set in a futuristic science lab. Now that Leonardo described these image details, I can easily edit them to my preference.
Leonardo also offers a more Targeted Prompt Rewrite, called “edit with AI,” where the user can specify a direction in which to take the current prompt. For example, starting with my original prompt, “A robot looking puzzled because it doesn't understand the user's request,” I used the edit command “add dramatic lighting” to create the following augmented prompt: “A robot looking puzzled, standing in a dimly lit room with a single spotlight shining down on it, casting a sharp shadow on the wall behind, its bright LED eyes glowing with confusion as it struggles to comprehend the user's request.” This rewrite makes fewer additions than the “prompt enhance” feature and targets the specific modification requested by the user. Again, let’s say that I don’t like the idea of “LED eyes glowing” — it’s a simple matter to edit the new prompt to remove the reference to LED eyes and retain more natural-looking eyes.

Outcome of using targeted prompt rewrite to add dramatic lighting. I don’t like this as an illustration of my original intent (a puzzled robot), so I would edit the rewrite and get a different kind of dramatic lighting. (Leonardo)
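A sketch of the targeted variant, under the same assumptions as above: the meta-prompt now includes the user’s edit instruction and constrains the model to change only that aspect. Again, “call_llm” is a hypothetical placeholder, not Leonardo’s actual mechanism.

```python
# Sketch of a targeted prompt rewrite ("edit with AI"): the user supplies
# a direction, and the meta-prompt constrains the model to change only
# that aspect of the prompt.

def call_llm(meta_prompt: str) -> str:
    """Placeholder for a real model call; returns a canned demo answer."""
    return "A robot looking puzzled, standing in a dimly lit room ..."

def targeted_rewrite(user_prompt: str, edit_instruction: str) -> str:
    """Modify one aspect of the prompt, leaving the rest untouched."""
    meta_prompt = (
        "Modify the following prompt according to the instruction below. "
        "Change only what the instruction asks for; keep everything else.\n\n"
        f"Prompt: {user_prompt}\n"
        f"Instruction: {edit_instruction}"
    )
    return call_llm(meta_prompt)

print(targeted_rewrite("A robot looking puzzled", "add dramatic lighting"))
```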
In some cases, AI can determine a list of Related Prompts that a user has a decent probability of wanting to issue as a subsequent step. In these cases, listing these Related Prompts in a menu offers both the convenience of one-click access and the prompt augmentation benefit of making the user aware of these additional opportunities.
Perplexity’s follow-up questions are a great example of a Related Prompts feature. Perplexity’s CEO, Aravind Srinivas, stated that user engagement with their service doubled after the introduction of this UI feature.

The follow-up questions in Perplexity are a prompt augmentation feature that provides one-click access to prompts that the user is likely to be interested in, given his or her initial prompt. In this case, somebody asking about me might also be interested in my 10 usability heuristics or my books — or even my patents (something many people might not think to ask about without this list of suggested additional prompts).
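One plausible way to generate such a menu is to ask the model itself for likely follow-ups once an answer has been produced. The following is a sketch under that assumption, not Perplexity’s actual implementation; “call_llm” remains a hypothetical placeholder.

```python
# Sketch of a related-prompts menu: after the AI answers, ask the model
# for a few likely follow-up questions and render them as one-click
# choices in the UI.

def call_llm(meta_prompt: str) -> str:
    """Placeholder for a real model call; returns canned demo questions."""
    return "What are the 10 usability heuristics?\nWhat books has he written?"

def suggest_follow_ups(question: str, answer: str, n: int = 4) -> list[str]:
    """Return up to n short follow-up questions the user is likely to ask."""
    meta_prompt = (
        f"A user asked: {question}\n"
        f"They received this answer: {answer}\n"
        f"List {n} short follow-up questions this user is likely to ask "
        "next, one per line."
    )
    return call_llm(meta_prompt).splitlines()[:n]

for follow_up in suggest_follow_ups("Who is Jakob Nielsen?", "A UX expert..."):
    print("-", follow_up)
```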
We’ve long known a design pattern for constructing database queries and product filters, in which users gradually select operators (such as “cheaper than”) and values (such as “dress” or “jacket”) from a set of dropdown menus. This pattern is often known as a Query Builder. AI has its equivalent of this design pattern in the form of Prompt Builders.

A simple prompt builder in Kling 1.6 allows the user to specify various camera movements from a dropdown menu.
Google’s Imagen offers an elaborate Prompt Builder for designing images, as shown in the screenshot below. Google seems to limit each dropdown to 4 choices, which makes the UI faster to use and the options easier to understand, at the cost of limited expressiveness.

Prompt Builder in Google’s Imagen 3 tool. At the bottom of the prompt pane is a list of additional builder menus that can be added to the prompt for even more variations.
Earlier versions of Imagen’s Prompt Builder tool also auto-created menus with alternates for some of the original keywords in the user’s prompt. For example, “London” might be replaced with a menu of other major European cities (e.g., “London, Paris, Rome, Madrid”), respecting the user’s initial entry by making it the default. I am not sure why this additional idea-generating facility has been removed. Even when the user’s preferred alternative (say, “Copenhagen”) was not a top choice and thus not included as a one-click menu selection, it would always be possible for the user to override the menu and hand-type a different choice. Providing more menus for commonly varied prompt elements is a way to push the user’s creativity and thus lower the articulation barrier.
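The core logic of a prompt builder is simple: each dropdown contributes a phrase, and the final prompt is assembled from the user’s base text plus the selections. Here is a minimal sketch; the menu categories and options are invented for illustration, and real builders like Imagen’s or Kling’s choose their own.

```python
# Sketch of a prompt builder: each dropdown contributes a phrase, and
# the final prompt is the user's free text plus the menu selections.
# Menu contents are invented for illustration.

MENUS = {
    "camera":   ["static shot", "slow zoom in", "pan left", "boom shot"],
    "lighting": ["soft daylight", "golden hour", "dramatic spotlight", "neon glow"],
    "style":    ["photorealistic", "watercolor", "3D render", "film noir"],
}

def build_prompt(base: str, selections: dict[str, str]) -> str:
    """Assemble a prompt from free text plus one choice per dropdown used."""
    for menu, choice in selections.items():
        if choice not in MENUS[menu]:
            raise ValueError(f"{choice!r} is not an option in the {menu} menu")
    return ", ".join([base] + list(selections.values()))

print(build_prompt("a bird taking off from a lake",
                   {"camera": "slow zoom in", "lighting": "golden hour"}))
# -> a bird taking off from a lake, slow zoom in, golden hour
```

A production version would also let the user hand-type an entry that overrides any menu, per the Imagen behavior described above.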
Abandoning the text-only prompt format, Parametrization uses a hybrid UI that incorporates traditional graphical user interface (GUI) elements to vary the prompt along relevant dimensions. The parameters are typically displayed as sliders. Visualizing each dimension allows users to easily understand their options and how to vary the outcome — at least as long as the dimensions are relatable and the endpoints are labeled in user-centric terms.

Parameterized UI from the ChatGPT Canvas. When editing a document, a menu shows various ways ChatGPT can modify it, such as adding emojis, changing the reading level, and expanding or contracting the word count (left screenshot). In the right screenshot, I have selected “Reading level” from the previous menu, which reveals a slider with different reading levels.
In the ChatGPT example above, the naming of the various reading levels allows the user to understand the “reading level” scale, which stretches from “Kindergarten” at the bottom to “Graduate school” at the top.
Requesting a rewrite at the “Middle school” level results in the following version of some of the text I wrote about Prompt Builders: “In Kling 1.6, there's a simple prompt builder that lets users pick different camera movements from a menu. Google's Imagen has a more detailed Prompt Builder for creating images. Each dropdown menu in Imagen usually has only 4 choices. This makes the menus quick and easy to use, but it also means there aren't as many options.”
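Mechanically, a parametrized control can be as simple as a slider position indexing into a labeled scale that gets spliced into a rewrite instruction. A minimal sketch follows, with endpoint labels mirroring the ChatGPT Canvas example; the mapping itself is invented for illustration.

```python
# Sketch of a parametrized rewrite: the slider position indexes a
# labeled scale, and the label is spliced into a rewrite instruction.

READING_LEVELS = ["Kindergarten", "Elementary school", "Middle school",
                  "High school", "College", "Graduate school"]

def reading_level_instruction(text: str, slider_position: int) -> str:
    """Turn a slider position (0 = simplest) into a rewrite instruction."""
    level = READING_LEVELS[slider_position]
    return f"Rewrite the following text at a {level} reading level:\n\n{text}"

print(reading_level_instruction(
    "Prompt builders assemble prompts from dropdown menus.", 2))
# Produces a 'Middle school' rewrite instruction for the model to execute.
```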
As mentioned, Parametrization only works when users understand the parameters. Compare with the following example of voice design in ElevenLabs:

Voice design parameter panel from ElevenLabs.
ElevenLabs is a text-to-speech (TTS) system, and some speech parameters are easy to understand, such as the speed. However, the other parameters seem to be based on internal system functions — how the software varies voices — and not on any user-centered descriptors of how the voice sounds.
Through trial and error, I discovered that “Stability” seems to influence how excited the voice sounds. Lower stability equals more excitability in the voice. So when designing the voice for a “YouTube influencer” avatar, I crank stability down to the minimum, whereas I set it high when designing the voice for a “business professional” avatar who explains one of my articles.
ElevenLabs suffers from the UX difficulty that voice design is an unknown for most users. The articulation barrier is particularly steep because people have a limited understanding of the ways voices differ and rarely know how to describe these differences, except for terms like “baritone” vs. “tenor” for pitch — if we’re lucky and the user is an opera fan.
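One way to bridge this gap would be a user-centered layer that translates relatable persona names into the internal parameters. A minimal sketch, assuming only the stability-excitability relationship I found through trial and error; the preset names and numeric values are invented for illustration.

```python
# Sketch of mapping user-centric persona names to internal voice
# parameters. Low stability = excitable delivery; high stability = calm
# delivery (per the trial-and-error observation above). The presets and
# numbers are illustrative, not ElevenLabs' actual values.

VOICE_PRESETS = {
    "YouTube influencer":    {"stability": 0.05, "speed": 1.1},
    "Business professional": {"stability": 0.90, "speed": 1.0},
}

def voice_settings(persona: str) -> dict[str, float]:
    """Translate a user-facing persona into internal synthesis parameters."""
    return VOICE_PRESETS[persona]

print(voice_settings("YouTube influencer"))
# -> {'stability': 0.05, 'speed': 1.1}
```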

The 6 UX design patterns for AI prompt augmentation discussed in this article.
Benefits of Prompt Augmentation Design Patterns
The design patterns discussed above offer these usability advantages over a plain empty-box prompt UI:
First and foremost, they lower the articulation barrier by making it easier to construct advanced prompts. This broadens what users can express and empowers them to create better results.
Many prompt augmentation design patterns lower the physical effort to create a prompt by replacing typing (or speaking) with one-click selections from menus.
Many prompt augmentation patterns leverage Usability Heuristic 6, “Recognition Rather than Recall,” by showing options to the user rather than requiring the user to remember the vocabulary for those options.
It’s currently unknown whether prompt augmentation is mainly helpful for new users or less skilled users. It’s hypothetically possible that as users gain AI experience, they may grow to depend less on prompt augmentation because they will have learned how to specify their intent to AI. For example, I have learned that I often get music that’s more to my taste in Suno by including the keyword “upbeat” in the music style prompt.
I suspect that prompt augmentation will still prove useful for advanced users, even as they become more experienced. For one, lessons from past generations of computer systems show that most users asymptote at a rather low level of skill, even after years of use. Most people are not interested in learning about computers but simply want to get their work done and log out. (I suspect that you, Dear Reader, may be one of those exceptions who enjoys new computer skills. If so, remember that you are not the user and that most people stop learning after acquiring the minimal skill level needed to accomplish their task.)
Future Directions for Prompt Augmentation
I regret to employ the cliché that “more research is needed,” but that’s clearly the case for prompt augmentation. First, we need foundational research on the design patterns I’ve already identified, as well as generative research to invent additional design patterns. (It defies the imagination that we already know all the good design patterns for prompt augmentation after only slightly more than two years of widely-used AI tools.) Second, we need research on specialized topics, such as the extent to which prompt augmentation is useful for advanced users or whether they abandon it.
Much of this research can be conducted by competent master’s students, although some may require a Ph.D. project. However, since the study of prompt augmentation is still new, there should be a lot of low-hanging research fruit to be plucked in fairly limited research projects.

There is a lot of low-hanging fruit to be had when conducting research on a new concept like Prompt Augmentation that has not already been studied to exhaustion. (Reve)
One particularly promising research direction is the integration of multiple augmentation types into cohesive systems that address different aspects of the articulation barrier. For example, a system might combine multimodal input, hybrid interface elements, and prompt analysis features to provide a comprehensive solution that adapts to different users' needs and preferences.
Another direction for growth is the personalization of augmentation features based on individual users’ skills, preferences, and interaction patterns. A system that learns which prompt augmentation features are most helpful for a specific user could provide increasingly tailored support over time.
A more social form of prompt augmentation comes from community knowledge. Users can share successful prompts, rate them, or even sell them (e.g., prompt marketplaces). This isn’t a direct algorithmic augmentation of a single prompt, but it’s a feature that augments users’ skill by exposing them to how others articulate requests. For example, prompt-sharing communities on Midjourney or Stable Diffusion forums allow users to discover useful phrases (like specific artist names or lighting terms) to include in their own prompts. There are likely to be significant opportunities in building both products and features for social prompt augmentation.
As AI becomes faster with improved hardware and enhanced algorithms, we will also receive more support for real-time prompt iteration, where AI provides real-time feedback on users’ prompts as they are being typed. This proactive feedback could involve suggesting improvements, highlighting potential ambiguities, or even predicting the likely outcome of the current prompt, allowing users to make adjustments before the AI fully processes the request and generates the final output. Similar to grammar and spell checkers, an “AI prompt assistant” could offer inline suggestions and guidance as users construct their prompts, helping them learn to craft prompts more effectively and avoid common pitfalls. Leonardo already has a “Flow” option for image generation that feels more like a video game than traditional prompting, as users guide the design direction for additional images by clicking preferred thumbnails as they are being generated.
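To illustrate, here is a minimal sketch of what such an inline check might look like, analogous to a spell-checker pass over the prompt as it is typed. The vague-word list and advice strings are invented for illustration.

```python
# Sketch of an inline "prompt lint" pass: flag vague wording in the
# user's prompt and suggest a more specific alternative, much like a
# grammar checker underlines weak phrasing.

VAGUE_TERMS = {
    "nice":  "describe what makes it appealing (colors, mood, composition)",
    "thing": "name the object explicitly",
    "good":  "state the quality you want (sharp, detailed, balanced)",
}

def lint_prompt(prompt: str) -> list[str]:
    """Return human-readable warnings for ambiguous words in a prompt."""
    words = prompt.lower().split()
    return [f"'{term}' is vague: {advice}"
            for term, advice in VAGUE_TERMS.items() if term in words]

for warning in lint_prompt("a nice robot holding a thing"):
    print(warning)
# -> 'nice' is vague: describe what makes it appealing ...
# -> 'thing' is vague: name the object explicitly
```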
Prompt Augmentation Categories
Classifying prompt augmentation features into categories can help us invent new design patterns by exposing gaps (which become opportunities). Here are some main types of prompt augmentation features.
Intent Clarification Features
Intent clarification features help users better express their true intent when their original prompt may be unclear or incomplete. These features essentially serve as translators between what users can easily articulate and what they actually want to achieve.
Prompt Expansion is a classic example. By using AI to automatically expand and enhance user prompts with additional details, it bridges the gap between simple user inputs and the detailed prompts needed for optimal results.
These features address the articulation barrier by reducing the precision required in the initial prompt. Users can start with a rough approximation of their intent and rely on the feature to help clarify and refine it.
Prompt Enhancement Features
Prompt enhancement features take a basic prompt and augment it with additional details, descriptive elements, or technical specifications. Unlike intent clarification features, which may substantially reinterpret the prompt, enhancement features typically preserve the core intent while adding supplementary information.
Targeted Prompt Expansion is one such design pattern.
These features address the articulation barrier by supplying vocabulary and terminology that users might lack. They're particularly valuable for users who know what they want but struggle to describe it with the level of detail necessary for optimal results.
The effectiveness of these features depends largely on their ability to add relevant details without distorting the user's original intent. The best implementations strike a balance between enhancement and fidelity, adding richness without changing the core concept.
Alternative Exploration Features
Alternative exploration features help users discover and explore different interpretations or directions for their prompts. These features support divergent thinking and creative exploration, helping users consider possibilities they might not have initially imagined. Related Prompts are one such design pattern.
Leonardo AI’s “New Random Prompt” generates completely new, random prompts for inspiration, while Midjourney’s “Describe” feature provides multiple alternative prompt options based on an uploaded image.
Many AI tools provide lists of currently popular topics or highly rated creations by other users, serving as inspiration.

Ideogram’s top images of the week, as measured by the number of “likes” received by each image. In this case, I might like the style of the woman reading under a tree enough to expand the image to see what prompt was used to create this image. I could then reuse style descriptors from that prompt to make a new image that shows something completely different.
These features address the articulation barrier by reducing the need for users to imagine and articulate alternatives themselves. They're particularly valuable in creative contexts where users might benefit from exploring unexpected directions.
The key challenge for these features is balancing novelty with relevance. The alternatives need to be different enough to provide new inspiration but still connected enough to the user's original intent to be useful.
Multimodal Input Features
Multimodal input features allow users to express their intent through means other than text prompts. These features recognize that text is not always the most natural or effective way for users to communicate their ideas.
Midjourney's image style feature allows users to upload an image and use it as the requested style for newly generated images, effectively translating visual concepts into prompts.
Style Galleries, implemented in various forms across platforms, allow users to select artistic styles from visual examples rather than having to know and articulate specific style terminology.
These features address the articulation barrier by allowing expression through more natural or intuitive means than requiring a large active vocabulary within specialized domains. They recognize that different users have different strengths and preferences when it comes to communication.
The effectiveness of these features depends on the accuracy of the translation between modalities. An image-to-text feature that misinterprets the key elements of an image, for example, might actually increase rather than decrease the articulation barrier.
Hybrid Interface Features
Hybrid interface features combine text-based prompting with graphical user interface elements. These features recognize that while natural language offers flexibility, graphical interfaces can be more accessible and require less cognitive effort for many tasks.
Kling’s motion brush is an example: Kling provides a GUI for selecting one or more objects in the image to animate and for indicating how those objects should move in the video. The remainder of the prompt (beyond where the objects move) is still specified by text.

Kling’s motion brush feature. I select the bird and specify in which direction I want it to fly. Much easier to do graphically than by using words.
GrammarlyGo’s tone selection allows users to select tone through emoji buttons rather than having to describe the desired tone in text.
These features reduce the cognitive load associated with pure text-based interaction. They leverage the principle of recognition over recall, making it easier for users to select from presented options rather than having to generate descriptions from memory.
The challenge for these features is to maintain the flexibility and power of natural language while incorporating the accessibility of graphical interfaces. The most effective implementations combine the strengths of both approaches without sacrificing the benefits of either.
Prompt Component Libraries
Prompt component libraries provide pre-built components that users can incorporate into their prompts. These features recognize that many prompting tasks involve common elements that can be standardized and reused.
The Prompt Builders from Imagen and Kling shown above are examples of this category.
These features address the articulation barrier by providing expert-crafted prompt components that users can assemble rather than having to create everything from scratch. They're particularly valuable for specialized domains where effective prompting requires domain-specific knowledge or terminology.
The effectiveness of these features depends on the quality and relevance of the components provided. A library of outdated or overly generic components might not significantly reduce the articulation barrier for users with specific needs.
Does Prompt Augmentation Save the Day?
Probably not. The articulation barrier remains, but users who embrace prompt augmentation will make better use of AI. All we can do is lower the barrier, and the better the prompt augmentation features we invent, the lower the articulation barrier becomes.
Prompt augmentation features are just that: features in the UI of an AI. This means that they are subject to all the standard usability problems inherent in adding features to a design: discoverability, learnability, and efficiency of use. For now, most AI systems have sufficiently few prompt augmentation features that they can be displayed prominently in the UI, offering good discoverability. (A discoverability downside is that these features are currently not standard, and thus not expected: users won’t know to look for them.)
It almost goes without saying, at least in my newsletter, but new prompt augmentation features should be subjected to the same user-centered design lifecycle, including user testing, as any other design element you consider adding to your UI.
In general, I believe prompt augmentation has a strong future as a part of the AI user experience. There is room for many new such features to be invented, which will further empower users in getting what they want from AI.