Summary: One-day online conference on using AI in UX design | Using AI to repurpose content into additional formats | Cheatsheet with AI prompts for user research | User research reports experience an AI-fueled renaissance | Worst UI ever | Flexible displays | Scaling law for physical AI | Podcast summarizing keynote | New French AI model
UX Roundup for November 25, 2024. (Ideogram)
One-Day Online Conference on Using AI In UX Design
On December 4, ADPList is hosting a thrilling one-day online conference focused on AI in product design. Useful AI is now old enough to have a conference where experienced UX practitioners share what they have done and the emerging best practices for AI in UX. 🤖🎨
Great program with industry expert speakers who’ve been there, done that, covering two pivotal angles:
Harnessing AI in the Design Process 🛠️
Design Patterns for AI-Powered User Interfaces 💡
It’s the perfect way to wrap up 2024 and gear up for an innovative 2025. Elevate your UX skills and become proficient in using AI! 🚀
Get your ticket now: only $29 (USD) for the full day. 🎫
Use AI in the UX design process and employ the emerging AI design patterns in your user interfaces. The two topics covered at the ADPList conference. (Midjourney)
One thing I love about ADPList is that they continue to help colleagues in poor countries get the same professional development as people in rich countries. The registration fee for this event is ridiculously cheap compared with any other conference I know.
It’s a few cups of coffee if you’re rich, but if you’re poor, it’s still affordable.
I used to encourage people to spend their “year-end money” on training in December, but this AI conference registration can be funded from petty cash in most companies.
AI Format Remixing: Text to Discussion and Video Presentation
One of the advanced uses of AI I encourage you to embrace is using AI to remix content into alternate media types. As an example, I took last week’s article Design Leaders Should Go “Founder Mode” and repurposed its ideas into two new formats:
A 2-minute video presentation, where I used an AI-generated avatar as the presenter.
A 5-minute bantering podcast discussion, made with NotebookLM.
For the video presentation, I generated the presenter with the new avatar service Humva. I was not happy with the avatar’s voice. The tool has options for alternate emotions, such as “excited,” but I found them too exaggerated, so I stuck with the default.
While Humva has its downsides, usability is high: you pick an avatar from a scrolling list of pre-designed options and upload the manuscript. One click (and half an hour’s wait!) produces the video. This is much easier than other AI avatar services with more features and capabilities, such as HeyGen.
Humva’s avatar doesn’t sound as good as NotebookLM, but the presentation’s content is a better reflection of my article and only requires 2 minutes of viewers’ time, as opposed to the 5 minutes appropriated by the bantering podcast. (Humva)
For the podcast, I used NotebookLM for the audio of the actual discussion of my article. I then overlaid this audio on a video track created by Kling 1.5, using image-to-video generation fed by a still image of the two podcast hosts made with Midjourney. Kling went a little overboard with an overly agitated animation of the podcast hosts.
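If you want to try the overlay step yourself, muxing an audio track onto a video track is a one-step job for ffmpeg. Here’s a minimal sketch of one way to do it (the file names are hypothetical, and this isn’t necessarily the exact workflow I used):

```python
# Minimal sketch: mux a podcast audio track onto an AI-generated video track.
# Requires ffmpeg on the system path; file names are hypothetical.
import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-i", "kling_hosts.mp4",         # video track (e.g., from Kling)
        "-i", "notebooklm_podcast.mp3",  # audio track (e.g., from NotebookLM)
        "-map", "0:v",                   # take video from the first input
        "-map", "1:a",                   # take audio from the second input
        "-c:v", "copy",                  # copy the video stream without re-encoding
        "-shortest",                     # end at the shorter of the two tracks
        "podcast_video.mp4",
    ],
    check=True,
)
```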
In this case, I think NotebookLM’s version of my article was too verbose and didn’t quite focus on the key points in the article, so I prefer the avatar version for the content. However, NotebookLM’s audio quality is clearly superior: the voices sound much more natural and engaging, similar to what you would get from professional presenters.
In contrast, Humva’s avatar doesn’t have the vocal presence of a professional TV presenter, or even the better YouTube influencers. (I know, I could have used ElevenLabs to get better vocals, but the workflow recommended by most AI creators is too convoluted for me. Usability rules.)
NotebookLM’s two irreverent AI-generated podcast hosts discussed my “Founder Mode” article in their simulated studio. (Image by Midjourney, since NotebookLM only generates the soundtrack)
With the current state of AI video, producing good lip-synch for two characters is almost impossible, so my podcast video doesn’t even try. The best workflow I’ve seen from the AI creator community requires the production of two videos (one for each host) that are lip-synched separately and then edited to form a single video, cutting between the hosts. But that doesn’t show them interacting with each other, which is one of the big points of the two-host format. And it’s too much work for me.
Cheatsheet With AI Prompts for User Research
Alita Joyce has published a useful cheatsheet with prompts for using AI in the various steps of the user research process: defining, planning, and analyzing a study, as well as prompts for refining the reporting of the findings. (PDF, 2 pages)
You’ll notice that she doesn’t cover the actual running of the study: facilitating sessions with study participants. I predict that AI will soon be able to help with research facilitation and will be able to run sessions on its own by about 2027 (when we get Ph.D.-level AI capabilities). However, for now, you’re on your own during the time you’re with the test users. (Though AI already helps with tasks like session transcripts.)
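As an example of that last point, a recorded session can be transcribed with a few lines of Python and the open-source Whisper model. This is a minimal sketch, assuming a local audio file with a hypothetical name:

```python
# Minimal sketch: transcribe a recorded usability session with Whisper.
# Install with: pip install openai-whisper
import whisper

model = whisper.load_model("base")  # small, fast model; larger ones are more accurate
result = model.transcribe("usability_session.mp3")  # hypothetical file name

print(result["text"])  # the full transcript as a single string

# Segment-level timestamps help when clipping highlight reels:
for segment in result["segments"]:
    print(f"[{segment['start']:7.1f}s] {segment['text']}")
```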
Alita Joyce is research lead for Gemini in Google Cloud, so she works with cutting-edge AI products on a daily basis, making her prompt library a particularly valuable resource. (I should disclose that Alita has been a mentee of mine since I discovered her incredible talent 6 years ago. She worked with me for several years back when I was a UX consultant, and though I was a little sad when she left for bigger things in product design, this was clearly the correct decision since old-school UX consulting no longer has much of a future now that the topic has been commodified.)
The cheatsheet is based on Alita’s presentation at the recent UserTesting THiS conference (terrible capitalization). She started her talk with an illuminating anecdote: The morning of the talk, she got a request from her stakeholders who needed immediate information about a particular design decision. (Have you seen this movie before?)
In the past, such urgent requests would have been answered based on the UX expert’s gut feelings, since there’s no time to conduct research. Now, Alita turned to her insights repository with a hundred pages’ worth of findings from her past research and had AI summarize the observations from many past studies that related to the new request. Presto: a response that she estimated would have taken 4 hours of work in the past was done in an hour with Gemini.
(A 4x productivity improvement is often the difference between doing and not doing something. Particularly in usability, slow insights mean no product impact.)
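To make the retrieval step concrete, here is a minimal sketch of semantic search over a repository of past findings. The findings, the question, and the embedding model are all illustrative; Alita’s actual setup uses Gemini over a much larger corpus:

```python
# Minimal sketch: surface past research findings relevant to a new question.
# pip install sentence-transformers; model choice and data are illustrative.
from sentence_transformers import SentenceTransformer, util

findings = [
    "Users abandoned checkout when shipping costs appeared late in the flow.",
    "Participants expected the search function in the upper-right corner.",
    "Unlabeled icons in the settings menu were frequently misinterpreted.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
corpus_embeddings = model.encode(findings, convert_to_tensor=True)

question = "Where do users expect to find search?"
query_embedding = model.encode(question, convert_to_tensor=True)

# Rank past findings by semantic similarity to the new question.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.2f}  {findings[hit['corpus_id']]}")
```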
AI identifies the exact two pages you need to answer a new request, pulling from piles of past user research findings. (Midjourney)
Research Report Renaissance
In last week’s newsletter, I reported on André Fangueiro’s list of 8 key uses of AI in UX work. One of his main recommendations was likewise to use AI to retrieve information from a research repository to answer new questions. Centralizing and analyzing past user interviews and research prevents redundant research and leverages existing insights, improving the ROI from user research. More bang for the buck = more research will be done.
While the details of the two case studies differ, the conclusion is the same: AI vastly increases the value of keeping a repository of past user research.
During my 41 years in the UX field, the pendulum has swung back and forth on the value of writing usability reports. In the early days, research was expensive and rare, so it was mandatory to document everything in an extensive report. Later, research thankfully became cheaper and more common, and we realized that stakeholders don’t have time to read 100-page usability reports. Best practices changed to emphasize short “executive summary” style reports, or even PowerPoint decks with very little analysis other than callouts on screenshots of the design that was tested.
Fat reports weren’t missed much because most people never read them. But the death of in-depth reporting also meant the death of research findings that went beyond actionable recommendations to “change this design element on this screen.” Any such insights weren’t documented (or only presented as bullet points), so they were soon forgotten and not available to influence future design decisions.
Baby and bath water, in other words. (I’m not going to illustrate this saying, because I don’t think Midjourney’s censorship AI will like a prompt specifying what’s to be done with that baby.)
As often happens, the pendulum swings back, and cosmic alignment resumes. AI now increases the value of keeping records of past research findings, because we can now find the specific points we need for new questions within the mass of old documents.
Usability reports are king again, because AI gives old research a voice to speak to new design questions: Document all your user research findings and keep the reports in a research repository. (Ideogram)
Worst UI Ever
Periodically, the same joke surfaces: design the worst possible user interface for a common task, such as adjusting the volume on your computer speakers. Apparently, somebody ran this contest again, and the 10 “winning designs” are hilariously bad. It actually takes a good designer to deliberately make something extra-bad, because you have to know why certain actions are easy and others are hard, and pick the hardest ones in a creative way.
You have to see the animation (at the above link) to really appreciate it, but my favorite is a design for entering phone numbers that resembles the classic game Tetris: a phone number field moves back and forth (left to right) at the bottom of the screen, as individual numerals drop from the top of the screen, and the user has to time the drops just right so that each numeral in his or her phone number lands in the correct slot in the phone number field.
You may think it’s a clown’s job to design user interfaces that are deliberately terrible. But doing a great job of bad design takes insight. You can also use anti-design more seriously as an ideation activity where you explore what design directions would cause your product to go bad. (Midjourney)
Flexible Displays
In the excitement about the AI revolution, it’s easy to forget older UI issues, but advances continue to occur. Most recently, Korean hardware company LG announced a flexible display that can both bend and stretch, growing by up to 50% beyond its original size.
One use case LG featured in a demo is to sew the display onto a firefighter’s uniform, allowing real-time information to be displayed to the rest of the firefighters without them having to look away from the action, which could easily be deadly in a tough situation.
Screens everywhere is a likely outcome of such super-flexible displays.
Stretchable screens open many new opportunities. (Ideogram)
Scaling Law for Physical AI
We know well the AI scaling law for traditional software-focused AI, such as large language models and image/video generation: the larger the model and the larger the training compute cluster, the better the AI will perform.
It turns out that there is a similar scaling law for AI that targets the physical world, for example by operating a robot or by simulating physics in a video game. A recent paper titled “Scaling Laws for Pre-training Agents and World Models” by Tim Pearce and colleagues from Microsoft Research reviews the evidence.
Every time we add an order of magnitude in compute (i.e., multiply the number of training FLOPs by 10), we gain a level of AI performance. For how long can this scaling persist? How good can robots get? We don’t know, but it seems realistic to expect several more generations of performance improvements, until robots at least equal humans at physical tasks and likely surpass us substantially.
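Scaling laws of this kind are typically power laws: performance improves by a constant factor for every constant multiple of compute. Here is a minimal sketch with a made-up exponent, purely to illustrate the shape of the curve (the exponent is not a number from the paper):

```python
# Minimal sketch of a power-law scaling curve: error ∝ compute^(-alpha).
# The exponent is made up for illustration, not taken from the paper.
alpha = 0.1  # hypothetical scaling exponent

for exp in range(20, 25):  # 1e20 ... 1e24 training FLOPs
    compute = 10.0 ** exp
    error = compute ** -alpha  # each 10x in compute multiplies error by 10**-alpha ≈ 0.79
    print(f"{compute:.0e} FLOPs -> relative error {error:.3f}")
```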
A scaling law applies to AI that targets the physical world, such as robots: the bigger the model and the more training compute, the better the robot performs. This doesn’t mean that the actual robot needs to be bigger, just that the AI that operates it is bigger, but a larger robot is the best I could draw to symbolize this finding. (Midjourney)
This is on top of the purely physical advantages robots already have over humans: they can be much stronger, and they can work tirelessly, with only an hour or two off per 24-hour period for battery charging. Because robots aren’t living beings, we also won’t mind much if they get damaged or even destroyed, which does wonders for military applications and dangerous jobs like mining.
Working in the coal mine: worldwide, an estimated 15,000 miners are killed on the job every year. This is a job that ought to be done by robots, since we don’t care if we lose a few. (Midjourney)
Podcast Summarizing Keynote
If you don’t have the patience to watch the entire 59-minute video of my keynote at last month’s Y Oslo conference, I made a 9-minute podcast overview, using NotebookLM.
I have been impressed with NotebookLM in the past: for example, I liked the way it summarized 9 of my articles about AI creativity or my theory of the 4 metaphors for working with AI.
This time, though, I was disappointed. For one, I used the new “Customize” feature in NotebookLM to request a 5-minute overview of the keynote, but it generated a 10-minute discussion. (I deleted the last minute, which I thought was less interesting, from the published version.) Worse, I don’t think it summarized my lecture nearly as well as it summarized my written articles.
Maybe the problem is that a keynote session is targeted at a live audience and contains many asides and jokes. This could make it harder for the AI to grasp the key message, even as I hope my presentation style makes a full hour of me palatable for the audience.
What do you think? Was this podcast summary a good way of compressing an hour-long keynote? Let me know in the comments.
The two AI-generated podcast hosts from NotebookLM are happy to discuss anything you feed them, but I feel that they do better with written source materials than when asked to summarize a live presentation. (Midjourney)
Le Chat: Mistral Scales Up
The French AI lab Mistral has released an upgrade to its frontier model. Pixtral Large now has 124B parameters, which is still small compared with the 405B weights used for Meta’s Llama 3. Despite the smaller size, Pixtral has achieved impressive scores on several AI benchmarks.
In any case, I am happy to see continued advances in global AI, because the world’s future cannot be restricted to the perspectives offered by a single country’s AI labs. The model is currently offered for free at “Le Chat.”
As an experiment, I asked Mistral to explain Jakob’s Law. Here’s what it said, which I think is quite good:
“Jakob’s Law of the Internet User Experience, formulated by Jakob Nielsen, states that users spend most of their time on other sites, and thus prefer your site to work similarly to those they already know. This principle underscores the importance of consistency and familiarity in web design. By adhering to established conventions, designers can create interfaces that are intuitive and easy to navigate, reducing the learning curve for users. For instance, placing the navigation menu at the top of the page and using a search bar in the upper-right corner are common practices that enhance usability by meeting user expectations.
“A well-implemented example of Jakob’s Law is the use of standard icons, such as a magnifying glass for search or a shopping cart for e-commerce checkout. These icons are universally recognized and immediately convey their function, making the user experience seamless. Conversely, a violation of Jakob’s Law might involve a website that places the search bar at the bottom of the page or uses non-standard icons for common actions. This deviation from norms can confuse users, leading to increased cognitive load and reduced usability, as users must spend extra time and effort to understand and navigate the site.”
Mistral now also offers image generation, based on the Flux image model:
Improved French AI. (Mistral Le Chat)
Happy Thanksgiving
Happy Thanksgiving from UX Tigers. (Midjourney)