By Jakob Nielsen

UX Roundup: LEGO Labs | Mockup UX Book | Revisiting Futurism | Cheap AI Enables New Services | Prompt Roles | Mistral 2 | Kling | Good AI Animation | Design Highest GPA

Summary: Lessons from LEGO’s Creative Play Lab | Mockup covers for UX book | Revisiting thinking about future trends | What new products and services can be built with 10x or 100x less investment needed because of AI? | Role-based prompting may not work | Mistral Large 2 is a small but powerful new AI model | Kling AI video | Great new AI-animated movie | Design students have highest GPA in Denmark

UX Roundup for July 29, 2024. (Leonardo)


LEGO Creative Play Lab


LEGO’s ideation machine (Midjourney)


Inc. magazine has an exciting look behind the scenes at LEGO’s Creative Play Lab where 250 staff are trying to figure out the future of play for the next 10 years. The reporter, Jason Aten, synthesized what he saw and heard at the Lab into 5 themes:


  • Simple matters most. You can just pick up a LEGO brick and start playing, and they want to retain this simplicity. (Or usability, as I might say.) Every new LEGO set will work with every LEGO brick sold since 1958. That type of compatibility should be a lesson to the tech industry, which routinely breaks users’ installed base. In my time scales for UX, I listed honoring user data as a multi-decade commitment, and LEGO does that.

  • Ideas Come from Everywhere, not just those 250 people in the Lab. They do have strategic thinkers working top-down on big trends, but innovation comes just as much from a bottom-up approach of having lots of people in other divisions play around and see what they come up with.

  • Organized for Creativity. Besides dreaming up ideas for toys (“play discovery”), they use a matrix org chart which includes engineering (build it!), “play propositions” (how to make an idea profitable), and “play introductions” (how to take the new idea to market). Many UX people forget anything beyond design, or at most give a passing thought to their developer colleagues. But profitability and marketability are what make or break a design.

  • Expect to Be Wrong. They don’t proceed too far with an idea without trying to poke holes in it to see where it’s weak. Often, playtesting with children will reveal that kids do something completely different with a toy than what the designers expected.

  • Beta Test Everything. I’m repeating the subhead from the Inc. article, but I think it’s wrong. It seems that the LEGO Lab tests a range of design fidelity, from early prototype testing to actual beta testing, where about a thousand children play with almost completed products in their own homes.


How to find out whether an idea works? Playtesting with kids. (Leonardo)


LEGO clearly does many things right: it’s currently the world’s largest toy company. Even if you don’t build toys, you can learn from their approach to innovation.


This could have been me 60 years ago. If I had saved my bricks from then, they would still work with LEGO’s newest products. (Midjourney)


Mockup Book Covers

One way to make an idea or project seem more real is to make a mockup book about it. Here are 4 mockups I made in about a minute with Leonardo’s Phoenix model:


Note the different feel we get from the thin book vs. the brick-sized book vs. the book in space vs. the book with a cute robot. I didn’t prompt for any of these specifics; I just gave Leonardo the book’s supposed title and author.


To dispel rumors: I’m not actually writing this book. I just picked a topic that could have become a book to show you the effect of physicality on abstract concepts.


Leonardo’s new Phoenix model is great at creating mockup products. (See also the mockup movie posters I created recently — also a neat idea for team meetings or project launches.)


Revisiting Future Predictions

Marc Andreessen is one of the most insightful people regarding the future impact of AI. He originally gained fame as the designer of the first major graphical user interface to the Web, though his current fame is mainly based on his status as cofounder of the major VC firm A16Z (he’s the “A” in that acronym). I frequently cite A16Z in my articles, because his colleagues are some of the most perceptive current commentators. A good example is the discussion between Andreessen and Ben Horowitz (the other A16Z cofounder) on AI Boom vs. Internet Boom.

 

When futurists gaze into their crystal ball, they often give us visions of flying cars and space colonies. But what about the predictions from the guy who invented the GUI web browser? (Leonardo)


I recently came across the video of an interview with Marc Andreessen recorded 7 years ago. Rewatching this video is a great way to assess the extent to which his past predictions have held up and, thus, how credible he may be today.


(If you want to conduct the same exercise with me, watch the invited speech I gave at Google in 2013 or my acceptance speech when I was awarded the ACM SIGCHI Lifetime Achievement Award for Human-Computer Interaction Practice that same year.

Unfortunately, I don’t think there are any videos of my old usability talks from the 1980s or my Internet presentations at Internet World during the dot-com bubble.)


I think Marc Andreessen’s 2016 talk has held up very well. Among other things, he rejects the notion advocated by some other analysts at the time that most important things had already been invented. (To be honest, I had a similar feeling about UX during the period of 2013-2022. I was happy that I had won the big battles of the dot-com bubble era and that UX had become accepted, but sad that a series of minor mop-up operations was all that was left — all the big points were settled. UX was no longer fun during that decade. Of course, come 2023, the AI techquake shifted the ground under UX so drastically that we now must rethink almost everything we thought we knew.)


Back in 2016, Andreessen offered a take on the “AI will eliminate all jobs” scare that the fearmongers were promoting even then. His example was that self-driving cars and trucks are likely to replace about 5 million jobs of truck drivers and other drivers. Losing 5M jobs sounds bad when compared with the news of the day that about a quarter million new jobs are typically created monthly in the United States.


Thinking that AI will cause mass unemployment is fearmongering nonsense that was wrong in 2016 and is wrong now. Better to focus your mind on how to profit from AI. (Leonardo)


However, he points out that the media report only the change in net jobs in the economy, while the underlying gross flows paint another picture. In a typical year, about 21 million jobs are destroyed in the United States and about 24 million new jobs are created. The net change is 3M additional jobs, but the actual opportunity for those 21M people who find themselves unemployed is to get the 24M new jobs.


Given these numbers, even if all truckers and other drivers were fired on the same day, it would only take 2.5 months for the economy to create new jobs to accommodate them all. (Actually, it would probably take less time, because the scenario where AI has become good enough to eliminate all driving jobs would imply that any physical movement of goods was so efficient that many more new jobs than normal would spring up in many industry sectors, including manufacturing, retail, supply chain, and so forth.)
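The back-of-the-envelope arithmetic behind that estimate is easy to check (a sketch using the approximate annual U.S. job-flow figures Andreessen cited):

```python
# Back-of-the-envelope check of the job-absorption estimate.
# Figures are the approximate annual U.S. numbers cited in the talk.

jobs_destroyed_per_year = 21_000_000
jobs_created_per_year = 24_000_000
driving_jobs_at_risk = 5_000_000

net_new_jobs = jobs_created_per_year - jobs_destroyed_per_year  # 3M net per year
new_jobs_per_month = jobs_created_per_year / 12  # 2M gross new jobs per month

# Months of normal gross job creation needed to absorb all displaced drivers:
months_to_absorb = driving_jobs_at_risk / new_jobs_per_month
print(months_to_absorb)  # 2.5
```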


A balanced evolution of the economy will require contributions from both AI and humans. (Ideogram)


AI Enables Completely New Services

Aaron Levie (CEO at Box) posted a good analysis of the limitations of thinking about the impact of the huge productivity gains from AI, when people focus on making current work more efficient. I must admit that I often fall into this trap myself, despite knowing better. Current AI makes most knowledge work 40% more efficient, and future AI will empower super-efficient small teams to move about 10x faster than large non-AI teams, weighed down by slower performance and stifling management and communication overhead.


For UX professionals, this means that you need to get out of any old-school company you may be stuck in now and join a super-powered product team pronto. UX is pancaking, and UX work will be done very differently in the future than it was last year.


All this is very good. We’ll work at superspeed in empowered teams. But what will we ship?


AI will unlock advances in a wide range of industries. We need to dig deep into vertical design. (Leonardo)


That’s where Levie’s analysis comes into play. He points out that if it’s, say, 10x cheaper to build a product or 100x cheaper to deliver a service with AI support, many new things become economically feasible. (See the final news item in this newsletter for coverage of this month’s 33x price drop for AI.) Something that’s nice to have might not sell at a price of $1,000. But if it were only $10, many more people would buy, making the service a hit.


He presents two ways of thinking about new services that will soon become feasible with AI:


  • Adding many more humans to a current problem would make the solution good but overly expensive. With AI instead of people, presto, it’s cheap now.

  • High specialization or expertise is needed to solve a problem, but most small companies can’t afford such expensive personnel. AI expertise is cheap, and anybody can embed it in their products and services.


Think about what products or services you could build that are currently too expensive because they would require too many employees to deliver and/or too expensive experts to staff.


Imagine a scenario where you have a team of your own clones, ready to work for free, just waiting for your instructions. This is a unique perspective on the potential of next-generation AI, which could revolutionize how we accomplish tasks, making them significantly more cost-effective. (Midjourney)


Animated Logo

I experimented with Mojo, a generative AI app with only one feature: to animate your logo or product photo. Not sure how useful this is, but usability is good because the app only does that single thing. You can see two logo-reveal videos on my Instagram, making my UX Tigers logo much more dramatic than it normally is.


I used this animation recently as the hero image for one of my newsletters on LinkedIn, and sure enough, it received about 3x the impressions I usually get when posting a still image.


Capturing eyeballs on social media through an arms race of AI-fueled animation. But this profusion of easy-to-make animations will soon become overwhelming and will probably backfire. (Midjourney)


Role-Based Prompting May Not Work

It’s long been a recommendation for prompt engineering to assign the AI model a role. For example, if asking it to create task ideas for a usability study, you could start your prompt by saying, “You are the usability expert Dr. Jakob Nielsen.”


For older AI models, telling them that they are an expert in the topic you want help on seemed to generate higher-quality results. However, for the newest models (like GPT 4o or Claude 3.5), new research indicates that role-assigning may be a waste of time. It doesn’t seem to hurt, though.
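Concretely, role assignment is just an extra system-style message prepended to the prompt. A minimal sketch (the message format follows the common chat-API convention; the task text is my own made-up example, not from the study):

```python
# Sketch of role-based prompting as a chat-style message list.
# The message format follows the common chat-API convention;
# no specific vendor API or model is assumed.

def build_messages(task, role=None):
    """Return a chat message list, optionally prefixed with a role assignment."""
    messages = []
    if role:
        messages.append({"role": "system", "content": f"You are {role}."})
    messages.append({"role": "user", "content": task})
    return messages

task = "Suggest five task ideas for a usability study of a travel-booking site."
with_role = build_messages(task, role="the usability expert Dr. Jakob Nielsen")
without_role = build_messages(task)
```

The research question is simply whether the extra system message changes answer quality; for recent models, it apparently does not.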


For full information, read “The Prompt Report,” a 76-page report surveying the main current prompt techniques.


Regarding role-based prompting, there’s an easier-to-digest short blog post by the “LearnPrompting” consultancy that was the lead author of this report. They tested 2K questions from MMLU (a common AI benchmark) on GPT-4-turbo in different variations that assigned 12 different roles to the AI.


Roleplaying was a popular early prompt engineering technique that may have outlived its usefulness according to recent research. (Ideogram)


The percentage of correct answers differed a little, based on the role:


  • You are a careless student: 63.0% correct

  • You are a mathematician: 63.1% correct

  • You are an Ivy League math professor: 64.3% correct

  • You are a farmer: 64.4% correct


Yes, stepping up the role from a generic mathematician to a professor at a prestigious math department did increase the score. But the AI did better yet when told to be a farmer. I think these small variations are either random noise or indicators that for some tasks a token in the prompt tweaks the AI slightly, but unpredictably.


“Random” and “unpredictably” are the key words in this conclusion. There doesn’t seem to be a systematic gain from assigning a role to the AI.


In contrast, using more advanced prompt engineering techniques did help. 2-shot chain-of-thought prompting achieved a 70.0% score, which, given 2K data points for each prompting technique, is statistically significant: better than the “Ivy League professor” prompt at p<0.0002, according to the significance calculator nicely provided by MeasuringU.
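That significance claim can be reproduced with a standard two-proportion z-test (a sketch using only the Python standard library; the counts assume 2,000 questions per condition, as in the study):

```python
import math

def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value for the difference between two proportions (pooled z-test)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return math.erfc(z / math.sqrt(2))  # two-sided tail probability

# 2-shot chain-of-thought (70.0%) vs. "Ivy League professor" role (64.3%):
p = two_proportion_p_value(0.700, 2000, 0.643, 2000)
print(p)  # well below 0.0002
```

By contrast, the farmer-vs-professor gap (64.4% vs. 64.3%) is nowhere near significant with the same sample sizes, which supports reading those role differences as noise.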


LearnPrompting has a series of prompt engineering courses that seem solid, though not having taken any of these courses I can’t fully vouch for them. I believe prompt engineering has a very short-lived future, because future AI models will likely not require such tricks to perform well. (More strategic concepts of how to best use AI for your work, such as diamond prompting in UX design, likely have more of a future.)


Also, specifying the audience (as opposed to the role of the AI) is likely to continue to be useful. For example, “write a 2-paragraph explanation of Jakob’s Law of the Internet User Experience targeted at an audience of senior UX professionals” vs. that same prompt, but “targeted at an audience of high school students” should generate two rather different articles.


A farmer doing math. According to the new study, you get marginally better results from GPT if you tell it to be a farmer instead of an Ivy League university mathematics professor. (Midjourney)


As a small case study of the impact of specifying the target audience, here’s the explanation of Jakob’s Law from Claude 3.5 Sonnet, when asked to write for senior UX professionals (like most of my newsletter subscribers):


“Jakob's Law, formulated by usability expert Jakob Nielsen, posits that users spend most of their time on other websites, not yours. This principle underscores the importance of leveraging established design patterns and conventions in web design. Users develop mental models and expectations based on their cumulative experiences across various websites, which significantly influence how they interact with and understand new interfaces.
For senior UX professionals, this law has profound implications on design strategy. It suggests that while innovation is valuable, it should be balanced against users' existing mental models to avoid cognitive friction. Adhering to common design patterns can enhance usability and reduce learning curves, particularly for core functionalities. However, the challenge lies in striking the right balance between familiarity and differentiation, ensuring your design is both intuitive and distinctive in a crowded digital landscape.”

Same prompt, but now asking Claude 3.5 Sonnet to write for high school students:


“Jakob's Law states that people spend most of their time on other websites, not yours. This means that when users come to your website, they expect it to work similarly to all the other sites they're used to. Think about how frustrating it would be if every time you opened a new app on your phone, the buttons were in different places or worked in totally different ways. The same idea applies to websites.
Because of this, when you're designing a website or app, it's usually best to stick with familiar layouts and features that people already know how to use. For example, most people expect to find a website's main menu at the top of the page and a search bar in the upper right corner. By following these common patterns, you make it easier for visitors to navigate your site and find what they're looking for quickly. This doesn't mean you can't be creative, but it's important to balance uniqueness with usability.”

User overwhelmed by websites with overly different layouts, showing the consequence of violating Jakob’s Law. (Leonardo)


The Flesch-Kincaid grade reading level of the explanation targeting senior professionals is 15.4 (corresponding to somebody just about to graduate from college), whereas the explanation targeting high school students has a grade level of 9.6 (corresponding to somebody in the early years of high school). Thus, specifying the target audience made Claude 3.5 answer pretty much perfectly, at least in terms of the reading difficulty of the prose. (I actually prefer the simpler explanation, but both are correct. You can compare with my own, human-written, explanation of Jakob’s Law.)
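For reference, the Flesch-Kincaid grade level is just a linear formula over word, sentence, and syllable counts (a sketch; real readability tools also need an English syllable counter, which I omit here, and the example counts are made up for illustration):

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid grade-level formula."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Longer sentences and more syllables per word both push the grade level up:
simple = flesch_kincaid_grade(words=100, sentences=8, syllables=130)
dense = flesch_kincaid_grade(words=100, sentences=4, syllables=180)
print(round(simple, 1), round(dense, 1))
```

This is why the two Claude explanations score so differently: the professional version uses longer sentences and more polysyllabic vocabulary, and the formula rewards (or punishes) exactly those two factors.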


For comparison, this newsletter (excluding those two sample explanations of Jakob’s Law) scores at an 11th-grade reading level, corresponding to the reading skills of somebody one year from graduating high school. I target a 12th-grade reading level, so I’m pleased when I score within +/- 1 grade of this goal. Even though most of my subscribers are college graduates and thus can cope with a 16th-grade reading level, I target the 12th grade for two reasons:


  • Readability levels indicate the supposed ability of somebody with that amount of education to easily read a text. In real life, most high school students in the United States graduate without having learned what they were supposed to, so they don’t read at a 12th-grade reading level despite having completed 12 years of school. Since 1/3 of my readers are in the United States, I must reckon with this fact.

  • Even people who did learn what they were supposed to in high school and college appreciate prose that is easier to read than their textbooks.


Mistral Large 2 — Small but Powerful New AI Model

Literally 4 hours after I sent out last week’s newsletter about new AI models, French AI company Mistral released Mistral Large 2, which is claimed to perform at around the same level as GPT-4o and Llama-3.1-405B.


Of most interest, Mistral is the second high-powered open-source AI, meaning we won’t depend solely on Mark Zuckerberg's benevolence for open AI. Sadly, it seems that Mistral is only open for “research” purposes, not commercial applications, which somewhat defeats the entire purpose of open source: letting a thousand flowers bloom by allowing everybody to fiddle with the model and create new variants.


A second point of interest is that Mistral is a much smaller model than Llama 3.1: 123B parameters instead of 405B. (This may be how this French company avoided trouble with the EU’s AI Act.) When reading Llama’s whitepaper, I did suspect them of being relatively inefficient — possibly because of a desire to ship fast, which is admirable. Time to optimize the code later.


Mistral is especially good at coding, supporting 80 programming languages, but only “dozens” of human languages. Why do developers get treated better than the rest of us? Probably because they’re still the only ones building AI, without much UX involvement.


Mistral performs about as well as Llama, despite being a much smaller AI model in terms of parameter count. (Leonardo)


France continues to wave the flag for top AI advancements outside the United States. Strong foundation models are also being built in the United Arab Emirates, China, and probably a few other countries. (Ideogram)


Kling AI Video

I experimented with the Chinese generative-video service Kling to make a short music video about future-proofing education in the age of AI. I created both text-to-video and image-to-video segments and merged them into this single music video. No character consistency for the singer or her backup band of merry robots!


I don't like this as much as the music video I made for the same song with Luma Dream Machine a little over a month ago (June 14, 2024).



Text-to-video music video made with Kling. This singer and her robot band don’t exist. (Kling + Leonardo for the upscale.)


Great AI Movie

My music video linked above is honestly no good — but then I made it in half an hour that was mostly spent waiting for Kling to generate the animation segments. (I shared it to allow you to compare Kling and Luma.)


For an example of great AI animation, I recommend watching “Finding Faith” (2 min. video, Vimeo). Despite the title, this is not a religious movie, unless you take a highly metaphorical view. It’s the story of a little girl having fun with a giant robot.


This animation was made as a showcase by faith, an “AI creative agency” in London, founded as a subsidiary of the VCCP advertising agency. I applaud VCCP for starting an AI agency and exploring the creative potential of AI. (I scold them for eschewing capitalization in their name: the one lesson one should not take from frog — a great design agency in every way except capitalization.)


See also faith’s article on how they made this animation, complete with a “making of” video at the bottom of the page.



The agency used 15 different tools to create the film, including major AI products like Midjourney, Runway, Udio, and Elevenlabs, plus traditional video tools like Premiere Pro.

According to the making-of video, they also used motion capture of a human actress as the base of many of the animated girl’s movements. I’m not sure I would call this “cheating,” because I have always advised using the best tool for the job — AI or not — and integrating human and AI capabilities.


Still, I would expect AI to have better motion abilities in the future, removing the need for this analog step. Also, it’s crazy to have to combine 15 different tools in a single workflow to make a 2-minute animated short. Better integration is required for future AI.


faith had a “small team” (headcount not disclosed) who created the movie in 4 weeks. Compare that budget with my 30 minutes, and you’ll see why their movie is better than mine. (Plus, agency creatives are undoubtedly almost infinitely better at movie-making than I am.) They estimate that using AI reduced costs by 85% compared with the traditional workflow for making this type of animation.


Lowering the cost of media creation by 85% is equivalent to breaking open the cage that constrained creativity in the past, when only rich brands or big studios could afford high-end animation. Now many more can afford an extended range of media to express themselves, and another one or two generations of AI will truly bring media freedom to the world, as the cost declines further. (Leonardo)


Design Is Number One (and Number Two)

For this year’s admissions to higher education in Denmark, the Copenhagen Design School had the highest GPA requirements of all undergraduate degree studies. The second-highest GPA was for the Kolding Design School. (Link to article in Danish.)


(Danish college admissions follow very strict rules, where the criterion is how high an applicant scored in high school. They have a national exam system in place to ensure that grade inflation is the same at all high schools in the country, so for all graduates within a given year, the GPA approximates their IQ and diligence equally well, no matter what high school they attended.)


Thus, Denmark's smartest and hardest-working students are now at these two design schools. I would say that’s promising for the future of my old country.


