Summary: Top 50 AI consumer services | Ideogram launches upgraded version 2.0 | Midjourney’s web user interface finally available to all users | AI shows huge promise for routine programming at Amazon
UX Roundup for August 24, 2024. (Ideogram)
So much hot news this week that I’m publishing an extra newsletter to catch up.
Top 50 AI Consumer Services
Olivia Moore, one of the most insightful analysts covering the AI space, has published an updated list of the top 50 AI services that get the most use by consumers on the web. She also has a list of the 50 AI mobile apps with the most users.
One of the top-10 AI services was new to me: JanitorAI. According to Perplexity, this is a service for data cleaning and preprocessing, including website scraping, where it can clean and classify scraped data before analysis. The name makes sense for this service. However, when visiting the JanitorAI site on August 21 it seemed to be a set of AI companions, so they must have made a hard pivot. (And Perplexity was reporting old data.) This new JanitorAI service is thus similar to CharacterAI, and its competitive edge is reportedly so-called NSFW conversations. JanitorAI’s current slogan is “Wow such bots.” (The images on the homepage are not naughty, so feel free to click the link, but enter the chats at your own peril.)
Other than JanitorAI, the top AI services are old favorites:
ChatGPT: the leading large language model, which is also the most-used AI tool for UX professionals.
CharacterAI: an “AI companion” service famous for virtual girlfriends and virtual boyfriends.
Perplexity: an answer engine that’s fast replacing traditional Google-style search for many users. AI is faster than search at getting users what they want.
Claude: the top ChatGPT competitor.
Suno: Song-creation, which I used to make the soundtrack for my music video on future-proofing education in the age of AI.
JanitorAI: another AI companion.
QuillBot: a tool to improve your writing.
Poe: an aggregation site that offers one-stop (and one-payment) access to all the leading AI tools. I use this a lot.
Liner: AI-generated summaries of documents, including a PDF highlighter for collecting important information. The tool aims to help users digest large amounts of information quickly and gain deeper insights into various topics.
Civitai: AI image generation.
As of mid-2024, ChatGPT retains the gold medal as the AI tool consumers use the most. The “AI companion” service CharacterAI takes the silver, and search/answer engine Perplexity wins bronze. (Ideogram)
I find it interesting that Suno is number 5, whereas Udio (another song-generator) is only number 33. Assuming the normal Zipf distribution of popularity, this means that Suno might have about 7 times as many users as Udio, even though I think the two services make equally good songs. (I used Udio to make the soundtrack for my music video about dark design.)
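The "about 7 times" estimate can be checked with a quick calculation. This is a minimal sketch, assuming the classic Zipf form where popularity is proportional to 1/rank (an exponent of 1). The function name `zipf_ratio` and the exponent parameter are illustrative, and real usage data may follow a different exponent.

```python
def zipf_ratio(rank_a: int, rank_b: int, exponent: float = 1.0) -> float:
    """Estimate how many times more popular the item at rank_a is than
    the item at rank_b, assuming popularity ~ 1 / rank**exponent."""
    if rank_a < 1 or rank_b < 1:
        raise ValueError("ranks must be positive integers")
    return (rank_b / rank_a) ** exponent


# Ranks taken from the top-50 list discussed above.
suno_rank, udio_rank = 5, 33

# Assumption: a Zipf exponent of 1, the textbook default.
ratio = zipf_ratio(suno_rank, udio_rank)
print(f"Estimated Suno-to-Udio usage ratio: {ratio:.1f}x")
```

With an exponent of 1 the ratio is simply 33/5, which rounds to 7. A steeper or flatter distribution would change the estimate, so treat the number as a rough guide rather than a measurement.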
It’s still early in the game, so the hitlist positions of Suno and Udio will likely equalize somewhat over the next year, assuming that both services upgrade their music quality at about the same pace. However, Suno’s current lead does indicate a bit of first-mover advantage. Suno launched in December 2023 (I made my first Suno song in January 2024), and Udio launched in April 2024.
Olivia Moore points out that it’s remarkable that 52% of the companies on the top-50 list are focused on content generation or editing, across modalities: image, video, music, speech, and more. Half a year ago, most of the content generation tools were image generators, and services like Civitai, Leonardo, Midjourney, and Ideogram still score well. But the list now has more companies focusing on other media forms, and image generation only accounts for 41% of the top creative services. Video is particularly big, as I have noted several times in this newsletter. (A new video service named Hotshot launched this week and will be covered in Monday’s newsletter.)
Two of my articles on this trend:
I’m excited to see that creativity is dominant in regular people’s use of AI. (Creativity is also big in corporate AI use, but enterprise use cases for AI are much broader.)
We now have a huge number of different AI services and more launch every day. It’s exciting to see how different these AI offerings are. (Ideogram)
Olivia Moore notes that the Chinese company ByteDance (best known for TikTok) now has three entries on the top-50 website list and three entries on the top-50 apps list. (One ByteDance AI service, Doubao, appears on both lists. The English version of Doubao, Cici, is also on the list. Doubao and Cici are general-purpose AI assistants.)
Moore’s final trend is that three of the new hitlist entries are from a new category: AI dating advice. Maybe there’s still hope for human contacts and the future won’t be all AI companions. With two of the new services (LooksMax and Umax), you upload photos of yourself, which the AI will rate and then give you advice on how to look better. The third service, RIZZ, makes users upload a conversation from an external dating app, after which it’ll give you advice on what to say next.
Mirror, mirror on the wall, do you have any advice for my hairstyle? (Ideogram)
Ideogram 2.0
Ideogram has long been the AI image tool with the best prompt adherence and the best text rendering. On August 21, they launched version 2.0, with an emphasis on even better text rendering and also better image quality.
In the announcement of the new release, Ideogram reports on a comparative study of their new version relative to the new Flux Pro image model and the venerable Dall-E 3. No surprise that Ideogram wins big in text rendering. I think even Ideogram 1 would have won against these two image models, though Flux Pro is probably close to the old Ideogram.
Ideogram also wins in prompt adherence and in “overall preference.” I assume that the latter was measured by showing pairs of images to study participants and asking them which one they liked best. But it would have been nice to see a bit more of the methodology revealed. Ideogram does not show a comparative measurement study of their old v.1 vs. the new v.2, which would have been a good way to quantify the progress. They also don’t show a comparison with Midjourney. There’s no doubt that Ideogram (whether v.1 or v.2) beats Midjourney big time on prompt adherence and text rendering. But I still think that Midjourney wins in image quality, and Ideogram might not have liked to show a study with that outcome.
I challenge independent researchers to conduct a study of human image preferences between the major AI image-generation models. This could even be a good project for an undergraduate student, since it’s not that methodologically difficult. The basic metrics are easy to collect. However, eliciting qualitative insights into why people prefer one image model over another would be harder and probably be a job for a graduate student.
To test the image quality of Ideogram 2, I reused the same prompt I have been using to test Midjourney’s progress: A group of 3 K-Pop idols dancing on stage in a TV production. (Compare the images I made with this prompt in Midjourney v. 6.0 and 6.1 three weeks ago.)
Photorealistic image of the 3-member K-pop group, rendered by Ideogram 2.0.
This image is decent but not nearly as good as Midjourney’s, particularly in terms of the symmetry of the 3 dancers. Ideogram does win on prompt adherence, because 95% of the images it generated (I’m only showing you the best) featured exactly 3 dancers, as specified in the prompt. (One of the 20 images I generated for these 5 tests featured 4 performers.) In contrast, Midjourney often included a different number of idols than the requested 3.
I also made the K-pop image as a color drawing. I think it did better in this style: 3 of the 4 drawings were great, and I had difficulty choosing which image to show you.
Color drawing of the 3-member K-pop group, rendered by Ideogram 2.0. (One wonders how it got the idea for a black and pink color scheme.) This is a better drawing than what Midjourney has been giving me, though Midjourney wins for the photorealistic images.
One of the new features of Ideogram 2.0 is that it offers a small number of predefined styles that will help users overcome the articulation barrier by avoiding the need for detailed stylistic keywords. Here, I tried the styles “Anime” and “3D.”
The 3-member K-pop group rendered with the built-in “Anime” style of Ideogram 2.0. This is not my favorite drawing style, but I know many people are big anime fans.
Here’s the “3D” style that’s also built into Ideogram 2.0. I guess it has a little more dimensionality to the image than the “photorealistic” version shown above. Dance synchronization in this version is good, if not perfect (compare the bent leg of the center dancer vs. the rightmost dancer), except for the finger positions. The idols’ faces are rather scary.
For a final experiment with Ideogram 2.0, I avoided the built-in “3D” style and instead prompted for a “3D animated movie” look.
3D animation version of the K-pop group, rendered with Ideogram 2.0. Finger positions are still wrong, but now the plasticky faces are not scary, since they are clearly not intended to be realistic. The uncanny valley will remain an issue for maybe another year.
Ideogram now allows users to quickly experiment with a small number of different styles, which will be helpful. Many more styles are available, but require prompting skills.
Another new v.2 feature is color palettes, both predefined and user-defined. The latter will be of interest to people generating images along brand guidelines.
The following two images were generated from the same prompt, but specifying two different built-in color palettes: “pastel” and “melon.” I have overlaid a copy of each palette on its image so that you can see those two color schemes.
AI painters made with two different color palettes. Same prompt. (Ideogram)
See also: 6-minute video with the founder of Ideogram explaining his approach to designing the product. I like his point that they are driven by millions of images created by their users. This huge data set clearly shows what users want and how they currently write prompts. However, because of the task-artifact cycle, users’ needs (and prompting techniques) will likely change with more capable AI.
(All the images in this edition of my newsletter were made with Ideogram 2.0, except for the hero images with flaming letters which I had already made with Ideogram 1.0.)
Midjourney for All
Midjourney is the best AI image-generator in terms of image beauty and support for an incredibly wide range of styles. Unfortunately, it used to require a rite of passage where new users were forced to start out using its UI on Discord, which has terrible usability. Users were only allowed to migrate to Midjourney’s much better web UI after they had generated many images on Discord.
This onboarding flow is the opposite of what you want. You want to start novice users out with a scaled-back simple UI with maximum usability and then gradually expose them to advanced features. But Midjourney required users to start with the complex UI and subsequently move to the easier UI.
Now, Midjourney has finally come to its senses and allows new users to register on its website and start using the web UI immediately. To celebrate this broader appeal to new users, Midjourney also has a short-term promotion for a free trial period. So if you’ve ever wanted to try Midjourney, now’s the time.
Midjourney’s user experience for newbie users, before vs. after. (My apologies to Midjourney for using Ideogram to make this illustration, but I had decided to use Ideogram’s new version 2 for all the images in this newsletter edition.)
AI Shows Huge Promise for Routine Programming at Amazon
Andy Jassy (CEO of Amazon) commented on Amazon’s use of AI in software development projects. He noted that AI reduced the average time to upgrade an application to Java 17 from the typical 50 developer-days to just a few hours, a productivity gain of about 100x. This one AI use case has saved Amazon 4,500 developer-years of work. Jassy doesn’t say how much Amazon pays its programmers, but if we assume that each programmer has a loaded cost of $200K per year, this equates to $900M, or almost a billion dollars. He also estimates that the revised code led to $260M in annualized efficiency gains, easily pushing the value of this one AI project past the billion-dollar mark.
(Note that whenever you want to estimate the business value of improved productivity — whether from AI or usability — you should use the fully loaded cost of an employee. The number on the paycheck is irrelevant. You must add overhead costs such as benefits, employment taxes and fees, office space, and even the cost of recruiting that person in the first place.)
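The arithmetic behind these estimates is simple enough to spell out. The sketch below uses the figures quoted above (4,500 developer-years, a $200K loaded cost, $260M in annualized gains, 50 developer-days per upgrade). Two inputs are my own assumptions for illustration and do not come from Jassy: an 8-hour developer-day, and reading "a few hours" as 4 hours.

```python
# Figures quoted in the text above.
DEVELOPER_YEARS_SAVED = 4_500
LOADED_COST_PER_YEAR = 200_000          # assumed fully loaded cost, USD
ANNUAL_EFFICIENCY_GAINS = 260_000_000   # USD per year
MANUAL_UPGRADE_DEVELOPER_DAYS = 50

# Illustrative assumptions, not from the source.
HOURS_PER_DEVELOPER_DAY = 8
AI_UPGRADE_HOURS = 4                    # one reading of "a few hours"

# Value of the developer time saved, at fully loaded cost.
labor_savings = DEVELOPER_YEARS_SAVED * LOADED_COST_PER_YEAR

# Labor savings plus one year of efficiency gains.
first_year_value = labor_savings + ANNUAL_EFFICIENCY_GAINS

# Productivity multiplier for a single application upgrade.
manual_hours = MANUAL_UPGRADE_DEVELOPER_DAYS * HOURS_PER_DEVELOPER_DAY
speedup = manual_hours / AI_UPGRADE_HOURS

print(f"Labor savings:    ${labor_savings / 1e6:,.0f}M")
print(f"First-year value: ${first_year_value / 1e6:,.0f}M")
print(f"Speedup:          {speedup:.0f}x")
```

Swap in your own organization's loaded cost and task durations to see how sensitive the business case is to each input. The speedup in particular depends heavily on how "a few hours" is interpreted.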
Human oversight is still needed, but Jassy notes that 79% of the AI-generated code shipped without any changes.
We have long known that software development is the field where current-level AI helps human professionals the most. Thus, we should not expect 100x gains when using AI for non-programming tasks. We should also not expect 100x gains for non-routine programming tasks. On the other hand, Jakob’s First Law of AI states: Today’s AI is the worst we’ll ever have. Big upgrades are expected in less than a year with the next generation of AI capabilities.
Bottom line: any business strategist should think about what tasks become profitable with huge AI-driven productivity gains. In this example, many companies would have been tempted to leave legacy code in place instead of upgrading to a modern programming language. Would Amazon really have spent $900M doing this conversion manually? But with AI, the business case is clear.
Each human programmer gains a huge productivity multiplier by using AI for routine programming tasks. (Ideogram)