How small and midsize businesses can take advantage of text-to-image AI
Having the chance to try DALL-E 2, the new AI system from OpenAI that can create realistic images from natural language, was pretty extraordinary.
There’s no question the system is still in its infancy, but the technology is moving quickly, and improved text-to-image models are already appearing. Google Brain’s Imagen, which can generate photorealistic images of a scene from a textual description, and Meta’s Make-A-Scene, which lets users draw a freeform digital sketch to accompany a text prompt, are both promising examples.
This technology isn’t just for large tech companies, either. There are a number of ways small and midsize businesses can take advantage of text-to-image technology today.
Marketing campaigns with tools like DALL-E 2
On average, it’s recommended that small businesses spend 7–8% of their gross revenue on marketing. And yet, many are only spending 3–5%. Tools like DALL-E 2 can let entrepreneurs punch above their weight even if they don’t have the luxury of hiring talent to produce customizable, branded graphics.
As the saying goes, a picture is worth a thousand words. As DALL-E 2 and other text-to-image systems become more widely available, it makes sense that marketers will start using them for their campaigns.
Take, for example, a brand that sells fresh seafood. While it makes sense to include enticing imagery in its emails, there’s no real advantage to doing its own photo shoots over using stock photos. Subscriptions to stock photo catalogs for commercial use can easily cost hundreds of dollars a month, which is hard for smaller brands to justify.
Enter DALL-E 2. The model lets marketers combine the creativity of an in-house graphic designer with potential cost savings. Here, for example, is what DALL-E 2 produces when asked for “a photo depicting a mouth-watering salmon with lemon slices.”
While these images are indeed mouth-watering, what happens when the supply of salmon unexpectedly dips, and our offering this week shifts from salmon to cod? For small businesses that need to pivot at a moment’s notice, the ability to generate new images in 30 seconds is priceless.
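A minimal sketch of how a small team might keep its prompts reusable when the featured product changes. The template wording and the helper name `build_prompt` are illustrative assumptions, not part of DALL-E 2; the resulting string would simply be pasted into whichever text-to-image tool is in use.

```python
# Hypothetical prompt template; the wording mirrors the salmon example above.
PROMPT_TEMPLATE = "a photo depicting a mouth-watering {product} with {garnish}"


def build_prompt(product: str, garnish: str = "lemon slices") -> str:
    """Fill the template with this week's product (illustrative helper)."""
    return PROMPT_TEMPLATE.format(product=product.strip(), garnish=garnish.strip())


if __name__ == "__main__":
    # Swapping salmon for cod is a one-word change rather than a new photo shoot.
    for product in ("salmon", "cod"):
        print(build_prompt(product))
```

Keeping the product as a parameter means the rest of the prompt, which may have taken some trial and error to get right, stays fixed from week to week.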
Graphic creation by DALL-E 2
DALL-E 2 entered the popular consciousness not because of its ability to faithfully mirror the world, but because of its ability to create wonderfully stylized images.
Consider this watercolor of a panda wearing a hat. The image is playful and well-structured, and it does not exist in any stock photo catalog. This is why DALL-E 2 is exciting: it opens new opportunities, especially for small businesses.
The success of tools like DALL-E 2 may be daunting for designers out there worried about AI automating their roles, especially since some experts predict 99.9% of online content will be AI-generated by 2030. But we believe DALL-E 2 won’t replace any jobs; it will instead become a part of a marketer’s toolkit and skillset.
The output quality of DALL-E 2 varies dramatically depending on what prompts are fed into it. Generating a good prompt is a very creative process, similar to writing copy itself. If the future includes marketers in hybrid roles using these tools for inspiration, then this will increasingly become a domain skill.
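Because results depend so heavily on wording, one practical habit is to try several phrasings of the same idea and compare the outputs. The sketch below only assembles the candidate prompt strings; the function name `prompt_variants` and the style and mood lists are assumptions chosen for illustration, and no image-generation service is called.

```python
from itertools import product


def prompt_variants(subject: str, styles: list[str], moods: list[str]) -> list[str]:
    """Return one prompt per (style, mood) pair for the same subject."""
    return [f"{style} of {subject}, {mood}" for style, mood in product(styles, moods)]


if __name__ == "__main__":
    variants = prompt_variants(
        "a panda wearing a hat",
        styles=["a watercolor", "a pencil sketch"],
        moods=["playful", "minimalist"],
    )
    for prompt in variants:
        print(prompt)
```

Two styles and two moods yield four prompts; a marketer could generate an image for each and keep the wording that performs best.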
DALL-E 2: Inpainting
One of the most exciting features that DALL-E 2 offers is automatic inpainting. Think of it as having a magic wand that marketers can wave to change images however they want.
This is best seen through example. Here, DALL-E 2 was given the first image and told to insert a corgi in a specific location. The model understood the context of the image well enough to understand that it was being asked to render the dog inside a painting and was able to compose in a style that matched.
This feature is especially valuable for brands that want to show their products in different locations or unusual scenarios. Imagine placing your product in backdrops you can’t visit or seeing how George Washington would have looked while sipping a Coca-Cola.
DALL-E 2 could change the game for marketers, to the point where they are limited more by their creativity than by time and budget.
While there is reason to be excited about the future, the current technology still has limitations. It struggles to generate photorealistic people and often fails to generate coherent text. Moreover, there have been some significant criticisms of DALL-E 2 on the issue of bias. For example, when tasked with generating a “portrait of a smart person,” DALL-E Mini, a lower-cost, open-source model inspired by DALL-E, generated nine pictures of white men in formal attire.
OpenAI recently released an update aimed at mitigating some of these issues of bias, but it remains to be seen if they are truly solved. In the meantime, marketers using these tools will have to be careful to avoid unintentionally propagating biases.
So would we want DALL-E 2, in its current form, as a tool in our marketing toolbox? Absolutely, and for many purposes: to produce images that can be used as generated; to give starting points for further editing; to help brainstorm; to convey ideas to graphic artists; to run efficient A/B tests on creative; and to create fantastical, eye-catching images that today could be done only by artists. The limitations are real, but so are the opportunities.
Over the past six months, we’ve seen growing interest in breakthrough text-to-image technology. As the technology continues to move quickly, better models will continue to emerge, and the use cases for brands will only grow.
Robert Huselid and Tom Dinitz are data scientists at Klaviyo, a unified customer platform for email, SMS, and more, that empowers online brands to own their data and grow on their own terms.