Generative AI tools like Midjourney, DALL-E, or Adobe’s new Firefly suite are the biggest thing to happen in design since the personal computer. But the notion that users can generate creative masterpieces with a few keystrokes is misleading. The truth? With a few keystrokes, users can create highly detailed yet average images. Why? Because the generative models that power these tools were never built to be creative, per se. Instead, they attempt to produce the most likely (read: average) result based on their training data.
The result is that AI excels at creating images that have superficial appeal, yet ring hollow. These generations can feel almost too perfect, untouched by the idiosyncrasies of a specific time, place or artist. And while it isn’t immediately obvious, this tendency toward the average is the most significant barrier creatives must overcome to unlock deeper creativity with AI.
Prompt: photograph of a family at the beach
Garbage in, garbage out. Like their human counterparts, AI models need inspiration to create compelling work. Before hacking away at a prompt, do some research and find appropriate references to use in prompts. This could be an art movement, a city, an era, or even a kind of camera.
Prompt: medium format black and white photograph of New York city in the 1920s featuring art deco skyscrapers :: inspired by Metropolis, Blade Runner --ar 16:9
One of the biggest challenges with image generation is manifesting what’s in your mind’s eye. There’s no easy solution to this problem, but a good place to start is by building your prompt through many iterations and changing one or two words at a time. This unlocks a new level of control and reveals the subtler connections between your prompt and the image.
Prompt: film photo of futuristically dressed DJ wearing headphones and using her turntables at a rave --ar 5:8
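For anyone who scripts their prompt experiments, the one-change-at-a-time approach can be sketched in a few lines of Python. This is only an illustration: the function name `single_swap_variants` and the alternative terms are hypothetical, and the snippet only builds prompt text. It does not call any image generator.

```python
def single_swap_variants(base_prompt, swaps):
    """Return prompts that each differ from base_prompt by one replaced term."""
    variants = []
    for original, alternatives in swaps.items():
        if original not in base_prompt:
            continue  # skip terms that do not appear in the base prompt
        for alternative in alternatives:
            # Replace only the first occurrence so each variant has one change.
            variants.append(base_prompt.replace(original, alternative, 1))
    return variants


base = (
    "film photo of futuristically dressed DJ wearing headphones "
    "and using her turntables at a rave"
)

# Hypothetical alternatives, chosen only for illustration.
swaps = {
    "film photo": ["polaroid photo", "35mm photo"],
    "rave": ["rooftop party"],
}

variants = single_swap_variants(base, swaps)
for variant in variants:
    print(variant)
```

Running each variant separately and comparing the results side by side makes it easier to see which word was responsible for which visual change.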
Image models tend to be very literal in how they interpret language. As you write, consider that any word will be interpreted visually, even if it’s part of an idiom. So if you don’t want to see it, don’t say it. It also pays to learn model-specific syntax rules and parameters (special terms that control aspects of a generation).
Prompt: 3d render of a robot who is feeling sick and nauseated
It’s counterintuitive, but good prompts are both literally accurate and poetically expressive. Image models are broadly aware of cultural ideas and vocabulary. Choosing a word like ‘cinematic’ will dip into a richer part of the neural network than ‘pretty’.
Prompt: impressionist landscape painting of the towering Swiss alps rising above the plains on a foggy morning
A picture is worth a thousand words. This is true for prompts too. An image prompt that has the right ‘bones’ combined with a written prompt that carries the right concept can get a generation much closer to your desired result and save hours of frustration.
Prompt: Sci-Fi concept art illustration of spherical lens levitating in vast desert landscape :: inspired by sci fi concept art, Dune --ar 5:8 --iw .4
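The prompt above combines several pieces: a written concept, a `::` separator, and the `--ar` and `--iw` parameters. A rough Python sketch of how those pieces fit together is shown here. The helper `build_prompt` is hypothetical, the image URL is a placeholder, and putting the image link at the front of the prompt is an assumption about Midjourney's convention that should be checked against its current documentation.

```python
def build_prompt(subject, influences=(), aspect_ratio=None,
                 image_weight=None, image_url=None):
    """Assemble a Midjourney-style prompt string from its parts."""
    parts = []
    if image_url:
        # Assumption: the image prompt (a URL) goes before the written prompt.
        parts.append(image_url)

    text = subject
    if influences:
        # "::" splits the prompt into separately considered sections.
        text += " :: inspired by " + ", ".join(influences)
    parts.append(text)

    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")  # aspect ratio, e.g. 5:8
    if image_weight is not None:
        parts.append(f"--iw {image_weight}")  # weight given to the image prompt
    return " ".join(parts)


prompt = build_prompt(
    subject=("Sci-Fi concept art illustration of spherical lens "
             "levitating in vast desert landscape"),
    influences=["sci fi concept art", "Dune"],
    aspect_ratio="5:8",
    image_weight=".4",
    image_url="https://example.com/reference.png",  # placeholder URL
)
print(prompt)
```

Keeping the written concept, the references, and the parameters as separate arguments makes it easier to change one of them at a time while leaving the rest of the prompt untouched.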
An image prompt can come from anywhere, including from AI. As you work toward a concept, save generations that have traits that you like, then feed multiple ‘parents’ back into the AI as image prompts to get the best traits of both. This technique, which I call image breeding, gradually moves generations toward more novel results.
Prompt: concept art of a spherical lens levitating in a lush valley surrounded by mountains --ar 16:9
Design pros now have access to AI-powered editing tools like Generative Fill in Adobe Photoshop Beta. These fill a critical gap by allowing for highly specific changes, e.g. removing errors from or making nuanced changes to image generations.
Edited in Photoshop Beta using both AI and conventional tools
Ultimately, being more creative with AI isn’t about one tip or trick. It’s about building fluency with multiple approaches and tools to find a flow state in which AI plays a part.
While a lot of creative voices are either heralding AI or slamming it, the truth is that until we reach artificial general intelligence, these tools are powerful but far from autonomous in their creativity. And as someone who loves to create, I think that’s a good thing.
---
Aaron Tovi is a branding-focused creative director with a passion for learning and strategy. He’s currently the branding and design director at Pavone Group.