About one year ago, groundbreaking image-generating AIs hit the stage, and they hit it in style. AI image creators, once a novelty, have evolved into sophisticated tools, and they’re already making an impact on the world.
“Even for laypeople not blessed with artistic talent and without special computing know-how and computer hardware, the new model is an effective tool that enables computers to generate images on command,” said Björn Ommer from LMU Munich, whose research group developed the Stable Diffusion AI model.
Stable Diffusion is one of the ‘big’ image-generating AIs, along with Dall-E and Midjourney. It’s also the one that was specifically designed to be accessible to regular users. Ommer recently spoke at the Heidelberg Laureate Forum, in a panel where he explained how important it was to his group to make the technology available to everyone.
“What sets Stable Diffusion apart … was that our goal was right from the bat to make sure this technology works on consumer hardware for 300, 400 Euro … as opposed to the trajectory going to only big companies having these assets to run generative AI.”
The trained model is incredibly powerful: it was trained on billions of images and can synthesize new ones on demand, yet the model itself is just a few gigabytes in size. This more minimalistic approach to AI has proven remarkably effective.
“Once such AI has really understood what constitutes a car or what characteristics are typical for an artistic style, it will have apprehended precisely these salient features and should ideally be able to create further examples, just as the students in an old master’s workshop can produce work in the same style,” explains Ommer. Indeed, AI is proficient at creating images in various styles.
Not only is this AI relatively small and easy to run, but it was also released free of charge under a CreativeML Open RAIL-M license. Unsurprisingly, plenty of different applications are already emerging.
“We are excited to see what will be built with the current models as well as to see what further works will be coming out of open, collaborative research efforts,” says doctoral researcher Robin Rombach, who also worked on the algorithm.
But Stable Diffusion is far from the only image-generating AI out there.
Mixing words and images
Meanwhile, OpenAI, the organization behind ChatGPT, integrated the wordsmith AI with its own image generator, called Dall-E.
We’ve heard all about ChatGPT and how it’s so good at various tasks involving words. Well, apparently, one of the things it’s also really good at is coming up with prompts to generate images.
In the past few months, we’ve seen AI prompt writing evolve from a pastime to an art to a well-paid job. Yep, you can now be a Prompt Engineer and, according to some job offers, make up to six figures a year. It makes some sense; after all, you can tell AI to draw you an otter or a flying elephant, but if you want something specific (say, something you’d use in an ad or in a promotion), your prompt needs to be specific too. That has meant writing detailed prompts, often entire paragraphs of intricate descriptions. But OpenAI had a different idea.
OpenAI thought: “What better way is there to get into the mind of an AI than by using another AI?” So, Dall-E is now integrated into ChatGPT. Instead of trying to master “prompt engineering”, you can have ChatGPT write better prompts for you. You may have seen us at ZME Science experiment with this as well. I can confirm: you really don’t need a lot of experience to get decent results.
So, we’re at a stage where using AI to create images is not only possible, but increasingly easy.
The impact of image-making AIs: fewer jobs, more misinformation
The first impact of image-creating AI is that, well, we have an influx of elaborate images. We’re not talking about simple illustrations — AI has even won art competitions. In fact, if you’ve seen more than a few impressive images lately, there’s a good chance one of them was made by AI.
But that’s just the tip of the iceberg. The impact of AI in image creation extends beyond visual art. Its applications in education, advertising, and even mental health (through art therapy) indicate that this technology could reshape numerous aspects of our lives. But not all the impact is positive.
For starters, AIs have had a big impact on jobs, and on white-collar jobs in particular. Jobs once thought to be safe from AI, like those in creative industries, are under assault, and it’s only been a few months. We’ve been hearing the ‘AI is taking our jobs’ bit for a few years, but now it seems like it’s finally coming true.
It’s not all gloomy, though. Many creators have now incorporated AI into their workflow. The role of AI in the creative process is evolving: rather than replacing human artists, AI is increasingly seen as a collaborator, expanding the horizons of what’s possible in art and design. But there is little doubt that some jobs are starting to disappear.
Then, there’s the misinformation.
AI can already create portrait images that are more convincing than real photos. Yes, there have been studies done on this; yes, people get fooled by AI (yes, you can be fooled as well). So, it’s never been easier to create fake profiles and get away with it. Sometimes, these images feature famous people; other times, they don’t. Despite some guardrails from companies, AI is already proving to be a powerful tool for propaganda, precisely because it can create images that fool people.
The matter of copyright and intellectual property is also unclear. Several high-profile lawsuits are currently underway, with artists and photography companies suing AI companies for what they claim is an unlawful use of their work. In one related case, the U.S. Copyright Office revoked the copyright covering the images in a graphic novel and issued a new copyright covering only the text of the book, according to a report in Reuters.
A watershed moment
As AI continues to blur the lines between human and machine creativity, it raises questions about originality, intellectual property, and the future of artistic expression. The balance between leveraging AI for positive advancements and mitigating its potential misuse remains delicate and requires ongoing dialogue among technologists, artists, legal experts, and the public.
There is no doubt that this year, AI has become extremely adept at generating images. But while this can be a very useful tool, it also raises questions our society doesn’t really seem capable of addressing yet.
The journey of AI in image generation, epitomized by models like Stable Diffusion and Dall-E, is not just about the technology itself, but about how we as a society choose to integrate and regulate these powerful tools. As we move forward, it’s crucial to foster an environment where AI can be used responsibly and ethically, ensuring that it serves as a complement to human creativity, rather than a replacement.
Ultimately, the year 2023 will be remembered as a turning point in the world of digital art, thanks to the remarkable advancements in AI-driven image creation. As we explore this new landscape, we must do so with an eye toward the ethical implications, ensuring that this technology enhances rather than diminishes our human experience. The future of art, infused with AI, is in flux.