AI art is everywhere right now. Even experts don’t know what it will mean

Last month, an art prize at the Colorado State Fair was awarded to a work that – unknown to the judges – was created by an artificial intelligence (AI) system.

Social media has also seen an explosion of bizarre AI-generated images made from text descriptions, such as “the face of a Shiba Inu blended into a loaf of bread on a kitchen table, digital art.”

Or perhaps “a sea otter in the style of Girl with a Pearl Earring by Johannes Vermeer”:

A sea otter in the style of ‘Girl with a Pearl Earring’ by Johannes Vermeer.
OpenAI

You may be wondering what is going on here. As someone who researches the creative collaboration between humans and artificial intelligence, I can tell you that behind the headlines and memes there is a fundamental revolution underway – with profound social, artistic, economic and technological implications.

How we got here

You could say that this revolution began in June 2020, when a company called OpenAI achieved a major breakthrough in artificial intelligence by creating GPT-3, a system that can process and generate language in much more complex ways than earlier efforts. You can have conversations with it on any topic, ask it to write a research essay or a story, summarize text, write a joke, and do almost any language task imaginable.
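For the technically curious, here is a minimal sketch of what asking GPT-3 to perform a language task looked like in code, using OpenAI’s Python client as it worked at the time; the model name, prompt and parameters are purely illustrative, and a real API key is required.

```python
import openai

openai.api_key = "sk-..."  # placeholder; a real OpenAI API key goes here

# Ask GPT-3 to perform an arbitrary language task via a text prompt.
response = openai.Completion.create(
    model="text-davinci-002",  # a GPT-3 model available at the time
    prompt="Summarize the plot of Hamlet in two sentences.",
    max_tokens=100,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```

The same call, with nothing changed but the prompt, can produce a joke, a story, or a research-essay outline, which is what makes the system so general.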



Read more:
Robots are creating images and telling jokes. 5 things to know about foundation models and the next generation of AI


In 2021, some of GPT-3’s developers turned their attention to images. They trained a model on billions of pairs of images and text descriptions, then used it to generate new images from new descriptions. They called this system DALL-E, and in July 2022 they released a much-improved new version, DALL-E 2.

Like GPT-3, DALL-E 2 was a major breakthrough. It can create highly detailed images from free-form text input, including information about style and other abstract concepts.

For example, here I asked it to illustrate the phrase “Mind in Bloom”, combining the styles of Salvador Dalí, Henri Matisse and Brett Whiteley.

Image created by DALL-E 2 from the prompt “Mind in Bloom”, combining the styles of Salvador Dalí, Henri Matisse and Brett Whiteley.
Rodolfo Ocampo / DALL-E 2

Competitors enter the scene

Since the launch of DALL-E 2, a few competitors have emerged. The first is the free-to-use but lower-quality DALL-E Mini (independently developed and since renamed Craiyon), which has become a popular source of meme content.

Images created by Craiyon from the prompt “Darth Vader riding a tricycle outside on a sunny day”.
Craiyon

Around the same time, a smaller company called Midjourney released a model that more closely matches DALL-E 2’s capabilities. Although still slightly less capable than DALL-E 2, Midjourney has lent itself to interesting artistic explorations. It was with Midjourney that Jason Allen produced the artwork that won the Colorado State Fair competition.

Google also has a text-to-image model, called Imagen, which supposedly produces much better results than DALL-E and the others. However, Imagen has not yet been released for wider use, so it is difficult to evaluate Google’s claims.

Images generated by Google’s text-to-image model Imagen, along with the prompts that produced them.
Google / Imagen

In July 2022, OpenAI began to capitalize on the interest in DALL-E, announcing that 1 million users would be given access on a pay-per-use basis.

However, in August 2022 a new competitor arrived: Stable Diffusion.

Stable Diffusion not only rivals DALL-E 2 in its capabilities but, crucially, it is open source. Anyone can use, adapt and modify the code as they like.
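To give a sense of just how accessible that makes it, here is a minimal sketch of generating an image locally with the open-source Hugging Face diffusers library, assuming a CUDA-capable GPU and acceptance of the model licence on the Hugging Face Hub; the prompt and output file name are illustrative.

```python
import torch
from diffusers import StableDiffusionPipeline

# Download the open-source Stable Diffusion weights and run them locally.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a sea otter in the style of Girl with a Pearl Earring").images[0]
image.save("otter.png")
```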

Already, in the weeks since the release of Stable Diffusion, people have been pushing the code to the very limits of what it can do.

Let’s take one example: people quickly realized that, because a video is a sequence of images, they could modify the Stable Diffusion code to create video from text.
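The tools people actually built are more sophisticated, but a naive sketch of the idea, under the assumptions of the earlier snippet, looks something like this: reuse the same random seed so successive frames stay visually related, nudge the prompt frame by frame, and stitch the results into a clip (here an animated GIF, purely as a stand-in for real video encoding).

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

frames = []
for i in range(16):
    # Fixing the seed keeps the composition similar from frame to frame;
    # varying the prompt supplies the "motion".
    generator = torch.Generator("cuda").manual_seed(42)
    prompt = f"a flower opening, timelapse, frame {i} of 16"
    frames.append(pipe(prompt, generator=generator).images[0])

# Stitch the PIL images into an animated GIF.
frames[0].save(
    "bloom.gif", save_all=True, append_images=frames[1:], duration=125, loop=0
)
```

Frame-to-frame coherence from this naive approach is limited, which is exactly the problem the community’s modified versions of the code set out to solve.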

Another fascinating tool built on Stable Diffusion’s code is Diffuse the Rest, which lets you draw a simple sketch, provide a text prompt, and generate an image from it. In the video below, I created a detailed image of a flower from a very rough sketch.
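Diffuse the Rest itself runs as a web demo, but the underlying idea can be approximated with the open-source img2img pipeline: start from the rough sketch and let the model “diffuse the rest” under the guidance of a text prompt. This is a hedged sketch; the input file name and parameter values are illustrative.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# A rough hand-drawn sketch (hypothetical file) as the starting point.
sketch = Image.open("rough_flower_sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a detailed painting of a flower",
    image=sketch,
    strength=0.75,        # how far the model may depart from the sketch
    guidance_scale=7.5,   # how strongly to follow the text prompt
).images[0]
result.save("flower.png")
```

The strength parameter is the interesting dial: low values stay close to your sketch, high values hand more of the decision-making over to the model.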

In a more complex example below, I’m starting to build software that lets you draw with your body, then uses Stable Diffusion to turn that into a painting or photo.

The end of creativity?

What does it mean that you can create any type of visual content, image or video, with just a few lines of text and the click of a button? What about when you can generate a movie script with GPT-3 and a movie animation with DALL-E 2?

Looking to the future, what will it mean when social media algorithms not only curate the content for your feed, but also create it? How about when this trend meets the metaverse in a few years, and virtual reality worlds are created in real time, just for you?

These are all important questions to consider.

Some speculate that, in the short term, this means human creativity and art are deeply threatened.

Perhaps in a world where anyone can create any image, graphic designers as we know them today will become redundant. However, history shows that human creativity finds a way. The electronic synthesizer didn’t kill music, and photography didn’t kill painting. Instead, they stimulated new art forms.

I think something similar will happen with AI generation. People are already experimenting with including models like Stable Diffusion as part of their creative process.

Or using DALL-E 2 to create fashion-design prototypes:

A new kind of artist is even emerging, in what some call “promptology” or “prompt engineering”. The art lies not in making the pixels by hand, but in crafting the words that prompt the computer to generate the image: a kind of AI whispering.
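As a small illustration of the craft, prompts are often built up from a subject plus stacked modifiers for medium, style and lighting; the phrasing below is a hypothetical example of the pattern, not a recipe.

```python
subject = "a lighthouse on a cliff at dusk"

# Stacking modifiers is a common prompt-engineering pattern:
modifiers = [
    "oil painting",                  # medium
    "in the style of J.M.W. Turner", # style
    "dramatic lighting",             # lighting
    "highly detailed",               # quality cue
]

prompt = ", ".join([subject] + modifiers)
print(prompt)
# a lighthouse on a cliff at dusk, oil painting, in the style of
# J.M.W. Turner, dramatic lighting, highly detailed
```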

Collaboration with artificial intelligence

The effects of AI technologies will be multidimensional: we cannot reduce them to good or bad on a single axis.

New art forms will emerge, as will new ways of creative expression. However, I think there are risks, too.



Read more:
Here’s How It Feels When Bots Come to Your Business: What GitHub’s AI Assistant Means for Programmers


We live in an attention economy that thrives on extracting screen time from users; in an economy where automation drives corporate profits but not necessarily higher wages, and where art is commodified as content; in a social context where it is increasingly difficult to distinguish the real from the fake; in sociotechnical structures that all too easily encode biases into the AI models we train. Under these circumstances, AI can easily do harm.

How can we steer these new AI technologies in a direction that benefits people? I think one way to do that is to design AI that collaborates with humans rather than replacing them.
