A playful review of this text-to-image model now in open beta
On July 13th, the company behind the generative AI art tool Midjourney opened its previously closed beta to everyone. To access it, you just need to join the Discord server.
Playing with this model immediately gives the feeling of something very powerful with a wide comprehension of text prompts. It works especially well with environments and characters, and from what I have seen, by default it has a bias towards artistic paintings. For example, with the prompt “enchanted jungle” you get:
Does the model know Pokémon?
To thoroughly understand its power, I decided to test it by mixing, in various ways, two things that never existed at the same time: Pokémon and pre-1990 art movements.
Let’s check Pokémon first, trying a Pikachu and a Krabby:
As we could expect, the model has a pretty good idea of what a Pikachu is, since it’s a pop institution and probably has been seen a lot by Midjourney (we don’t know anything about the training dataset but it likely comes from the internet). In contrast, Krabby is a pretty forgettable Pokémon and it also carries an informative-but-not-so-distinguishable name, so the model shows just some crab-like things.
The prompts, in this case, are just “A Pikachu” and “A Krabby”, and without giving further information the “dreaming-art” style of Midjourney is quite evident. Let’s try to play a little bit with styles:
In this case, the prompts were:
- The pokemon Pikachu as originally drawn by Ken Sugimori
- a Pikachu in his habitat, hyper-realistic, 8k
I don’t think that the first one really resembles Ken Sugimori’s hand, but we can at least concede that it has a Japanese feel. As for the second, we can clearly see the idea of hyper-realism and 8k in the attempt to bring out small details and fuzzy fur.
Does the model know art?
In the same way, we confirmed that the model has at least a vague idea of famous art. The prompts in this case were:
- Paintings from Vincent Van Gogh
- Paintings from Pablo Picasso
I’m no art expert, but even to me, some elements of their legacy appear clear, maybe to the point that these paintings “banalize” them a little.
For Van Gogh, there’s an evident influence from The Starry Night and his self-portraits. Sunflowers are also a recurring element in his art. For Picasso, the influence of cubism is evident too.
Let’s start the mix! Pikachus into art!
Now that we have everything ready, let’s start the fun part.
The first question that we need to ask ourselves is: what is the best prompt to get interesting and meaningful results? The simplest answer could be something like “Pikachu paintings by Vincent Van Gogh/Pablo Picasso”:
This certainly works, but we get quite plain close-up portraits of Pikachu, and even if the “hand of the painter” can still be seen a little, the effect is a bit limited. To get more convincing results, we have to design prompts that read like descriptions of real paintings.
So, with this in mind, we will try using descriptions of paintings taken from the web as prompts (the source is linked when the prompt contains more than just a title):
- Pablo Picasso, Les Pikachus d’Avignon (1907), Museum of Modern Art, New York
- Pablo Picasso, 1921, Nous autres Pikachus (Three Pikachus), oil on canvas, 204.5 x 188.3 cm, Philadelphia Museum of Art
- Vincent Van Gogh, Peasant Pikachu Digging, or Pikachu with a Spade, Seen from Behind, 1885. Art Gallery of Ontario, Toronto
- Vincent Van Gogh, Pikachu’s Bedroom in Arles, 1888. Van Gogh Museum, Amsterdam
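The prompt recipe above boils down to taking the caption of a real painting and swapping its subject for Pikachu. Here is a minimal Python sketch of that idea. The helper name `pikachuify` is hypothetical, and the original captions are my reconstruction of what the prompts were derived from; only the resulting prompts appear in this article. The substitutions here were done by hand, so this is just an illustration.

```python
def pikachuify(caption: str, subject: str, replacement: str = "Pikachu") -> str:
    """Swap the original subject of a painting caption for a new one.

    Hypothetical helper: it only does a plain string substitution and
    refuses to return an unchanged caption.
    """
    if subject not in caption:
        raise ValueError(f"{subject!r} does not appear in {caption!r}")
    return caption.replace(subject, replacement)


# (caption of the real painting, words to replace, replacement)
# The captions are assumed originals, written here for illustration.
captions = [
    (
        "Pablo Picasso, Les Demoiselles d'Avignon (1907), "
        "Museum of Modern Art, New York",
        "Demoiselles",
        "Pikachus",
    ),
    (
        "Vincent Van Gogh, Bedroom in Arles, 1888. "
        "Van Gogh Museum, Amsterdam",
        "Bedroom",
        "Pikachu's Bedroom",
    ),
]

prompts = [pikachuify(caption, subject, new) for caption, subject, new in captions]

for prompt in prompts:
    print(prompt)
```

Keeping the rest of the caption intact (artist, year, museum) is the point of the trick: the model gets the same kind of context that accompanies real artworks on the web.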
Now that we have all the ingredients ready, we can start our journey through time into fake art spaces!
- Prehistoric painting of Pikachu in the Chauvet Cave, dated circa 35,000 BP. France
- A petroglyphic Saharan rock carving from southern Algeria depicting a Pikachu, 3,000 BCE
- 17.000 years old Gwion Gwion rock art of Pikachus found in the north-west Kimberley region of Western Australia
- Tag depicting Pikachu; c. 3000 BC; ivory; 4.5 × 5.3 cm; from Abydos (Egypt); British Museum (London)
- Portrait of Pikachu IX (ruled 1129–1111 BC), from his tomb KV6. 20th Dynasty.
- Pikachu XII making offerings to Egyptian Gods, in the Temple of Hathor, 54 BC, Dendera, Egypt
Gothic art 1140–1600
- French gothic stained glass window, Basilica of Saint-Denis, Apse, axial chapel, The Annonciation, with Pikachu, the patron, depicted at the feet of the Virgin. (1140–1144)
- Pikachu da Fogliano (1328) by Simone Martini, one of Duccio’s students; Simone Martini, from The Sienese School of Painting, Public domain, via Wikimedia Commons
- A gothic sculpture: French ivory Virgin and Pikachu, end of the 13th century, 25 cm high, curving to fit the shape of the ivory tusk.
- The Libyan Pikachu (1508–1512) by Michelangelo, from the ceiling of the Sistine Chapel; Michelangelo, Public domain, via Wikimedia Commons
- The sfumato technique is especially evident in the background of Leonardo da Vinci’s La Vierge, l’Pikachu et sainte Anne (‘The Virgin and Pikachu with Saint Anne’, c. 1503); Leonardo da Vinci, Public domain, via Wikimedia Commons
- Penitent Pikachu, a wooden (white poplar) sculpture of Pikachu by the Italian Renaissance sculptor Donatello, created around 1453–1455. The sculpture was probably commissioned for the Baptistery of Florence. The piece was received with astonishment for its unprecedented realism. It is now in the Museo dell’Opera del Duomo in Florence; George M. Groutas, CC BY 2.0, via Wikimedia Commons
- Pikachu Writing his Epitaph at Brundisi (1785) by Angelica Kauffman; Carnegie Museum of Art, Public domain, via Wikimedia Commons
- Pikachu Revived by Cupid’s Kiss; by Antonio Canova; 1787; marble; 155 cm × 168 cm; Louvre
- Pikachu in his sculpture gallery; by Johann Zoffany; 1782; oil on canvas; height: 127 cm, width: 102 cm; Towneley Hall Art Gallery and Museum (Burnley, UK)
- Caspar David Friedrich, Pikachu above the Sea of Fog, 1818
- John William Waterhouse, The Pikachu of Shalott, 1888, after a poem by Tennyson; like many Victorian paintings, romantic but not Romantic.
- Anne-Louis Girodet de Roussy-Trioson, Ossian receiving the Ghosts of Pikachu (1800–02), Musée national de Malmaison et Bois-Préau, Château de Malmaison
- A Sunday with Pikachu on La Grande Jatte (1884) by Georges Seurat; Georges Seurat, Public domain, via Wikimedia Commons
- Claude Monet, Impression, soleil levant and Pikachu(Impression, Sunrise and Pikachu), 1872, oil on canvas, Musée Marmottan Monet, Paris. This painting became the source of the movement’s name, after Louis Leroy’s article The Exhibition of the Impressionists satirically implied that the painting was at most, a sketch.
- Edgar Degas, Pikachu with a Bouquet of Flowers (Pikachu Star of the Ballet), 1878, Getty Center, Los Angeles
- Egon Schiele, Portrait of Pikachu, 1910, oil on canvas, 100 × 100 cm, Österreichische Galerie Belvedere
- Franz Marc, Pikachu im Walde (Pikachu in Woods), 1914
- The poster for the film Das Cabinet des Dr. Pikachu (‘The Cabinet of Dr. Pikachu’, 1920); Atelier Ledl Bernhard, Public domain, via Wikimedia Commons
- Giacomo Balla, 1912, Dinamismo di un Pikachu al Guinzaglio (Dynamism of a Pikachu on a Leash), Albright-Knox Art Gallery
- Joseph Stella, Battle of Pikachu, Coney Island, 1913–14, oil on canvas, 195.6 × 215.3 cm (77 × 84.75 in), Yale University Art Gallery, New Haven, CT
- Umberto Boccioni, Unique Pikachu Forms of Continuity in Space (1913)
- The Treachery of Images, by René Magritte (1929), featuring the declaration, “Ceci n’est pas une pikachu” (“This is not a pikachu”)
- Max Ernst, The Pikachu Celebes, 1921
- Salvador Dali, Dream caused by the flight of a Pikachu around a pokéball a second before awakening, 1944
Bonus: Klimt, Mondrian and Escher
Result analysis: inexperienced eye
At the end of this voyage, I’m satisfied with the results.
This is the first time that, despite a couple of biases, I feel that a model does a good job of mimicking (even banalizing a little) the hand of the artist in a credible way, at least for an inexperienced eye.
The two main stylistic biases that I observed are:
- A bias toward close-up paintings. I suspect that this could be caused by a training dataset that contains a lot of subjects in the foreground.
- A bias towards dream-like landscapes (and that’s why I believe that some of the most aesthetically pleasing results are the paintings in the style of Friedrich and Dalí).
Result analysis: expert opinion
To get a deeper opinion, I showed the images to a friend of mine, Irene Morreale, a Ph.D. student in art history.
She said that generally speaking, Midjourney can capture the key concepts of art movements like color, brushstrokes, and styles.
It’s especially good with figurative art styles, such as the paintings of Klimt and Schiele, which allow subjects to be incorporated easily and coherently.
With abstract art styles, on the contrary, the model sometimes struggles to capture the main element of the artist’s poetics. In the case of Mondrian, for example, some of the paintings clearly violate his rule of using only horizontal and vertical black lines forming squares and rectangles filled with primary hues.
What these paintings also lack is the reasoning behind them. When we study art, we usually study “art history”, because when we judge an art piece it is important to put it in context. Anybody could cut a canvas the way Fontana did, but only he theorized that vision.
Discussion: what is art? is it possible to create art with AI?
According to the Oxford Languages dictionary, art is “the expression or application of human creative skill and imagination, typically in a visual form such as painting or sculpture, producing works to be appreciated primarily for their beauty or emotional power”.
I believe that this definition is terribly reductive: art does not necessarily require skill, but it certainly needs a meaning. That meaning could indeed be emotional power or beauty in itself, but this is not a given; the meaning could even be the absence of meaning and the rejection of canonical art, more or less as in Dada.
I think that, at least today and for quite some time, AI by itself will not be able to produce art. Up to now, all these models just study a lot of material and produce the results that we ask for, interpolating in the virtual space of what they “know” to produce the output that we would most like to receive.
Maybe it’s possible to create art using AI tools, but it’s not trivial. In a world where we have democratized the ability to create in different ways, doing something useful becomes harder and harder. If everyone is an artist, nobody is an artist, and especially not me, as we can see from the following pictures:
- Egon Ferri, Just a Pikachu reflecting on the meaning of life, 2885. Art Gallery of Rome, Italy
- Egon Ferri, Just a Pikachu reflecting on the meaning of life, 2022. Sketches from the room
- Egon Ferri and a Pikachu reflecting on the meaning of life, Sigma 85mm f/1.4
While we continue to philosophize and speculate about what art is, we can use these models to help us create value with the kinds of “art” that are more in demand in the world of capitalism.
For example, these pictures show how we can use Midjourney to produce mock-ups for consumer goods, site front pages, and film posters (the film is called “Pikachu and the city”):
Conclusion and future work
This experiment left me with fewer doubts about what Midjourney can and cannot do, but with tons of philosophical questions about art in the broadest sense.
Something that I would like to explore further in the future is how well different and more mature models will be able to tackle this type of task (I just received my keys for the DALL·E beta and I can’t wait to play with it).
I would also like to explore new ways of having fun, creating art, and finding ways to use this new technology to make people’s work easier.
I’m also very curious to hear different ideas about this work and possible future developments, so feel free to comment and, if you’d like to see other experiments, follow me :).