Image by Author | DreamStudio | Phraser | Stable Diffusion
We live in exciting times, with announcements of cutting-edge technology arriving every week. A few months ago, OpenAI released its state-of-the-art text-to-image model, DALL·E 2. Only a few people got early access to the new AI system, which can create realistic images from natural-language descriptions, and it is still closed to the public.
A few weeks later, Stability AI launched an open-source counterpart to DALL·E 2 called Stable Diffusion. This launch changed everything: people all over the internet started posting prompt results and marveling at the realistic art.
The model weights are available on Hugging Face at CompVis/stable-diffusion-v1-4, where you can also check out the source code and model card. It is open to the public under the CreativeML OpenRAIL-M license.
In this post, we will learn about Stable Diffusion and understand the need for a great prompt generator.
The Stable Diffusion model is the open-source state-of-the-art text-to-image model for creating generative art from natural-language descriptions. It uses latent diffusion: starting from random noise in a compressed latent space, it iteratively removes the noise, guided by the prompt, until the result decodes into an image that matches the description.
The model was trained on LAION-5B, a dataset of 5 billion publicly available images collected from the internet, each paired with captions and tags.
It took hundreds of high-end GPUs (Nvidia A100s) to train the model, at a cost of around $660,000. During training, the model learns to correlate words with images using CLIP (Contrastive Language–Image Pre-training).
You don’t have to train the model on your own. You can experience it for free on Hugging Face Spaces and DreamStudio, or even download the model weights and run it locally.
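Running the model locally comes down to a short script with Hugging Face's diffusers library. The sketch below is a minimal illustration, assuming diffusers and torch are installed, you have accepted the model license on Hugging Face, and a CUDA GPU is available; the exact `from_pretrained` arguments (such as an auth token) may vary with your diffusers version:

```python
def generate(prompt: str, model_id: str = "CompVis/stable-diffusion-v1-4"):
    """Render one image from a text prompt with Stable Diffusion.

    Imports are kept inside the function because torch and diffusers
    are heavy, GPU-oriented dependencies.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    # Downloads the weights on first use; half precision reduces GPU memory.
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    return pipe(prompt).images[0]  # a PIL.Image


# Usage (requires a CUDA GPU and the model license accepted on Hugging Face):
# generate("3d hobbit's world").save("hobbit.png")
```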
Hugging Face Spaces
The Hugging Face Stable Diffusion Space is simple to use: just write a short description and click the generate image button. After a few seconds, you will see four generated images related to your prompt.
Image by Author | StableDiffusion | Hugging Face Spaces
Sometimes image generation can take several minutes, or you may be put in a queue due to high demand. That is a fair trade-off for an unlimited free service, but you can always check out the official demo application, DreamStudio.
After signing up for a free DreamStudio account, you receive $2 of credit, enough for about 200 generations. It is fast, and you can play around with options such as size, Cfg scale, seed, steps, and number of images. Your generated images are always saved in your history, and you can use the API to integrate the service with your existing applications.
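As a rough sketch of what that API integration could look like, here is how a text-to-image request to Stability's REST API might be assembled. The host, engine name, and field names below are assumptions based on the public v1 API, not official DreamStudio code, so check the current documentation before relying on them:

```python
import json
import urllib.request

API_HOST = "https://api.stability.ai"   # assumed REST endpoint
ENGINE_ID = "stable-diffusion-v1-4"     # assumed engine identifier


def build_request(prompt: str, api_key: str, cfg_scale: float = 7.0,
                  steps: int = 50, seed: int = 0) -> urllib.request.Request:
    """Assemble a text-to-image request mirroring DreamStudio's options
    (cfg scale, steps, seed). Field names follow the assumed v1 schema."""
    payload = {
        "text_prompts": [{"text": prompt}],
        "cfg_scale": cfg_scale,
        "steps": steps,
        "seed": seed,
        "samples": 1,
    }
    return urllib.request.Request(
        f"{API_HOST}/v1/generation/{ENGINE_ID}/text-to-image",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Sending the request (urllib.request.urlopen) returns JSON with the
# generated image data; that step needs a real API key from DreamStudio.
```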
As you can see, it took only a few seconds to generate a completely new image from the prompt.
DreamStudio | Author
Here is another example. I am a big fan of The Lord of the Rings and hobbits, so I thought, why not generate a 3D-rendered image?
DreamStudio | Prompt: 3d hobbit’s world
You can add the style or even the platform name in the prompt. There are so many things you can try to generate a specific image. You can even write a long prompt describing all the details.
DreamStudio | Prompt: A dream of a red hair girl on artstation HQ
But how do you create detailed, high-quality images like the ones shown below? The real artists now are the prompters, who imagine new characters and new worlds and use carefully chosen keywords to produce realistic generative art.
Image from nearcyan | hypothetical Marvel supervillains
If you want to become a well-known AI artist, work on your imagination and write creative prompts. It also helps to have tools that guide you and let you explore various styles, textures, colors, content types, feelings, and eras.
Phraser is an excellent prompt generator. Instead of guessing at different words, you select options from a series of sections, such as style and content type.
It first asks you to select a neural network, such as DALL·E 2, Midjourney, or Stable Diffusion, and then a content type, description, style, color, texture, resolution, camera settings, feeling, and era. Once you have made your selections, Phraser produces a prompt for you.
Image by Author | Steps taken to get prompt
You can either copy and paste the prompt into Hugging Face Spaces or connect DreamStudio through its API. Connecting is easy, and a guide to setting up the API is provided at the end of the prompt.
Image from Phraser | API key guide
The significant advantage of connecting the API is that you can see Stable Diffusion results directly inside the Phraser web application, saving you the time of copying and pasting prompts.
Prompt Generated using Phraser
We are entering a new era of generative art, and every week the community brings out a new variation of the diffusion model. For example, nateraw/stable-diffusion-videos generates videos by interpolating the latent space of Stable Diffusion.
“Look out for expert image and video prompters showcasing their skills on Twitter and LinkedIn.”
In this post, we learned about the Stable Diffusion model and how to use free platforms like Hugging Face Spaces and DreamStudio to create AI-generated images. We also learned about Phraser, which helps you write creative prompts and descriptions for the model.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master’s degree in Technology Management and a bachelor’s degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.