Tips for good prompts

Gepubliceerd op 17 maart 2023 om 15:00

The best prompts

With all those options of AIs, choosing the right prompt can become quite complex. Overall, they yield the same results, but it's the details that differ. I didn't have the time to try them all myself. I searched for sources for the best prompts, and those sources had sources too... so here are a few sources that provide tips on how to create the best prompts.

Because the sources are all in English, you would get the impression that that is the only possible language, but that turns out to be incorrect. See the result here for a Dutch breakfast, written in Dutch: "Uitgebreid ontbijt, Hagelslag op volkorenbrood, pindakaas, jam, glas melk, [style charactaristics in English]by Stable Diffusion v1.5. Unfortunately, the AI doesn't know what hagelslag is, but otherwise it comes pretty close.

I have not yet tried to write the style characteristics in Dutch. Because everyone learns from each other in this field, and most of them communicate in English, I also stick to English terms here.

On to the most comprehensive guide to writing prompts.

Openart.ai promptbook

Openart.ai is a website that lets you browse a catalog of images by style, but it only has a limited number of styles in its library. What is recommendable is the https://openart.ai/promptbook, which gives a comprehensive presentation on how to get a good prompt, with lots of tips for beginners. For those who do not want to read everything, here are the highlights I've been cherrypicking, with some additions by myself.

Quality improvements

The most important trick to get a nice picture is to use words that indicate a quality improvement. Think of:

HDR (high dynamic range=more colours), UHD (ultra high definition), 64 K (resolution 61440×34560: 64K Digital Cinema). Especially with landscapes, this gives more detail and more depth to the image. Adding highly detailed does much the same for smaller-scale images such as a portrait.

If one object or person is your subject, add: 40 mm lens, shallow depth of field, close up, studio lighting. Where 'shallow depth of field' blurs the background, adding bokeh should fill in that blur more artistically.

For all cases with photography, make photograph -> professional photograph, and mention a high-end camera model such as Nikon 15mm f/1.8G in your prompt. Or write Canon lens, shot on dslr (digital single-lens reflex camera), 64 megapixels. Every genre seems to have its own standards for quality and professionalism this way.

Especially for topics related to games and the fantasy genre, it makes sense to add platform names, such as: trending on ArtStation, and for 3D or 4D models: Unreal Engine, Maya 3D, ZBrush or Blender (these are used to create 3D assets for film, television, games, and commercials). ZBrushCentral then refers to the platform on which you can show your 3D creations. In addition, you can refer to hardware and software to render creations such as Octane render (GPU render engine) or Vray. Specifically for an anime style image, choose Anime Key Visual.

Of a completely different nature is the addition of high resolution scan. This makes the photo appear historical, as if it was scanned later. Yes, instead of going for the sharpest possible image, you can also go for an outdated look for content-related or artistic reasons. For example, create an old photo by adding sepia yellow monochromatic vintage 1900s photograph after your subject.

Tricks for advanced users: if you want a black and white photo with 1 colored object, use: color splash. Choose double exposure if you want to depict a second subject in the contour of your first subject. Stable Diffusion version 1 also responds to emojis ❄️🌨️ and punctuation marks such as ((( and !!!, version 2 less so.

Prompt: "sumi-e panda eating bamboo" does not produce a nice picture and certainly not in sumi-e style (Japanese ink drawing). This is Stable Diffusion v1.5.

prompt:  "sumi-e panda eating bamboo" - Weight:1 

"detailed matte painting, deep color, fantastical, intricate detail, splash screen, complementary colors, fantasy concept art, 8k resolution trending on Artstation Unreal Engine 5" - Weight:0.9

This is the default NightCafe style, containing many quality words, running on Stable Diffusion 1.5. This already gives a nicer picture, but not yet in sumi-e style.

Structure

Overall, the structure of your text prompt is as follows:

[subject]  [in surroundings]  [medium]  [art movement]  [in style of]  [POV]  [light]  [colour]  [quality words]

Not everything needs to be filled in. [medium] can also be placed at the very front if its execution is most important. The other aspects should also be moved further forward if the impact is too low for your liking.

Examples:

  • subject = person, animal, landscape, etc. doing something, or wearing something, or being something (happy, sad, romantic), e.g. 'happy panda eats bamboo'
  • in surroundings = in his room, on the mountain, in the woods, indoor, outdoor, underwater, space
  • medium = photo, painting, chalk drawing, watercolor, clay, crochet, origami, or other choice, but be specific
  • art movement = impressionism, film noir, baroque, post-apocalyptic, minimalism, modern art, ink painting, movie poster, naïve art, pointillism, pop art, splash art, storybook illustration, street art, surrealism, Ukiyo-e ... and this is just a small selection of the possibilities
  • in the style of : by Vermeer, Van Gogh, Arthur Rackham, Hayao Miyazaki , Alphonse Mucha, Albrecht Dürer. ... this list is in principle endless but not every artist is equally well known; Studio names also work well: Studio Ghibli, Pixar, Disney
  • POV (point of view): front, overhead, side, close-up, satellite, macro, telephoto, fish-eye lens, polaroid, long exposure, Gopro, drone photo, selfie
  • light: soft, ambient, ring light, neon, at night, sun, cinematic, sun rays, sundown, sunrise
  • colour: vibrant, dark, pastel, muted, monochrome, tetradic, triadic, warm colors, cool colors, and the colors themselves of course
  • quality words: as mentioned in the paragraph above.

prompt suggested by a prompt generator:

"sumi-e panda eating bamboo, Portrait, Artstation, Painting, Beautiful, Illustration" - Weight:1

"ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft" - Weight:-3

on Stable Diffusion v1.5

The same prompt but executed on Stable Diffusion v2.1 already gives a much better impression of the sumi-e style.

Experiment

Noise

Everything affects everything. If you choose artists with a detailed style, it is no longer necessary to ask for hyperdetailed, or intricately detailed. If your subject doesn't match the artists, an AI like Stable Diffusion will fall back on a photorealistic style. It makes sense to experiment extensively with this. The style words that work well for one subject in the prompt may therefore work very differently for another subject.

Note that when changing the proportions of your image (square to landscape for example) everything can change, even with the same prompt and seed. The AI are trained on a standard size of 512x512 pixels, so square. You do get equivalent colors and composition with landscape or portrait formats.

Always keep in mind that the AI is trained with 2000 years of art. It can be very cumbersome to properly represent a modern subject in a very old medium. Sketch, for example, will draw in grayscale, and will present your subject in a modern way - think of 'car' as 'Tesla'; painting gives a colorful image, but a more old-fashioned representation of your subject - 'car' quickly becomes a 1960s model.

In addition to the text prompt, there are additional settings depending on the AI. At Stable Diffusion on NightCafe you can set the seed, which determines the noise at the beginning. With the same seed and prompt you will always get the same composition. The ratio between noise and prompt is adjustable with Overall Prompt Weight, by default 50%. You give the AI more freedom by lowering the prompt percentage, but the result can become a bit unstructured. Increasing it is also possible, but too much will look amateurish. Enabling CLIP Guidance can help improve image results for complex prompts or larger resolutions. It can also make images look more realistic. When CLIP Guidance is enabled, the AI uses K_DPM_2_ANCESTRAL as the sampling method. In addition, there are ten other methods (wow, this sounds like a topic for a whole new blog).

AIwiki.ai

On Aiwiki.ai, the current world of AI is explored and explained. Relevant here is https://aiwiki.ai/wiki/Prompt, where it explains how prompts work, similar to the explanation above. The website also provides a list of prompt generators if you need a little help or inspiration.

  • Midjourney Prompt Generator: unofficial Midjourney prompt builder, on the HuggingFace platform. It complements your subject with extra details and control commands such as --ar 3:2 (aspect ratio).
  • Phraser: assists in making strong prompts for Midjourney and DALL-E. Even if you don't log in (and don't pay), you can look at example images and the texts used, to get inspired.
  • MidJourney Prompt Helper: text-to-image prompt builder developed for Midjourney and DALL-E. It walks you through the prompt-building step by step by letting you click pictures of the styles you want. If you take out the control commands for Midjourney, it also becomes usable for Stable Diffusion.
  • Drawing Prompt Generator: helps draftsmen get ideas, even if they don't work with AI. This is only aimed at the topic of your prompt.
  • Promptomania Builder: prompt builder for various AI art generators. It works for most CLIP and VQCAN based models. Again, click on images to specify your style.
  • MidJourney Random Commands Generator: unofficial Midjourney prompt generator for complex outputs. This helps especially with writing the control commands that can be given to Midjourney.
  • Lexica.art: Lexica offers its own Stable Diffusion AI, but you can also search in the already created images. You can see the prompt for each picture, so you can learn a lot here.

prompt chosen on the basis of style pictures:

"panda eating bamboo, sumi-e by Katsushika Hokusai, by Zeng Fanzhi, Ink, Filmic, Portrait, Tonal Colors, Dark, 2.5D" - Weight:3

"ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft" - Weight:-0.3

executed on Stable Diffusion 1.5 More Ukiyo-e style than sumi-e, but that's already Japanese.

Other prompt generators

Through several smaller sources, I came across other prompt generators. Here the most interesting.

  • CLIP Interrogator and CLIP interrogator2 are both on the HuggingFace platform. Here you upload an image and the tool tries to figure out what would be a good prompt to create new images like this one.
  • OpenAI ChatGPT This AI Chatbot can also be used to design prompts for AI. Use "give the adjectives of something...", "describe something in detail..." to get some keywords. Please specify for which AI you want to use this.
  • Random AI Prompt Generator gives you random prompts, mainly targeting Midjourney.
  • Clip front works by turning the text into a CLIP embedding and then using that embedding to show more images and/or keywords.
  • ArtHub.ai has a prompt library where you can find inspiration for each style.
  • PromptBase Prompt Marketplace for DALL·E, GPT-3, Midjourney, Stable Diffusion. Here you can buy prompts, but this is not very useful, because the execution of your prompt changes if the subject changes too much.

The same prompt "panda eating bamboo, sumi-e by Katsushika Hokusai, by Zeng Fanzhi, Ink, Filmic, Portrait, Tonal Colors, Dark, 2.5D" - Weight:3

with a stronger value for the negative part in Stable Diffusion 2.1, because I know that version 2.1 needs it more than version 1.5. Not perfect pictures yet, but closer to sumi-e.

Other platforms

Also worth noting is the subReddit on Stable Diffusion. Here you will also find references to prompt generators and instructions on how to write your own prompts. The great thing about a subReddit is the active community behind it, which increases the likelihood of the content being kept up to date.

I myself stick to the NightCafe platform, where I learn a lot from my fellow users. If I find a term there that I don't know, I Google it to be able to assess whether I would like to use it myself. A picture says more than a thousand words.

In this blog I have used a panda to show the influence of the different words in the prompt. Hopefully it inspires you to create lots of new images on the platform of your choice.

"panda eating bamboo, sumi-e by Katsushika Hokusai, by Zeng Fanzhi, Ink, Filmic, Portrait, Tonal Colors, Dark, 2.5D" on Dall-e2

Reactie plaatsen

Reacties

Er zijn geen reacties geplaatst.