With my first AI-generated images, I mainly tried to find out what the AI knew and what it could do. Knowing that the different AIs have been trained on images from all over the internet gives you high expectations, but you really don't know anything yet. My traditional Northern Dutch name 'Harmanna' immediately turned out to be difficult, because it is rarely given, even in the Netherlands, and is therefore hardly known on the internet. The AI thought of everything (dishes, villages, inscriptions) except a woman's name, so I had to help a bit.
NightCafe Creator offers a number of presets that make it easy for beginners to write a good text prompt, including 'Artistic Portrait' and 'Color portrait'. The first preset contains words such as 'head and shoulders portrait, 8k resolution concept art, dynamic lighting, hyperdetailed', and the second includes 'Close-up portrait, color portrait, Linkedin profile picture, professional portrait photography'. In short: painted versus photographed. So let's try.
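For readers who like to see it spelled out, here is a minimal sketch of how such a preset might be combined with a subject. It assumes the preset words are simply appended to the subject, separated by a comma; how NightCafe actually assembles the prompt is not documented here. The names `PRESETS` and `build_prompt` are made up for this illustration, and the preset texts are only the words quoted above, not the complete presets.

```python
# Hypothetical illustration: the preset words quoted in the text above.
# The real presets may contain more words than these.
PRESETS = {
    "Artistic Portrait": (
        "head and shoulders portrait, 8k resolution concept art, "
        "dynamic lighting, hyperdetailed"
    ),
    "Color portrait": (
        "Close-up portrait, color portrait, Linkedin profile picture, "
        "professional portrait photography"
    ),
}


def build_prompt(subject: str, preset: str) -> str:
    """Join a subject and a preset's words into one text prompt.

    Assumption: subject first, then the preset words, comma-separated.
    """
    return f"{subject}, {PRESETS[preset]}"


if __name__ == "__main__":
    # Print the full prompt for each preset, using the name as the subject.
    for preset_name in PRESETS:
        print(build_prompt("Harmanna", preset_name))
```

Seen this way, the only thing that changes between the two experiments is the tail of the prompt; the subject, 'Harmanna', stays the same.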
This is the first attempt: 'Harmanna' as an Artistic Portrait in Stable Diffusion version 1.5. Keep in mind that you will see something new with every attempt, but the images will fall into the same category. Apparently this set of instructions, summarized as 'artistic portrait', strongly favors female portraits. I'm not going to show the outcome of all my tests, but if you give a Western name like 'Anne' or a typically Dutch name like 'Femke', you will only get ladies with a light skin color. So an exotic name helps to also portray people with a dark skin color, but 'light' seems to be preferred.
If you give an obvious boy's name, such as 'Peter' or 'Tom', you will get a man. A young man. With a light skin color (because of the European name). Who looks a bit sultrily into the camera. This seems partly due to the notion of 'concept art' that is built into the prompt: concept art is how an artist presents his ideas as attractively as possible to a client. Attractive always seems to mean: young adult, and if possible a little sensual.
My second attempt was a portrait of 'Harmanna' in color photography, also in Stable Diffusion version 1.5. Would there also be a preference for women here? And for white? Apparently not. The example shows a mild preference for men and a strong preference for a darker complexion. The people in the generated portraits seem to come from India and the surrounding area. In addition, professional portrait photography does not seem to have the same preference for young and sensual.
LinkedIn portraits are also one of the AI's sources for portrait photography. I don't know whether there are many or few people of color on LinkedIn. To the AI, at least, the name 'Harmanna' seems to indicate an Asian connection.
What strikes me most is that none of these ladies and gentlemen look like me. Not only in terms of appearance - which makes complete sense - but also in terms of style. I don't walk around in sensual concept-art clothing and I don't wear scarves. And when I look at the portraits others are making with the different AIs, I don't come across an image that is representative of me either. Is it partly down to the users of the AI that women are portrayed as young by default - with a tendency to dress scantily or like a princess?
In the meantime, various providers of this kind of AI have put disclaimers on their websites. They recognize that the prejudice that circulates on the internet has crept into their own products as a result.
Google built its own counterpart to DALL-E and called it Imagen. Imagen is not available to the general public, partly because of the prejudices it may have absorbed from its training data.
According to the website ( https://imagen.research.google ):
...datasets of this nature often reflect social stereotypes, oppressive viewpoints, and derogatory, or otherwise harmful, associations to marginalized identity groups. While a subset of our training data was filtered to removed noise and undesirable content, such as pornographic imagery and toxic language, we also utilized LAION-400M dataset which is known to contain a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes. Imagen relies on text encoders trained on uncurated web-scale data, and thus inherits the social biases and limitations of large language models. As such, there is a risk that Imagen has encoded harmful stereotypes and representations, which guides our decision to not release Imagen for public use without further safeguards in place.
Stability AI has also released new versions of Stable Diffusion: versions 2.0 and 2.1, in quick succession. In addition to training with even more images, they tried to make the AI pay less attention to the named artist and more to the specified techniques. They also wanted fewer spontaneously exposed body parts. Below are their own comments, via https://wandb.ai :
The release of Stable Diffusion 2.1 answers some of the criticisms the 2.0 release received, particularly on people generation.
Stable Diffusion 2.0 training used an overly aggressive NSFW [=Not Safe For Work] filter to remove adult material from the training data. While that decision itself was controversial for many reasons, the result was that humans were less represented in the training data, resulting in poorer human portrayal overall (NSFW or otherwise). That, plus the broader focus on non-human subjects in the training data, had many users disappointed.
For Stable Diffusion 2.1, the NSFW filter was toned down so that it produced fewer false positives. The filter still exists, so compared to previous Stable Diffusion versions, the more playful end of the human view may still be limited, but overall Stable Diffusion 2.1 is now better at generating people than 2.0.
Of course I also tested how Stable Diffusion v2.1 rendered 'Harmanna', both as an 'Artistic Portrait' and as a color portrait. My name is still a mystery to the AI, that much became clear quickly. I'll spare you the artistic portrait, because it looked like a plastic Barbie doll - the filtering out of NSFW material has clearly left its mark here. The photographic color portrait (pictured here) now gives as many men as women, still with a completely different cultural background from my own because of my unusual name.
The concept art behind the artistic portrait still focuses on young and seductive, but the portrait photography is multicultural and of many ages. So it's mostly a matter of writing the right text prompt to pull out positive portraits of diverse people. Personally, I am not interested in making portraits of yet another princess.
Seeing myself represented
So now for people who look like me. I tried that. My age group is difficult to represent in my favorite style, illustration, so I went for something slightly older. In addition, I am a woman who doesn't wear dresses. A few months ago Stable Diffusion version 1.5 didn't allow a negative prompt, so 'without dress' had to be reworded in positive terms.
It took some searching, but it worked. The key words for this text prompt were "tough grandma," in every sense of the word: a "resilient, robust, sturdy, resistant, rugged, solid, strong, firm, able-bodied" grandma. The results even look a bit like my mother, so I'm completely satisfied. This is what I hope to be in ten or twenty years, or thirty.