AI art should do better (why I trained Dreambooth on my last 2 years of drawings)

16 November 2022

After Stable Diffusion was made available to run locally, letting people use AI art without abusive terms and conditions, I decided to try it. Before long I fell down the rabbit hole.

Human vs AI mascot.

I’m not an artist, I’m a programmer; I only draw when I have time to do so, which for some months might be “none at all”. Sometimes I take leave from work to draw, which has the advantage of being both much more practical to organize and much less costly than traveling to another corner of the globe. As an introvert, I’m more interested in inner exploration and learning, so I try to draw resources from within to draw better, striving towards artistic creation. I’d like to reach certain milestones, and view most of my finished drawings as exercises more than anything else.

When Covid hit, many people were dealt a similar set of cards, and turned much more towards creative endeavors. This is when Cosmoose was created, as a music band emerging from 2 friends and the great IMF community. I was later appointed resident artist, in charge of the visual side.

This changed my perspective on drawing, as I had to:

illustrate single covers and album booklets
develop the lore (the ‘cosmooverse’)
create avatars for each cosmooverse persona

Instead of having art as a self-referential goal, it became a means of expression. As a result of that creative process, we developed a number of projects:

storybooks, bundled with each release
a text adventure game
a remix album involving more than a dozen musicians from IMF
a funny blogpost about Stable Diffusion, an emote pack, videos, 3D models, …

The intention is to have our very own universe, where everything is under a Creative Commons license (our mascot is named CC, and it is a coincidence), sharing everything from drawing sources to music stems. Everything we work on is a little miracle of creative interaction with a beautiful exchange of ideas. Thanks to that, this is something that we’ll be able to pass on to our children, without the fear of any corporation destroying it in the meantime, making insulting cash grabs, or building sketchy projects on top of it.

As a result, I often find myself being the bottleneck for some of our works; we have stories that need illustrations, but so far each only has a page of scribbled thumbnails showing what they should look like. We released a delightful text adventure game for which I did the pixel art, but there’s also a shmup I’d really like to make. And so much more.

Most of my drawing happens digitally (in Krita), as drawing digitally speeds up the process tremendously. I learned some Blender to make 3D models for recurring spaceships. So if there is anything that can speed up the process even more, I will gladly use it. I mentioned on Hacker News that Stable Diffusion was not really producing any decent results for cartoon drawings. A reply stated that Dreambooth was the way to go, so I tried training it:

from Stable Diffusion 1.5, with 4000 steps (Hugging Face)
from Anything v3 (which is already much better at anime style), with 3000 steps (Hugging Face)

And the results are kind of underwhelming. There isn’t anything I see as really usable or inspiring.

Cosmoose AI art.

In many cases, it’s downright insulting, and/or nightmare fuel. These were obtained by giving only the cosmoose style as the prompt.

Cosmoose AI art.

In some cases, it gives a bad version of training data, as in the following comparisons:

Cosmoose AI art comparisons.

In terms of design, it’s less inspiring than doing an image search and combining different subjects yourself. Prompts that include “octane render, masterpiece, detailed”… (you get the gist) give notably better results.

Cosmoose AI art.

Still, there isn’t much that can be used from that; I’m not interested in the AI fixing composition, color scheme or any of the big ‘artistic’ choices, since that’s the interesting part (for me, at least). In many cases the perspective is subtly broken; it is most often globally broken but locally coherent, which gives it something of a dreamlike quality but means mostly redoing the fundamentals.

The issue is also that as ‘an artist’ (in some way) I’m not interested in getting just any decent result; I’m interested in achieving a certain vision within each piece. But ‘AI art’ tends to be really exploratory: you influence the output with the prompt, but it is fundamentally an arbitrary random process¹. There are ways to make it less arbitrary (using inpainting etc.), but they take a lot of time and work; at that point you might as well ‘just draw’.

Given the current state of things, I don’t think this is a revolution for professional artists, at all; certainly not compared to how photobashing or 3D actually changed the art pipeline. The main thing an artist brings to a project is an overall vision, with a certain design language (a constraint framework). Moreover, they have to fit the precise requirements of a given project, and adapt smoothly as the requirements evolve (sometimes drastically). In my much more limited scope, I will probably use generated elements (a background planet, things like that) in a limited fashion. For instance, this song illustration’s background is mostly a composite of 3 different alien desert landscapes generated by Stable Diffusion 1.4. This is where the AI shines, and in many ways it does not interfere with my normal process. This saved quite some time (although it should still be cleaned up).

Heroic.

Even if it was a fun experiment, this is unfortunately not sufficient to replace me. It affects neither my internal motives towards ‘art’ nor the external ones (such as Cosmoose). I hope the technology improves so that I can finally direct a robot me to do the boring stuff. So please, AI overlords. Do better.

¹ That makes me think of Hilbert’s program, with which many envisioned mathematics the same way: a program would enumerate all theorems and people would just pick the good ones.