The Wimmelbook Test

Leor Grebler
Nov 6, 2023

Generated by author using DALL-E 3

A few weeks ago, I was finally able to try OpenAI’s new DALL-E 3 image generation tool. My standardized test is a prompt that asks for a wimmelbook-style image (think Where’s Waldo?). I love these books. As a child, I could get lost in their complexity, following all the stories captured in a single God’s-eye-view snapshot.

My grandmother made tapestries depicting stories from her shtetl, and more recently my mother started painting complex scenes. My son also loves looking through books of these drawings and telling me what’s happening in each scene.

The wimmelbook test is a quick way to see whether a generative AI can synthesize the prompt and create complexity. It’s a sort of coherence test. Midjourney’s latest version, for example, seems to produce stylized images rather than the cartoonish ones that are true to the genre.

The prompt I used on DALL-E 3 was very simple: “draw a wimmelbook diagram of a factory”. However, the service seemed to refine the prompt behind the scenes, producing something much better than it would have by working directly from my wording. Its response:

“Here are the wimmelbook-style diagrams of a factory, showcasing various activities, intricate details, and hidden stories. I hope they capture the bustling and detailed nature of a factory environment in the wimmelbook style. Let me know if there’s anything else you’d like or if you have further requests!”

Midjourney stumbled with such a simple ask:

Generated by author using Midjourney

Midjourney still creates a very cool and intricate image, but given such a simple prompt, it fails to generate something that resembles a wimmelbook.

DreamStudio, based on Stable Diffusion?

Generated by author using DreamStudio

Leonardo.AI?

Generated by author using Leonardo.AI

Imagine.Art seemed to take the assignment too literally:

Generated by author using Imagine.art

Now, let’s go back to a few other DALL-E 3 images using the same test:

Images generated by author using DALL-E 3

These images are much better aligned with my intention, helped by DALL-E 3 knowing how to expand the prompt into a better result. The wimmelbook test also shows whether the details remain coherent and what types of artifacts the generator creates.

I’m looking forward to running this test on future builds of different services.
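
For anyone who wants to repeat the test, the idea is simply to send one fixed prompt to every service and keep the results side by side. Below is a minimal sketch of that loop, not how I actually ran it. It assumes each service is wrapped in a hypothetical function that takes a prompt and returns the image bytes plus any revised prompt the service reports. The wrapper names and the `Result` record are made up for illustration and do not correspond to any real client library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# The exact prompt used in this post.
PROMPT = "draw a wimmelbook diagram of a factory"

# Hypothetical wrapper shape: prompt in, (image bytes, revised prompt or None) out.
Generator = Callable[[str], Tuple[bytes, Optional[str]]]


@dataclass
class Result:
    service: str
    prompt_sent: str
    revised_prompt: Optional[str]  # None if the service does not report one
    image: bytes


def run_wimmelbook_test(
    generators: Dict[str, Generator], prompt: str = PROMPT
) -> List[Result]:
    """Send the same prompt to every generator and collect the outputs."""
    results = []
    for name, generate in generators.items():
        image, revised = generate(prompt)
        results.append(Result(name, prompt, revised, image))
    return results


# Stand-in generators so the sketch runs without any network access.
def fake_plain(prompt: str) -> Tuple[bytes, Optional[str]]:
    # Pretends to draw the prompt as given.
    return b"placeholder-image", None


def fake_expanding(prompt: str) -> Tuple[bytes, Optional[str]]:
    # Pretends to expand the prompt before drawing, as DALL-E 3 seemed to do.
    return b"placeholder-image", prompt + ", bustling, many small scenes"


if __name__ == "__main__":
    stand_ins = {"plain": fake_plain, "expanding": fake_expanding}
    for r in run_wimmelbook_test(stand_ins):
        print(r.service, "->", r.revised_prompt or "(prompt used as given)")
```

Judging whether an image actually reads as a wimmelbook is still done by eye. The loop only guarantees that every service gets the same prompt.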


Leor Grebler

Independent daily thoughts on all things future, voice technologies and AI. More at http://linkedin.com/in/grebler