Captioning AI-Generated Images

1 min read · Feb 1, 2023

Another great find by my brother was Google’s MusicLM. It’s fantastic what they’ve been able to do with generative AI and music. You can check it out here: https://google-research.github.io/seanet/musiclm/examples/ — more on that in the future.

One example Google provides uses the description of a piece of artwork as the prompt for generating music. It made me think about what would happen if you ran a caption-generating tool on an AI-generated image. Would the caption be accurate?

What would then happen if you either 1) put that description back into an image generation tool or 2) used a tool like MusicLM to write a song?

In the first case, it would be an iterative loop that could produce very interesting outputs. Would it converge somewhere? What would that journey look like?
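Here is a rough sketch of what that loop could look like in Python. Everything in it is hypothetical: `caption_loop`, `generate_image`, and `caption_image` are names I made up, and the two toy functions stand in for a real image generator and a real captioning tool so the sketch runs on its own. The loop stops when a caption repeats, which is one simple way to define "converged."

```python
from typing import Callable, List


def caption_loop(
    prompt: str,
    generate_image: Callable[[str], object],
    caption_image: Callable[[object], str],
    max_rounds: int = 10,
) -> List[str]:
    """Alternate image generation and captioning, recording every caption."""
    history = [prompt]
    for _ in range(max_rounds):
        image = generate_image(history[-1])  # text -> image
        caption = caption_image(image)       # image -> text
        seen_before = caption in history
        history.append(caption)
        if seen_before:
            # The caption repeated: a fixed point or a cycle, so stop here.
            break
    return history


# Toy stand-ins (assumptions, not real models) so the sketch is runnable.
def toy_generate(text: str) -> str:
    # Pretend the "image" is just the lowercased prompt.
    return text.lower()


def toy_caption(image: str) -> str:
    # Pretend the captioner loses one detail (the last word) each round.
    words = image.split()
    return " ".join(words[:-1]) if len(words) > 1 else image


journey = caption_loop("A red fox in fresh snow", toy_generate, toy_caption)
for step, text in enumerate(journey):
    print(step, text)
```

With real tools you would swap in calls to whichever image generator and captioner you have access to, and the list of captions becomes a record of the journey.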

In the second case, would the music sound accurate? Would it convey the mood of the original prompt?

We have lots of fun experiments ahead of us with these new generative tools.

Written by Leor Grebler