In applying my own advice, I found that when I ask ChatGPT to create three versions of a prompt and rate each one as it produces it, version 3 always comes out on top. Interesting.
This is an almost human failing: a bias toward our own most recent work. Suppose we're asked to generate three ideas for an amazing can opener, grading each on a scale of 0 to 10 as we go. Being biased, we'd come up with one concept and give it an 8/10. Wouldn't we then try to immediately generate a better idea? And a better one after that?
This seems to be what ChatGPT did when I asked it to develop and then rate three ideas. Since it evaluated the ideas one at a time as they were being generated, it rated the first one 8/10, the second 9/10, and the third, you guessed it, 10/10. It did this time after time when evaluating how it wrote its own prompts.
The solution seems to be putting some distance between the creation of an idea and its evaluation. "Write drunk, edit sober," goes the adage. ChatGPT, it seems, needs a bit of that direction too.
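In practice, that distance can mean splitting the work into two separate requests: one that only generates ideas, and a second that sees all of them at once and rates them. Here is a minimal sketch of that two-pass pattern; the prompt wording and the placeholder for the model call are my own assumptions, not anything ChatGPT produced.

```python
# Two-pass prompting: generate first, evaluate separately, so the model
# never rates an idea mid-stream and inflates each successive score.

def build_generation_prompt(task: str, n: int = 3) -> str:
    """First pass: ask only for ideas, explicitly forbidding ratings."""
    return (
        f"Generate {n} distinct ideas for: {task}. "
        f"Number them 1 through {n}. Do not rate or rank them."
    )

def build_evaluation_prompt(ideas: list[str]) -> str:
    """Second pass: a separate request that sees all ideas at once."""
    listing = "\n".join(f"{i + 1}. {idea}" for i, idea in enumerate(ideas))
    return (
        "Rate each of the following ideas from 0 to 10, "
        "judging each one independently of the order in which it appears:\n"
        + listing
    )

# Each prompt would be sent as its own chat completion; the model that
# evaluates never sees its scores forming while it is still inventing.
```

Because the evaluator receives the full list in one shot, it has no "most recent idea" to favor, which is the distance the adage is getting at.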