What Can I Do With It?
I met up with a friend who I hadn’t seen in a few years and, maybe he was humouring me, but he mentioned that he hadn’t heard of ChatGPT and wanted to know what it could do. As I was hyperventilating, I remembered something that he had asked many years back. He wanted to know whether the Ubi (our voice computer) could perform a task that required multiple steps.
At the time of the original question, I had smugly walked him through all the reasons why such an ask was so difficult. You’d need to do proper voice activity detection during the request so you didn’t cut off the speaker as they hummed and hawed. Then, you’d need fantastic speech transcription. Then, you’d need entity and intent recognition on multiple entities and intents. If that worked, you’d also need to have pre-built the APIs to make the appropriate, formatted calls to services that could perform the functions. [deep breath] Then, you’d need to format the response back in natural language, quoting back the appropriate elements.
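To make that chain of steps a bit more concrete, here is a toy Python sketch of the old-style pipeline. Everything in it is hypothetical and invented for illustration; it is not how the Ubi actually worked. The names `detect_end_of_speech`, `parse_intents`, `HANDLERS` and `handle_request` are my own, the "audio" is just a list of speech/silence flags, speech transcription is skipped entirely (the sketch assumes you already have a transcript), and the "services" are stand-in functions rather than real APIs.

```python
import re
from dataclasses import dataclass


@dataclass
class Intent:
    name: str
    entities: dict


def detect_end_of_speech(frames, max_silence=3):
    """Toy voice activity detection over a list of booleans (True = speech).

    Returns the index where the request ends: the start of the first run of
    `max_silence` silent frames, so short pauses don't cut the speaker off.
    """
    silence = 0
    for i, is_speech in enumerate(frames):
        silence = 0 if is_speech else silence + 1
        if silence >= max_silence:
            return i - max_silence + 1
    return len(frames)


def parse_intents(transcript):
    """Hand-written rules standing in for entity and intent recognition."""
    intents = []
    # Split a multi-step request into clauses on "and then" / "and".
    for clause in re.split(r"\band then\b|\band\b", transcript.lower()):
        clause = clause.strip()
        timer = re.search(r"timer for (\d+) minutes?", clause)
        if timer:
            intents.append(Intent("set_timer", {"minutes": int(timer.group(1))}))
        elif "lights" in clause:
            state = re.search(r"\b(on|off)\b", clause)
            if state:
                intents.append(Intent("set_lights", {"state": state.group(1)}))
    return intents


# Stand-ins for the pre-built, correctly formatted calls to real services.
def set_timer(minutes):
    return f"a {minutes}-minute timer is set"


def set_lights(state):
    return f"the lights are {state}"


HANDLERS = {"set_timer": set_timer, "set_lights": set_lights}


def handle_request(transcript):
    """Run each recognised intent, then quote the results back in a sentence."""
    results = [HANDLERS[i.name](**i.entities) for i in parse_intents(transcript)]
    if not results:
        return "Sorry, I didn't catch that."
    return "Okay, " + " and ".join(results) + "."


print(handle_request("Turn on the lights and then set a timer for 10 minutes"))
```

Even in this toy version, every new kind of request needs another hand-written rule and another handler, which is a big part of why the ask was so difficult back then.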
Eight years later, such a request is totally possible thanks to large language models and natural language generation. We can ask complex questions and get a response back. However, since the request was for work, it would still require special permissions governing where the information was being sent.
Beyond his original request, ChatGPT seemed like a party trick. You could ask it some very difficult questions and get coherent responses. This indicates to me that the true value of the service lies deeper than the consumer applications and will be uncovered by reducing boring, repetitive tasks. While ChatGPT and other LLMs are magical, the applications that derive the most value from them will be boring… like many lucrative services.