I love Advanced Voice Mode in ChatGPT, but the interface often leaves something to be desired. When I ask about things or places, I wish it would pull up visuals for me. I also wish it could organize my to-do list or help me plan a trip. And in long conversations, it's easy to forget the answer to an earlier question, with no way to scroll back and see it.
As part of an OpenAI hackathon, I prototyped a voice-first interface that could dynamically create visuals using Google's Nano Banana image generation model. I was amazed at the results. Every image in this mockup was generated by Nano Banana from a spoken question, guided by a heavy system prompt.
If you have a follow-up question, it gets added to the flow of the conversation, so you can swipe down and reference the answer to any earlier question at any time.
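For the curious, here is a rough sketch of how that conversation flow could be modeled. This is not the prototype's actual code: `Card`, `ConversationFlow`, `image_ref`, and the sample questions are all hypothetical names I made up for illustration, and the image generation step is left out entirely. The idea is just an append-only list of question/answer cards that you can look back through.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    """One question/answer pair in the conversation flow (hypothetical shape)."""
    question: str
    answer: str
    image_ref: Optional[str] = None  # assumed pointer to a generated visual, if any


@dataclass
class ConversationFlow:
    """Append-only feed: follow-ups go at the end, earlier cards are never overwritten."""
    cards: List[Card] = field(default_factory=list)

    def add(self, question: str, answer: str, image_ref: Optional[str] = None) -> Card:
        card = Card(question, answer, image_ref)
        self.cards.append(card)
        return card

    def find(self, keyword: str) -> List[Card]:
        """Return cards whose question mentions the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [c for c in self.cards if needle in c.question.lower()]


# Placeholder content only; the answers are stand-ins, not real model output.
flow = ConversationFlow()
flow.add("What should I see in Kyoto?", "(answer text)", image_ref="kyoto.png")
flow.add("How do I get there from Osaka?", "(answer text)")
```

Because nothing is ever removed from `flow.cards`, an interface built on top of it can render every card in order and let you scroll back to any earlier answer.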