Two weeks in a row, ChatGPT botched my grocery list. I thought that I had found a really solid, practical use for AI—automating one of my least favorite Sunday chores—but the bot turned out to be pretty darn bad at it. I fed it a link to a recipe for cauliflower shawarma with a spicy sauce and asked it to compile the ingredients into a list. It forgot the pita, so I forgot the pita, and then I had to use tortillas instead. The following week, I gave it a link to a taco recipe. It forgot the tortillas.
How is AI going to revolutionize the world if it can’t even revolutionize my groceries? I vented to my colleague Derek Thompson, who’s written about the technology and its potential. He told me that he’d been using ChatGPT in almost the reverse way, by offering it cocktail ingredients he already had in his pantry and asking for drink recipes. I decided to give it a go, and soon enough I was sipping a pleasant mocktail made with jalapeño and seltzer.
The AI—at least in its free iteration—was pretty bad at gathering information from a random website link in an orderly fashion, but it did a good job playing with the ingredients that I provided. It is adept at a kind of creative synthesis—picking up on associations between words and pairing them in both familiar and novel ways to delight the user. Understanding why could give us a richer sense of how to deploy generative AI moving forward—and help us avoid putting it to wrongheaded, even harmful uses.
In addition to being a dismal grocery shopper, ChatGPT has struggled in the past to do basic math. We think of computers as logical and exacting, but ChatGPT is something different: a large language model that has been trained on big chunks of the internet to create associations between words, which it then “speaks” back to you. It may have read the encyclopedia, but it is not itself an encyclopedia. The program is less concerned with things being true or false; instead, it analyzes large amounts of information and provides answers that are highly probable based on our language patterns.
Some stochasticity or randomness—what the computer scientist Stephen Wolfram calls “voodoo”—is built into the model. Rather than always generating results based on what is most likely to be accurate, which would be pretty boring and predictable by definition, ChatGPT will sometimes choose a less obvious bent, something that is associated with the prompt but statistically less likely to come up. It will tell you that the word pours finishes the idiom beginning with “When it rains, it …” But if you push it to come up with other options, it may suggest “When it rains, it drizzles” or “When it rains, it storms.” As Kathleen Creel, a professor of philosophy and computer science at Northeastern University, put it: “When you give it a prompt, it says, Okay, based on this prompt … this word is 60 percent most likely to be a good word to go next, and this word is 20 percent, and this word is 5 percent.” Sometimes that less likely option is inaccurate or problematic in some way, leading to the popular criticism that large language models are “stochastic parrots”: able to piece together words but ignorant of meaning. Any given chatbot’s randomness can be dialed up or dialed down by its creator.
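The sampling process Creel describes—and the "dial" a chatbot's creator can turn—can be sketched in a few lines of code. This is a toy illustration, not ChatGPT's actual machinery: the words and probabilities below are made up to match the "When it rains, it …" example, and the dial is the standard "temperature" parameter used in language-model sampling.

```python
import math
import random

# Toy next-token distribution for the prompt "When it rains, it ...".
# Illustrative numbers only, loosely echoing Creel's 60/20/5 example.
next_word_probs = {"pours": 0.60, "drizzles": 0.20, "storms": 0.15, "shines": 0.05}

def sample_next_word(probs, temperature=1.0, rng=random):
    """Pick the next word from a probability distribution.

    temperature < 1 sharpens the distribution (more predictable);
    temperature > 1 flattens it (more surprising). This is the
    randomness that a model's creator can dial up or down.
    """
    words = list(probs)
    # Rescale log-probabilities by temperature, then renormalize.
    logits = [math.log(probs[w]) / temperature for w in words]
    biggest = max(logits)
    weights = [math.exp(l - biggest) for l in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    return rng.choices(words, weights=weights, k=1)[0]

# At a low temperature the model almost always says "pours"; at a
# high temperature, "drizzles" and "storms" show up far more often.
for t in (0.2, 1.0, 2.0):
    draws = [sample_next_word(next_word_probs, temperature=t) for _ in range(1000)]
    print(f"temperature {t}: 'pours' chosen {draws.count('pours') / 1000:.0%} of the time")
```

Turn the temperature down and you get the boring, predictable idiom every time; turn it up and the less likely options—the drizzles and the storms, or the occasional inaccurate "stochastic parrot" moment—start to surface.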
“It’s actually not in the business of doing something exactly,” Daniel Rockmore, a professor of math and computer science at Dartmouth College, told me. “It’s really in the business of doing something that’s super likely.” That distinction is meaningful, even in the realm of routine chores: Giving me a grocery list that is likely to be right isn’t the same as giving me a grocery list that includes everything I need. But when it comes to putting together a mixed drink based on a set of given ingredients, there isn’t necessarily one right way to do things. “You can get a shitty cocktail, but you kind of can’t get a wrong cocktail,” Rockmore pointed out.
As if to test Rockmore’s theory, Axelrad, a beer garden in Houston, recently ran a special called “Humans vs. Machines,” which pitted ChatGPT recipes against those constructed by human mixologists. The bar prompted ChatGPT to design a cocktail—for example, one inspired by the Legend of Zelda video-game series—and then tested it against one made by a bartender. Patrons could try each concoction and vote for their favorite. The bar ran the competition four times, and the robots and humans ended up tied. ChatGPT’s remake of Axelrad’s signature Blackberry Bramble Jam even triumphed over the original.
Lui Fernandes, a restaurant and bar owner who runs a YouTube channel about cocktail making, has likewise been toying with the technology. He told me that ChatGPT’s recipes are “actually very, very good,” though far from flawless. When he started pushing the limits of conventional ingredients, it “spit out some crazy recipes” that he would then have to adjust. Similarly, when my editor offered ChatGPT an objectively awful list of potential ingredients—Aperol, gin, half a beer, and a sack of frozen peas—it suggested he make a “Beer-Gin Spritz” with a garnish of frozen peas for a “fun and unexpected touch.” (You can always count on editors to attempt to break your story.) ChatGPT may understand based on its training data that vegetables can sometimes work as a drink garnish, like celery in a Bloody Mary, but it couldn’t understand why peas would be an odd choice—even if the drink itself was odd, too.
“Every now and again, it’s gonna throw up something which is totally disgusting that it somehow thinks is an extension of the things we like,” Marcus du Sautoy, a mathematician and professor at the University of Oxford, told me. Other times its choices might inspire us, as in the case of the Blackberry Bramble Jam. It is also, I should say, excellent at writing original recipes for classic and familiar drinks, having read and synthesized countless cocktail recipes itself.
What we’re basically talking about here is creativity. When humans make art, they remix what they know and toy with boundaries. Cocktails are more art than anything else: There are recipes for specific drinks, but they always boil down to taste. In this simple, low-stakes context, ChatGPT’s creative synthesis can help us find an unexpected solution to a quotidian problem.
But this creativity has limits. Giorgio Franceschelli, a Ph.D. student in computer science and engineering at the University of Bologna who conducted a study on these models’ imaginative potential, argued over email that the technology is inherently restricted, because it leans on existing material. It cannot achieve transformational creativity, “where ideas currently inconceivable … are actually made true.”
Although ChatGPT may help us explore our own creativity, it also risks flattening what we produce. Creel warned about the “cultural homogeneity” of cocktail recipes produced by the bot. Similar to how recommendation algorithms have arguably homogenized popular music, chatbots could condense the cocktail scene to one that just plays the hits over and over. And because of how they were trained, AI tools may disproportionately offer the preferences of the English-speaking internet. Fernandes—a Brazilian immigrant who, in tribute to his heritage, chooses to focus on South and Latin American spirits that other bars may overlook—found that the bot struggled to balance cachaça or pisco cocktails. “It wasn’t actually able to give me as good of a recipe as when I asked it about bourbon, rye, or gin,” he said. If we’re not thoughtful about how we use AI, it could lead us toward a monoculture beyond just our bars.
Technology experts and bartenders alike told me that we should think of AI-generated cocktail recipes as a first draft, not a final product. They encouraged a feedback loop between human and bot, to work with it to home in on what you want.
And this advice extends beyond cocktails. Rockmore proposed treating the bot’s responses as “a suggestion from someone that you don’t really know but you think is smart” rather than considering the tool to be “the all-knowing master oracle that has to be followed.”
Too often, it seems, we’re turning to AI chatbots for answers, when perhaps we should be thinking of them as unreliable—but fun and well-read—collaborators. Sure, they’ve yet to save me any time when it comes to things that I need done precisely. But they do make a nice spicy margarita.