You’re Probably Underestimating AI Chatbots

In the spring of 2007, I was one of four journalists chosen by Steve Jobs to review the iPhone. This was probably the most anticipated product in the history of technology. What would it be like? Was it a turning point for devices? Looking back at my review today, I’m relieved to say that it’s not an embarrassment: I recognized the generational importance of the device. But for all the praise I heaped on the iPhone, I failed to anticipate its mind-blowing side effects, like the volcanic melding of hardware, operating system, and apps, or its mesmerizing effect on our attention. (I did urge Apple to “encourage third-party developers to create new uses” for the device.) Nor did I suggest that we should expect the rise of services like Uber or TikTok, or predict that family dinners would turn into communal, screen-focused trances. Of course, my main job was to help people decide whether to spend $500, which was very expensive for a phone at the time, to buy the damn thing. But reading the review now, one might wonder why I spent time complaining about AT&T’s network or the web browser’s inability to handle Flash content. That’s like arguing over which sandals to wear just as a three-story tsunami is about to hit.

I am reminded of my lack of foresight when reading about the experiences people are having with recent AI applications, such as large language model chatbots and AI image generators. People are rightly obsessed with the impact of a sudden cavalcade of startlingly capable AI systems, though scientists often point out that these seemingly rapid advances have been decades in the making. But just as when I first used the iPhone in 2007, we risk failing to anticipate the potential trajectories of our AI-infused future by focusing too much on current versions of products like Microsoft’s Bing Chat, OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Bard.

This fallacy can be clearly seen in what has become a new and popular media genre, best described as prompt-and-pronounce. The modus operandi is to attempt some task previously limited to humans and then, often without regard to the caveats provided by the inventors, take it to the extreme. The great sportswriter Red Smith once said that writing a column is easy: You just open a vein and bleed. But would-be pundits now promote a bloodless version: Just open a browser and prompt. (Note: this newsletter was produced the old-fashioned way, by opening a vein.)

Typically, prompt-and-pronounce columns involve sitting down with one of these early systems and seeing how well it replaces something that was previously limited to the realm of the human. In a typical example, a New York Times reporter used ChatGPT to answer all of their work communications for a whole week. The Wall Street Journal’s product reviewer decided to clone their voice (hey, we did that first!) and appearance using AI to see if their algorithmic doppelgängers could trick people into mistaking the fake for the real. There are dozens of similar examples.

In general, those who stage such stunts come to two conclusions: These models are amazing, but they fall miserably short of what humans do best. The emails fail to capture the nuances of the workplace. The clones have one foot dragging in the uncanny valley. Most damningly, these text generators make things up when asked for factual information, a phenomenon known as “hallucination” that is the current bane of AI. And it is plain that the output of today’s models often has a soulless quality.

In one sense, it’s scary: Will our future world be run by flawed “mind children,” as roboticist Hans Moravec calls our digital successors? But in another sense, the shortcomings are comforting. Sure, AIs can now perform many low-level tasks and are unparalleled at suggesting plausible-looking Disneyland trips and gluten-free dinner menus, but the bots, the thinking goes, will always need us to make corrections and enliven their prose.

