Years before ChatGPT, one of the creators of IBM’s Watson tried to harness AI to tutor—here’s why it didn’t work

January 22, 2024

When Satya Nitta worked at IBM, he and a team of colleagues took on a bold assignment: Use the latest in artificial intelligence to build a new kind of personal digital tutor.

This was before ChatGPT existed, and fewer people were talking about the wonders of AI. But Nitta was working with what was perhaps the highest-profile AI system at the time, IBM’s Watson. That AI tool had pulled off some big wins, including beating humans on Jeopardy in 2011.

Nitta says he was optimistic that Watson could power a generalized tutor, but he knew the task would be extremely difficult. “I remember telling IBM top brass that this is going to be a 25-year journey,” he recently told EdSurge.

He says his team spent about five years trying, and along the way they helped build some small-scale attempts into learning products, such as a pilot chatbot assistant that was part of a Pearson online psychology courseware system in 2018.

Why AI won’t be a general personal tutor for decades (if ever)

But in the end, Nitta decided that even though the generative AI technology driving excitement these days brings new capabilities that will change education and other fields, the tech just isn’t up to serving as a generalized personal tutor, and won’t be for decades at least, if ever.

“We’ll have flying cars before we will have AI tutors,” he says. “It is a deeply human process that AI is hopelessly incapable of meeting in a meaningful way. It’s like being a therapist or like being a nurse.”

Instead, he cofounded a new AI company, called Merlyn Mind, that is building other types of AI-powered tools for educators.

“The biggest positive transformation that education has ever seen”

Meanwhile, plenty of companies and education leaders these days are hard at work chasing that dream of building AI tutors. Even a recent White House executive order seeks to help the cause.

Earlier this month, Sal Khan, leader of the nonprofit Khan Academy, told the New York Times: “We’re at the cusp of using A.I. for probably the biggest positive transformation that education has ever seen. And the way we’re going to do that is by giving every student on the planet an artificially intelligent but amazing personal tutor.”

Khan Academy was one of the first organizations to use ChatGPT to try to develop such a tutor, called Khanmigo, which is currently in a pilot phase in a series of schools.

Khan’s system does come with an off-putting warning, though, noting that it “makes mistakes sometimes.” The warning is necessary because all of the latest AI chatbots suffer from what are known as “hallucinations”—the word used to describe situations when the chatbot simply fabricates details when it doesn’t know the answer to a question asked by a user.

AI experts are busy trying to offset the hallucination problem, and one of the most promising approaches so far is to bring in a separate AI chatbot to check the results of a system like ChatGPT to see if it has likely made up details. That’s what researchers at Georgia Tech have been trying, for instance, hoping that their system can get to the point where any false information is scrubbed from an answer before it is shown to a student. But it’s not yet clear that this approach can reach a level of accuracy that educators will accept.

At this critical point in the development of new AI tools, though, it’s useful to ask whether a chatbot tutor is the right goal for developers to head toward. Or is there a better metaphor than “tutor” for what generative AI can do to help students and teachers?

An ‘Always-On Helper’ vs. “a robot that can read your mind”

Michael Feldstein spends a lot of time experimenting with chatbots these days. He’s a longtime edtech consultant and blogger, and in the past he wasn’t shy about calling out what he saw as excessive hype by companies selling edtech tools.

In 2015, he famously criticized promises about what was then the latest in AI for education: a tool from a company called Knewton. The CEO of Knewton, Jose Ferreira, said his product would be “like a robot tutor in the sky that can semi-read your mind and figure out what your strengths and weaknesses are, down to the percentile.” This led Feldstein to respond that the CEO was “selling snake oil” because, Feldstein argued, the tool was nowhere near living up to that promise. (The assets of Knewton were quietly sold off a few years later.)

So what does Feldstein think of the latest promises by AI experts that effective tutors could be on the near horizon?

“ChatGPT is definitely not snake oil—far from it,” he tells EdSurge. “It is also not a robot tutor in the sky that can semi-read your mind. It has new capabilities, and we need to think about what kinds of tutoring functions today’s tech can deliver that would be useful to students.”

He does think tutoring is a useful way to view what ChatGPT and other new chatbots can do, though. And he says that comes from personal experience.

Feldstein has a relative who is battling a brain hemorrhage, and he has been turning to ChatGPT for personal lessons in understanding the medical condition and his loved one’s prognosis. As Feldstein gets updates from friends and family on Facebook, he says, he asks questions in an ongoing thread in ChatGPT to try to better understand what’s happening.

“When I ask it in the right way, it can give me the right amount of detail about, ‘What do we know today about her chances of being OK again?’” Feldstein says. “It’s not the same as talking to a doctor, but it has tutored me in meaningful ways about a serious subject and helped me become more educated on my relative’s condition.”

While Feldstein says he would call that a tutor, he argues that it’s still important that companies not oversell the capabilities of their AI tools. “We’ve done a disservice to say they’re these all-knowing boxes, or they will be in a few months,” he says. “They’re tools. They’re strange tools. They misbehave in strange ways, as do people.”

He points out that even human tutors can make mistakes, but most students have a sense of what they’re getting into when they make an appointment with a human tutor.

“When you go into a tutoring center in your college, they don’t know everything. You don’t know how trained they are. There’s a chance they may tell you something that’s wrong. But you go in and get the help that you can.”

Whatever you call these new AI tools, he says, it will be useful to “have an always-on helper that you can ask questions to,” even if their results are just a starting point for more learning.

‘Boring’ but important support tasks

What are new ways that generative AI tools can be used in education, if tutoring ends up not being the right fit?

To Nitta, the stronger role is to serve as an assistant to experts rather than a replacement for an expert tutor. In other words, instead of replacing, say, a therapist, he imagines that chatbots can help a human therapist summarize and organize notes from a session with a patient.

“That’s a very helpful tool rather than an AI pretending to be a therapist,” he says. Even though that may be seen as “boring” by some, he argues that the technology’s superpower is to “automate things that humans don’t like to do.”

In the educational context, his company is building AI tools designed to help teachers, or to help human tutors, do their jobs better. To that end, Merlyn Mind has taken the unusual step of building its own so-called large language model from scratch designed for education.

Even then, he argues that the best results come when the model is built to support specific education domains by being trained on vetted data sets, rather than relying on ChatGPT and other mainstream tools that draw from vast amounts of information from the internet.

“What does a human tutor do well? They know the student, and they provide human motivation,” he adds. “We’re all about the AI augmenting the tutor.”

This article was syndicated from EdSurge. EdSurge is a nonprofit newsroom that covers education through original journalism and research. Sign up for their newsletters.

Jeffrey R. Young is an editor and reporter at EdSurge and host of the weekly EdSurge Podcast.

Fast Company – technology
