The “black box” problem
Benjamin, while doing community service you worked with children with cognitive impairments. What did you learn from this experience?
Benjamin Grewe: What became apparent is that for many impairments, things like motivation and humour are often fully intact even when higher-level functions such as understanding complex concepts are compromised. The human brain is highly complex.
Is it possible for AI to compete with human intelligence?
Grewe: When I started out in machine learning, I programmed an artificial neural network with the goal of reverse engineering emotional (fear) learning in humans – but the network didn’t learn well. In fact, it got depressed – everything you showed it produced the same output: fear. The idea behind AI is to copy some aspects of human intelligence, but definitely not all. We don’t want to create an intelligent car that says, “I’m too afraid to drive you today.”
Agata Ferretti: AI today is task-oriented, meaning it tries to solve a particular problem like driving a car or diagnosing a disease. Its emotional level doesn’t even come close to that of humans. Human intelligence can’t be reduced to the ability to excel in one task. In this sense, you could even say that AI is quite dumb – it’s not fit for many purposes.
Grewe: But increasingly, the goal in AI is to move beyond specialisation for one task. People are trying to develop smarter voice assistants, for example, using huge text databases from sources on the internet. The resulting algorithms can produce text that’s grammatically correct, but they don’t understand the meaning of the words they produce. They write “dog”, but have never seen or touched a dog, let alone been bitten by one.
Ferretti: We see something similar in the medical field when AI is deployed to recognise from images what’s likely to be cancer. Whereas doctors base their assessment on their medical knowledge and experience, AI refers to things like the light or edges in the picture – aspects that are relevant for identifying patterns but don’t always have clinical significance for doctors. The validity of the correlation is different for the doctor and the machine.
Grewe: Yes, this is an important point. For example, in adversarial attacks, researchers try to trick deep artificial networks. They show a picture of a dog, then change three carefully chosen pixels and the network predicts that it’s a cat. This would never fool a human.
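The mechanism Grewe describes can be sketched in a few lines. The toy linear “classifier” and its weights below are purely illustrative (a real deep network is attacked via its gradients, not hand-picked weights), but the principle is the same: a tiny change to the pixels the model is most sensitive to flips its prediction.

```python
import numpy as np

# Toy linear "classifier" over a flattened 4x4 image: positive score -> "dog",
# negative -> "cat". The weights are hand-picked for illustration only.
weights = np.ones(16)
weights[:3] = -10.0          # three pixels the model is very sensitive to

def predict(image):
    return "dog" if weights @ image > 0 else "cat"

image = np.full(16, 0.5)
image[:3] = 0.0              # original picture: the model says "dog"
print(predict(image))        # -> dog

# Adversarial attack: change just those three sensitive pixels by a small
# amount, barely visible to a human but enough to flip the prediction.
adversarial = image.copy()
adversarial[:3] = 0.3
print(predict(adversarial))  # -> cat
```

A human looking at the two images would see almost no difference, yet the model’s output changes completely – which is exactly why such attacks are worrying in safety-critical settings.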
Mistaking a dog for a cat sounds funny, but obviously the stakes are a bit higher when diagnosing cancer…
Ferretti: Indeed, and yet trust is an issue even when the system works correctly – patients may not trust the results if they don’t understand the reasoning behind them. Both doctors and patients will trust a system more once it has proved itself to be reliable and they see that there’s a culture of openness about its implications. A commitment to users’ rights to an explanation and a certain degree of transparency would boost trust in these systems, and hence their usability.
But today even scientists admit that there are systems where we don’t really know what they’re learning, or how...
Grewe: This points to a bigger problem in machine learning. Until recently, researchers would train a robot by writing explicit command code such as: “to grab this cup, move your hand to the right and close your hand at position XY.” They knew exactly what the robot was doing. Now they just feed the robot a lot of data, it tries out many movements, and when it grabs the cup they say, “That was good, do it again.” So we’re moving away from engineering where we understand every step in the process and moving towards just letting the algorithms learn what we want them to do. But it really is a black box. No one understands how these algorithms work – and the biggest problem is that they sometimes fail, and we don’t know why.
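The trial-and-error loop Grewe contrasts with explicit programming can be sketched as follows. Everything here is a toy assumption – the “environment”, the hidden target position, and the reward rule are made up – but it shows the shift from “tell the robot each step” to “reward the robot when it succeeds”.

```python
import random

random.seed(0)

TARGET = 7                       # hidden hand position that grasps the cup

def try_grasp(position):
    """Toy environment: reward 1 if the cup is grasped, else 0."""
    return 1 if position == TARGET else 0

good_actions = []
for trial in range(100):
    action = random.randint(0, 9)   # try a random hand position
    if try_grasp(action):           # "That was good, do it again."
        good_actions.append(action)

# The robot ends up repeating the rewarded action, but nothing in this loop
# explains *why* that position works - the learned behaviour is opaque.
print(good_actions)
```

Nowhere in the loop is the grasping strategy written down; it emerges from rewarded trials, which is precisely what makes the resulting behaviour hard to inspect.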
So what we need is interpretable machine learning, where transparency is built in right from the start?
Ferretti: Yes, some degree of interpretability could be useful. It might also increase accountability if something goes wrong. If you use this technology in health care, for example, it’s important to say who’d be liable for a wrong diagnosis: Is the misdiagnosis due to a doctor’s error or flawed logic in the AI system?
Grewe: It’s vital for engineers and industries to understand why and when AI systems make errors. If a human makes a decision, we can ask them about their reasons. We can’t yet do this with a machine learning algorithm.
Ferretti: We also need to discuss what type of data we feed these machines with. If we start from the assumption that our world is full of biases and injustices, we run the risk that the unsupervised machine will reproduce these limitations. What’s more, selective biases in the data could lead to discrimination. For example, if you feed the machine with more high-quality data of tumours on light skin, the system probably won’t recognise a tumour on dark skin. Another very important ethical principle is guaranteeing fairness. These systems should be rigorously tested to ensure the data are reliable and unwanted biases are mitigated.
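The skin-tone example Ferretti gives can be made concrete with a deliberately simple sketch. The 1-nearest-neighbour “diagnostic model” and the feature values below are invented for illustration; the point is the imbalance in the training data, not the model itself.

```python
# A 1-nearest-neighbour "model" trained almost entirely on light-skin images.
# Feature values are made up: light-skin tumours look ~0.2 here, dark-skin
# tumours ~0.8, and healthy tissue sits in between.

def nearest_label(x, training):
    """Predict the label of the closest training example."""
    return min(training, key=lambda item: abs(item[0] - x))[1]

training = [
    (0.18, "tumour"), (0.21, "tumour"), (0.22, "tumour"),  # light-skin tumours
    (0.60, "healthy"), (0.65, "healthy"),                  # healthy tissue
]                                                          # no dark-skin tumours

print(nearest_label(0.20, training))  # light-skin tumour -> "tumour"
print(nearest_label(0.80, training))  # dark-skin tumour -> "healthy" (missed)
```

Because no dark-skin tumour ever appeared in training, the model’s nearest reference for such a case is healthy tissue – the discrimination comes from the data, not from any malicious rule in the code.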
How can these kinds of ethical standards be enforced?
Ferretti: It’s difficult. In our lab, we’ve talked about developing quality assurance systems and frameworks that can be used to test the technologies. The ethical and legal tools used so far in medical research must be adapted to address the new issues of AI algorithms. The challenge is this: how can we develop a system that can keep pace with evaluating and monitoring these fast-evolving technologies?
Do we need new guidelines?
Ferretti: We need to clarify how to interpret and implement the ethical principles that guide AI development. Although nowadays there are plenty of ethical guidelines for AI, there’s uncertainty about how to integrate the views of various stakeholders. At the same time, there are stringent guidelines for using sensitive data like medical data collected in hospitals, but not for data recorded on social media or in fitness apps, which could be used for similar purposes. So how do you manage the mixture of these data? We need a broader governance framework that can ensure data protection, guarantee fairness, promote transparency and also monitor how the tech evolves.
Tech companies have much more computing power than universities. Does this restrict you as a researcher?
Grewe: In certain areas, such as language modelling, this is already a problem because universities aren’t competitive. These models are trained using text drawn from the whole internet, with millions of dollars spent on computing resources. Well, I haven’t read the whole internet, but I hope that I’m in some sense smarter than these models. At some point, even leveraging statistics and big data may reach its limits. In my opinion, we need a fundamentally different concept of learning to generate algorithms with problem-solving skills that are more robust and universal.
Different in what way?
Grewe: It may well be time to move away from statistical learning and big data and start learning more like children. I’m thinking here of embodied systems that start by learning basic things to build very simple abstract concepts. Based on these, they could learn more and more complex interactions and schemata. Basically we need to “grow” AI step by step. If human-like intelligence is our goal, then we need to implement this kind of developmental approach. In addition, we need to carry out algorithm research in a much more interdisciplinary fashion – combining, for example, machine learning with robotics, neuroscience and psychology.
Agata, do you think this approach would lead to more or fewer ethical problems with AI?
Ferretti: Eventually more, but I wonder what kind of timeframe we’re looking at here. For the time being, task-oriented tools that may simplify and improve people’s lives are what we have to deal with. The ethical issues with these tools are already challenging enough, but the future is going to be exciting!
This text has been published in the current issue of the Globe magazine.