The advantages of self-explainable AI over interpretable AI – The Next Web

Posted: June 25, 2020 at 3:44 am



Would you trust an artificial intelligence algorithm that works eerily well, making accurate decisions 99.9% of the time, but is a mysterious black box? Every system fails every now and then, and when it does, we want explanations, especially when human lives are at stake. And a system that can't be explained can't be trusted. That is one of the problems the AI community faces as its creations become smarter and more capable of tackling complicated and critical tasks.

In the past few years, explainable artificial intelligence has become a growing field of interest. Scientists and developers are deploying deep learning algorithms in sensitive fields such as medical imaging analysis and self-driving cars. There is concern, however, about how these AI systems operate. Investigating the inner workings of deep neural networks is very difficult, and their engineers often can't determine which key factors contribute to their output.

For instance, suppose a neural network has labeled the image of a skin mole as cancerous. Is it because it found malignant patterns in the mole, or is it because of irrelevant elements such as image lighting, camera type, or the presence of some other artifact in the image, such as pen markings or rulers?

Researchers have developed various interpretability techniques that help investigate decisions made by machine learning algorithms. But these methods are not enough to address AI's explainability problem and create trust in deep learning models, argues Daniel Elton, a scientist who researches the applications of artificial intelligence in medical imaging.


Elton discusses why we need to shift from techniques that interpret AI decisions to AI models that can explain their decisions by themselves, as humans do. His paper, "Self-explaining AI as an alternative to interpretable AI," recently published on the arXiv preprint server, expands on this idea.

Classic symbolic AI systems are based on manual rules created by developers. No matter how large and complex they grow, their developers can follow their behavior line by line and investigate errors down to the machine instruction where they occurred. In contrast, machine learning algorithms develop their behavior by comparing training examples and creating statistical models. As a result, their decision-making logic is often ambiguous even to their developers.

Machine learning's interpretability problem is both well-known and well-researched. In the past few years, it has drawn interest from esteemed academic institutions and DARPA, the research arm of the U.S. Department of Defense.

Efforts in the field generally split into two categories: global explanations and local explanations. Global explanation techniques focus on finding general interpretations of how a machine learning model works, such as which features of its input data it deems more relevant to its decisions. Local explanation techniques focus on determining which parts of a particular input are relevant to the decision the AI model makes. For instance, they might produce saliency maps of the parts of an image that have contributed to a specific decision.

Examples of saliency maps produced by RISE
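As a rough illustration of what a local explanation computes, here is a minimal sketch of occlusion sensitivity, one simple way to build a saliency map. It is not the RISE method named in the caption, and `toy_model` is a hypothetical stand-in for a real classifier: it only looks at the central 2x2 patch of a 4x4 image, so the map should light up there and nowhere else.

```python
from typing import Callable, List

Image = List[List[float]]


def toy_model(image: Image) -> float:
    """Hypothetical classifier score: mean brightness of the central 2x2 patch."""
    centre = [image[r][c] for r in (1, 2) for c in (1, 2)]
    return sum(centre) / len(centre)


def occlusion_saliency(model: Callable[[Image], float], image: Image,
                       fill: float = 0.0) -> Image:
    """Score drop caused by blanking each pixel in turn (larger = more relevant)."""
    baseline = model(image)
    saliency = [[0.0] * len(row) for row in image]
    for r, row in enumerate(image):
        for c in range(len(row)):
            occluded = [list(x) for x in image]  # copy so the input stays untouched
            occluded[r][c] = fill
            saliency[r][c] = baseline - model(occluded)
    return saliency


image = [[1.0] * 4 for _ in range(4)]
saliency_map = occlusion_saliency(toy_model, image)
for row in saliency_map:
    print(" ".join(f"{v:.2f}" for v in row))
```

Only the four central pixels get a non-zero value, which matches what the toy model actually uses. With a real network the map is far noisier, which is part of why interpreting such maps is contested.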

"All these techniques have flaws, and there is confusion regarding how to properly interpret an interpretation," Elton writes.

Elton also challenges another popular belief about deep learning. Many scientists believe that deep neural networks extract high-level features and rules from their underlying problem domain. This means that, for instance, when you train a convolutional neural network on many labeled images, it will tune its parameters to detect various features shared between them.

This is true, depending on what you mean by "features." There's a body of research that shows neural networks do in fact learn recurring patterns in images and other data types. At the same time, there's plenty of evidence that deep learning algorithms do not learn the general features of their training examples, which is why they are rigidly limited to their narrow domains.

"Actually, deep neural networks are 'dumb' - any regularities that they appear to have captured internally are solely due to the data that was fed to them, rather than a self-directed 'regularity extraction' process," Elton writes.

Citing a paper published in the peer-reviewed scientific journal Neuron, Elton posits that, in fact, deep neural networks function through the interpolation of data points, rather than extrapolation.
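The difference between interpolation and extrapolation can be made concrete with a toy sketch. This is not Elton's analysis or the method of the Neuron paper; `interpolating_model` is a hypothetical one-dimensional "model" that only connects the training points it has seen, and the true function (y = x^2) is an assumption chosen for illustration.

```python
from bisect import bisect_right

# Training data sampled from the "true" function y = x^2 on the range [0, 3].
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [x * x for x in train_x]


def interpolating_model(x: float) -> float:
    """Piecewise-linear interpolation; outside the data it repeats the nearest edge value."""
    if x <= train_x[0]:
        return train_y[0]
    if x >= train_x[-1]:
        return train_y[-1]
    i = bisect_right(train_x, x)  # index of the first training point to the right of x
    x0, x1 = train_x[i - 1], train_x[i]
    y0, y1 = train_y[i - 1], train_y[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


for query in (1.5, 6.0):
    prediction = interpolating_model(query)
    print(f"x={query}: predicted {prediction:.2f}, true {query * query:.2f}")
```

Between training points the prediction is close to the truth. Outside the training range it is badly wrong, even though the model still returns a number without complaint.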

Some research is focused on developing interpretable AI models to replace current black boxes. These models make their reasoning logic visible and transparent to developers. In many cases, especially in deep learning, swapping an existing model for an interpretable one comes at the cost of accuracy. That defeats the purpose, since we opt for more complex models precisely because they provide higher accuracy.

"Attempts to compress deep neural networks into simpler interpretable models with equivalent accuracy typically fail when working with complex real-world data such as images or human language," Elton notes.

One of Elton's main arguments is about adopting a different view of understanding AI decisions. Most efforts focus on breaking open the AI black box and figuring out how it works at a very low and technical level. But when it comes to the human brain, the ultimate destination of AI research, we've never had such reservations.

"The human brain also appears to be an overfit 'black box' which performs interpolation, which means that how we understand brain function also needs to change," he writes. "If evolution settled on a model (the brain) which is uninterpretable, then we expect advanced AIs to also be of that type."

What this means is that when it comes to understanding human decisions, we seldom investigate neuron activations. There's a lot of research in neuroscience that helps us better understand the workings of the brain, but for millennia, we've relied on other mechanisms to interpret human behavior.

"Interestingly, although the human brain is a 'black box,' we are able to trust each other. Part of this trust comes from our ability to 'explain' our decision making in terms which make sense to us," Elton writes. "Crucially, for trust to occur we must believe that a person is not being deliberately deceptive, and that their verbal explanations actually map onto the processes used in their brain to arrive at their decisions."

One day, science might enable us to explain human decisions at the neuron activation level. But for the moment, most of us rely on understandable, verbal explanations of our decisions and the mechanisms we have to establish trust between each other.

The interpretation of deep learning, however, is focused on investigating activations and parameter weights instead of high-level, understandable explanations. "As we try to accurately explain the details of how a deep neural network interpolates, we move further from what may be considered relevant to the user," Elton writes.

Based on the trust and explanation model that exists between humans, Elton calls for self-explaining AI that, like a human, can explain its decision.

An explainable AI yields two pieces of information: its decision and the explanation of that decision.

This is an idea that has been proposed and explored before. However, what Elton proposes is self-explaining AI that still maintains its complexity (e.g., deep neural networks with many layers) and does not sacrifice its accuracy for the sake of explainability.

In the paper, Elton suggests how relevant causal information can be extracted from a neural network. While the details are a bit technical, what the technique basically does is extract meaningful information from the neural network's layers while avoiding spurious correlations. His method builds on current self-explaining AI systems developed by other researchers and verifies whether the explanations and predictions of their neural networks correspond.

Structure of self-explainable AI (source: arxiv.org)

In his paper, Elton also discusses the need to specify the limits of AI algorithms. Neural networks tend to provide an output value for any input they receive. Self-explainable AI models should send an alert when results fall outside the model's applicability domain, Elton says. Applicability domain analysis can be framed as a simple form of AI self-awareness, which is thought by some to be an important component for AI safety in advanced AIs.

Self-explainable AI models should provide confidence levels for both their output and their explanation.
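To make these requirements concrete, here is a minimal sketch of what such an output could look like. It is not Elton's method, which the paper leaves largely unspecified. Every name here (`ExplainedPrediction`, `fit_domain`, `predict_with_explanation`, the two features) is hypothetical, the "model" is a single threshold, the confidence values are placeholders, and the applicability domain is approximated crudely by the min/max range of each training feature.

```python
from dataclasses import dataclass
from typing import List, Tuple

Domain = List[Tuple[float, float]]


@dataclass
class ExplainedPrediction:
    label: str
    label_confidence: float
    explanation: str
    explanation_confidence: float
    in_domain: bool  # False means: alert the user, the input is unlike the training data


def fit_domain(training_inputs: List[List[float]]) -> Domain:
    """Record the min/max of each feature seen in training (a crude applicability domain)."""
    return [(min(col), max(col)) for col in zip(*training_inputs)]


def in_applicability_domain(x: List[float], domain: Domain) -> bool:
    return all(lo <= v <= hi for v, (lo, hi) in zip(x, domain))


def predict_with_explanation(x: List[float], domain: Domain) -> ExplainedPrediction:
    # Hypothetical stand-in for a real model: thresholds the first feature.
    score = x[0]
    label = "malignant" if score > 0.5 else "benign"
    confidence = min(1.0, abs(score - 0.5) * 2)
    return ExplainedPrediction(
        label=label,
        label_confidence=confidence,
        explanation=f"feature 0 = {score:.2f}, threshold = 0.50",
        # Placeholder: a real system would estimate this separately from the label.
        explanation_confidence=confidence,
        in_domain=in_applicability_domain(x, domain),
    )


# Hypothetical features: [asymmetry, border irregularity], both scaled to 0..1.
domain = fit_domain([[0.1, 0.2], [0.9, 0.8], [0.4, 0.6]])
ok = predict_with_explanation([0.8, 0.5], domain)
ood = predict_with_explanation([3.0, 0.5], domain)
for result in (ok, ood):
    flag = "" if result.in_domain else "  [ALERT: outside applicability domain]"
    print(f"{result.label} ({result.label_confidence:.2f}): {result.explanation}{flag}")
```

The second input still gets a label, as any neural network would give one, but the `in_domain` flag tells the user not to rely on it. Real applicability domain checks would need something far richer than per-feature ranges.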

"Applicability domain analysis is especially important for AI systems where robustness and trust are important, so that systems can alert their user if they are asked to work outside their domain of applicability," Elton concludes. An obvious example would be health care, where errors can result in irreparable damage to health. But there are plenty of other areas, such as banking, loans, recruitment, and criminal justice, where we need to know the limits and boundaries of our AI systems.

Much of this is still hypothetical, and Elton provides little in terms of implementation details, but it is a nice direction to follow as the explainable AI landscape develops.
