The case of harmful explainable AI

Introduction

In the User Experience world, designing for trust is an important guideline. This is because trust is an important factor in technology adoption (Bahmanziari et al., 2003). Similarly, the role and importance of trust in AI is receiving increasing attention.

For instance, for the European Commission, the mantra is “Trustworthy AI”. Policy, and a yearly investment of €1 billion, are designed around the idea that AI needs to be developed such that it is worthy of our trust. Technical factors such as robustness, safety, and security are part of the considerations, but so are “softer” topics such as respect for autonomy, transparency, user control, respect for human rights, and accountability.

This is Europe’s perspective on what constitutes “good AI”. In this perspective, transparency plays a central role. The research field of Explainable AI is currently exploding with work on how to achieve this for the black boxes produced by AI/ML. A typical approach is to accompany a model prediction with a human-interpretable explanation. Explanations can be useful for developers and certifiers, but they can also be shown to end users who work directly with an AI system.

There is a large body of research concerned with the creation of explanation techniques (this is not easy!). There is far less focus, however, on what users actually want to have explained and on what happens when users are confronted with explanations.

In this article, I present the results of a quantitative user study I conducted in which users had to collaborate with an AI and interpret its explanations. The study was part of my Master’s thesis on Artificial Intelligence.

The results indicate that explanations are a double-edged sword: users find them desirable and they lead to more self-reported trust in the AI system, but they also lead to a higher task error rate.

Automation Bias

Before the results and the experiment are presented, the theory behind automation bias needs to be explained. I’ll keep it brief.

Automation bias applies whenever an imperfect “AI” is used in a semi-automated setting. In simpler terms: a user is using an AI to make a decision, but unfortunately the AI is not perfect and makes mistakes. A bit like a navigation app: it tends to work, but not always, and you still need to make your own decisions while driving. Keep this example in mind.

In such conditions, two types of errors can be made. The user can have too little trust in the AI (for instance, a taxi driver preferring their own knowledge); this leads to errors when the user ignores perfectly fine advice in favour of wrong self-made decisions. Or the user can have too much trust (for instance, when I blindly follow the navigation instructions) and simply follow wrong advice. This is self-reliance versus over-reliance. A good introduction to automation bias and automation psychology is Parasuraman & Manzey (2010).

Automation bias leads to two types of errors, which can be predicted by the amount of trust a user has in the system.
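To make the taxonomy concrete, here is a toy sketch of my own (not part of the study) that classifies a single human-AI decision, assuming we record whether the AI was right, whether the user followed its advice, and whether the user’s final decision was right:

```python
def classify_decision(ai_was_right: bool, user_followed_ai: bool, user_was_right: bool) -> str:
    """Toy classifier for the two automation-bias error types."""
    # Over-reliance: the AI was wrong and the user followed it anyway.
    if not ai_was_right and user_followed_ai:
        return "over-reliance error"
    # Self-reliance: the AI was right, the user ignored it and got it wrong.
    if ai_was_right and not user_followed_ai and not user_was_right:
        return "self-reliance error"
    return "no automation-bias error"

# The navigation example: the app suggests a correct route, the driver ignores
# it and takes a wrong turn anyway.
print(classify_decision(ai_was_right=True, user_followed_ai=False, user_was_right=False))
```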

Explainable AI fits really well into this theory! The hypothesised role is that seeing a prediction together with an explanation leads to higher trust. Higher trust leads to less self-reliance and its associated human error. On the other hand, as explanations allow me to inspect the prediction, I no longer need to blindly trust it, which should lead to less over-reliance.

Explanations and their effect on automation bias errors.

User study

I created a user study to test what actually happens when users are confronted with explanations. Do explanations increase trust? Do users begin to inspect and question predictions, as the hypothesis suggests?

Automatic Short Answer Grading is the umbrella term for AI technologies that grade open-ended questions. This is pretty cool technology because it helps to further democratise education to all areas of the world. It’s not perfect though: it still makes mistakes (around 80% accuracy), but keep in mind that human teachers also make mistakes and are often inconsistent.
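To give a feeling for the idea, here is a minimal, hypothetical sketch of a similarity-based grader; it is not the system used in the study and assumes scikit-learn is available:

```python
# Toy grader: award points based on lexical similarity with a model answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def grade_answer(student_answer: str, model_answer: str, max_points: int = 2) -> int:
    """Return a crude score from TF-IDF cosine similarity; real ASAG systems use trained models."""
    vectors = TfidfVectorizer().fit_transform([student_answer, model_answer])
    similarity = cosine_similarity(vectors[0], vectors[1])[0, 0]
    if similarity > 0.75:
        return max_points
    if similarity > 0.4:
        return max_points // 2
    return 0

print(grade_answer("Photosynthesis converts light into chemical energy.",
                   "Plants use photosynthesis to turn light into chemical energy."))
```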

It makes sense, though, not to deploy this technology in a fully automatic fashion but in a “human-in-the-loop” situation, i.e., a teacher using it to speed up the grading process.

So here we have a situation that is exactly captured by the theory: an imperfect AI, with a human using its output to make a decision.

In the user study, a participant had to slip into the role of a teacher and grade answers. They were supported by an AI assistant.

Screenshot from the user test. In this image, the baseline condition is shown, without any explanation.

I designed and created four different types of explanation that such a grading AI could provide. Most were informed by academic research; one I designed on my own. They were compared against a baseline in which no explanation was shown at all.

Concept 1: This explanation shows certainty based on comparable answers.
Concept 2: Certainty and similar comparable answers are shown.
Concept 3: The keywords relevant for the prediction are highlighted (a toy sketch of this idea follows below).
Concept 4: The part of the answer that relates to part of the model answer is marked.
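As an illustration of the keyword idea in concept 3, here is a hypothetical sketch of leave-one-out importance, with a simple string-similarity ratio standing in for the actual grading model used in the study:

```python
# Toy keyword highlighting: mark the tokens whose removal lowers the
# (stand-in) grading score the most.
from difflib import SequenceMatcher

def score(student_answer: str, model_answer: str) -> float:
    """Stand-in for the grading model's score."""
    return SequenceMatcher(None, student_answer.lower(), model_answer.lower()).ratio()

def highlight_keywords(student_answer: str, model_answer: str, top_k: int = 3) -> str:
    """Bracket the top-k tokens with the largest leave-one-out score drop."""
    tokens = student_answer.split()
    base = score(student_answer, model_answer)
    drops = []
    for i in range(len(tokens)):
        without = " ".join(tokens[:i] + tokens[i + 1:])
        drops.append((base - score(without, model_answer), i))
    keep = {i for _, i in sorted(drops, reverse=True)[:top_k]}
    return " ".join(f"[{tok}]" if i in keep else tok for i, tok in enumerate(tokens))

print(highlight_keywords(
    "Plants convert sunlight into chemical energy",
    "Photosynthesis turns light energy into chemical energy"))
```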

Participants had to grade a number of answers. I purposely injected “red herrings”: cases in which the AI system makes a mistake and assigns points to an answer that deserves none.

Statistical analysis

I conducted several non-parametric statistical tests (mostly Friedman’s test). In total, 26 participants completed the test, which took them around 20–30 minutes. I’ll leave it at this.

If you are curious, I made the analysis in a Google Colab.
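For a flavour of the analysis, the omnibus comparison across conditions boils down to something like this (a minimal sketch with made-up ratings, not the actual notebook):

```python
# Friedman test over per-participant ratings of the five conditions
# (baseline + four explanation concepts). The data below are invented.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
n = 26  # participants

baseline  = rng.integers(2, 8, n)    # ratings on a 0-10 scale
concept_1 = rng.integers(3, 9, n)
concept_2 = rng.integers(3, 9, n)
concept_3 = rng.integers(3, 9, n)
concept_4 = rng.integers(5, 11, n)

statistic, p_value = friedmanchisquare(baseline, concept_1, concept_2, concept_3, concept_4)
print(f"Friedman chi-square = {statistic:.2f}, p = {p_value:.4f}")
```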

Results of the user study

The statistical tests found a couple of significant effects with strong effect sizes. I will jump straight to a cohesive conclusion without going into too much detail on every individual result.

After trying out all the concepts, participants self-reported in the survey that they prefer an AI that can be verified with an explanation. They also self-reported that having an explanation leads to an increase in trust in the AI system, and that if an explanation is shown, they are more likely to orient themselves towards the AI system.

Thus, explanations are desirable to users and self-reportedly lead to more trust in, and use of, the AI system.

This is also visible in the ratings: the most preferred condition was the one with explanation concept 4.

Explanation concept 4

This condition was rated significantly higher (M = 7.16, SD = 2.3) than the baseline without any explanation (M = 4.88, SD = 2.5) on a scale from 0 to 10. People really liked this concept!
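A pairwise comparison like this can be made with, for example, a Wilcoxon signed-rank test on the paired ratings; the sketch below uses invented numbers and is not necessarily the exact test from the analysis:

```python
# Hypothetical pairwise check: Wilcoxon signed-rank test, concept 4 vs. baseline.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
baseline_ratings  = rng.integers(2, 9, 26)   # invented 0-10 ratings, 26 participants
concept_4_ratings = rng.integers(5, 11, 26)

statistic, p_value = wilcoxon(baseline_ratings, concept_4_ratings)
print(f"Wilcoxon W = {statistic:.1f}, p = {p_value:.4f}")
```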

However, users also made significantly more grading errors when using concept 4. Especially when the AI made a wrong prediction, it turned out that users followed its advice. They became less critical due to the explanation.

Thus, rather than using the explanation to monitor the AI, users began to follow its advice even when it was wrong. Over-reliance developed. Using concept 4 led to worse grading than having no explanation at all.

Interpretation of results

The theory of automation bias explains this result: too much trust is created. This does not lead to more critical inspection but to the opposite: blind following of advice.

My results are not isolated. A study by Jacobs et al. (2021), published in Translational Psychiatry, examined 220 clinicians who were supported by an AI system with and without explanations. They also found that task accuracy dropped when the model made a wrong recommendation, even when an explanation was available.

These experiments suggest that transparency can be harmful. And now we have a European Commission that invests billions every year in trustworthy AI, which emphasises transparency and trust.

There is a fine balance to be maintained between not enough trust and too much trust. I do not believe that my study or the one by Jacobs et al. invalidates all research efforts on Explainable AI. They do, however, underline the importance of considering the human factors involved.

Semi-automated situations are not new; they existed, for instance, in airplane cockpits and control towers long before the rise of deep learning. There is a large body of existing research that should be re-examined to create effective explainable AI.

How an AI system is eventually used and interpreted by its human user is just as critical as the technical construction of the AI system itself. Trust plays a central role, but it can be a double-edged sword in the case of AI systems. I hope fellow User Experience colleagues will dive into the theory of automation bias and discover how it can inform the design of well-adopted AI systems.

References

Bahmanziari, T., Pearson, J. M., & Crosby, L. (2003). Is Trust Important in Technology Adoption? A Policy Capturing Approach. Journal of Computer Information Systems, 43(4), 46–54.

Jacobs, M., Pradier, M. F., McCoy, T. H., Perlis, R. H., Doshi-Velez, F., & Gajos, K. Z. (2021). How machine-learning recommendations influence clinician treatment selections: The example of antidepressant selection. Translational Psychiatry, 11(1). https://doi.org/10.1038/s41398-021-01224-x

Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381–410. https://doi.org/10.1177/0018720810376055