Can we create explanations of artificial intelligence and machine learning decisions that have some level of consistency over time, as we would expect of a human explanation? This paper explores this question and offers several strategies for either maintaining a level of consistency or highlighting when and why past explanations may appear inconsistent with current decisions.
Figure 1: Types of incoherence

Figure 3: Ensuring consistency with previous explanations

Figure 4: Patch model with important issues