Explaining machine learning models to the business

Explainable machine learning is a sub-discipline of artificial intelligence (AI) and machine learning that attempts to summarize how machine learning systems make decisions. Such summaries are helpful for many reasons: finding data-driven insights, uncovering problems in machine learning systems, facilitating regulatory compliance, and enabling users to appeal (or operators to override) inevitable wrong decisions.

Of course, all of that sounds great, but explainable machine learning is not yet a perfect science. In reality, there are two major issues with explainable machine learning to keep in mind:

  1. Some “black-box” machine learning systems are probably just too complex to be accurately summarized.
  2. Even for machine learning systems that are designed to be interpretable, sometimes the way summary information is presented is still too complicated for business people. (Figure 1 provides an example of machine learning explanations for data scientists.)

Figure 1: Explanations created by H2O Driverless AI. These explanations are probably better suited for data scientists than for business users.

For issue 1, I’m going to assume that you want to use one of the many accurate and interpretable “glass-box” machine learning models available today, such as monotonic gradient boosting machines in the open source frameworks h2o-3, LightGBM, and XGBoost.1 This article focuses on issue 2: helping you communicate explainable machine learning results clearly to business decision-makers.
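If you’re curious what that looks like in code, here’s a minimal sketch of a monotonic GBM using XGBoost’s monotone_constraints parameter. The data, feature names, and constraint directions are all made up for illustration; in practice, the constraint directions should come from domain knowledge about your credit data.

```python
import numpy as np
import xgboost as xgb

# Hypothetical stand-ins for credit card features such as most recent
# payment status, most recent bill amount, and credit limit.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

dtrain = xgb.DMatrix(X, label=y,
                     feature_names=["pay_0", "bill_amt1", "limit_bal"])

params = {
    "objective": "binary:logistic",
    # One constraint per feature: 1 means predictions may only rise as the
    # feature rises, -1 means only fall, 0 means unconstrained.
    "monotone_constraints": "(1,1,-1)",
    "max_depth": 3,
    "eta": 0.1,
}
model = xgb.train(params, dtrain, num_boost_round=100)
```

Constraining each feature’s effect to a single, sensible direction keeps the model’s behavior consistent with business intuition, which makes it much easier to explain later.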

This article is divided into two main parts. The first part addresses explainable machine learning summaries for a machine learning system and an entire dataset (i.e., “global” explanations). The second part discusses summaries for machine learning system decisions about specific people in a dataset (i.e., “local” explanations). Throughout, I’ll use a straightforward problem, predicting credit card payments, to present concrete examples.

General summaries

Among many other options, two good ways to summarize a machine learning system for a group of customers, represented by an entire dataset, are variable importance charts and surrogate decision trees. Now, because I want business people to care about and understand my results, I’m going to call those two things a “main drivers chart” and a “decision flowchart,” respectively.
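To make the second idea concrete before walking through each chart, here’s a minimal sketch of a surrogate “decision flowchart”: fit a shallow, human-readable decision tree to the predictions of a more complex model. It reuses the hypothetical model and X from the earlier sketch; any glass-box or black-box model’s predictions would work the same way.

```python
from sklearn.tree import DecisionTreeRegressor, export_text

# Score the training data with the complex model, then fit a shallow
# (and therefore readable) decision tree to those predictions.
preds = model.predict(dtrain)
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, preds)

# Print the surrogate as simple if/else rules -- the raw material for a
# flowchart-style summary.
print(export_text(surrogate,
                  feature_names=["pay_0", "bill_amt1", "limit_bal"]))
```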

Main decision drivers

The main drivers chart provides a visual summary and ranking of which factors are most important to a machine learning system’s decisions, in general. It’s a high-level summary and a decent place to start communicating about how a machine learning system works.
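As a rough sketch, here’s one way to build a main drivers chart from the hypothetical XGBoost model above: rank its global importance scores and plot them with business-friendly labels. The label mapping is invented for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical mapping from raw feature names to business-friendly labels.
friendly = {"pay_0": "Most recent payment status",
            "bill_amt1": "Most recent bill amount",
            "limit_bal": "Credit limit"}

# Global importance by total gain, sorted so the biggest driver plots on top.
importance = model.get_score(importance_type="gain")
drivers = sorted(importance.items(), key=lambda kv: kv[1])

plt.barh([friendly.get(name, name) for name, _ in drivers],
         [score for _, score in drivers])
plt.xlabel("Relative importance (gain)")
plt.title("Main drivers of predicted payment default")
plt.tight_layout()
plt.show()
```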
