Show me the money (and the explanation): eXplainable AI in finance

The AI assurance market in the UK is experiencing rapid growth, with an estimated 524 firms generating £1.01 billion in Gross Value Added (GVA) according to market research published in the UK Government’s ‘Assuring a responsible future for AI’ report. This includes 84 specialised AI assurance companies, a significant rise from just 17 identified in 2023. These specialised firms, primarily microbusinesses and SMEs, contribute £0.36 billion GVA and stand to play a pivotal role in high-risk sectors such as financial services, life sciences, and pharmaceuticals.

The expression ‘explainable artificial intelligence’ (XAI) is often used to refer to various interpretability methods for machine learning models and their output or predictions. IBM defines XAI as “a set of processes and methods that allows human users to comprehend and trust the results and output created by machine learning algorithms”. These techniques are particularly useful in the financial services sector, where transparency and accountability are critical. 

With this post, I aim to provide an intuitive overview of model-agnostic interpretation methods that could in principle be deployed in finance for different use cases, such as credit risk analysis, algorithmic trading, and fraud detection. Model-agnostic methods do not rely on a model’s internal structure, so they can be applied to any model regardless of its architecture. These methods may help provide either ‘global’ insights into overall model behaviour or ‘local’ explanations for individual results.

Global model-agnostic methods

Global model-agnostic methods provide insights into the overall behaviour of a model. These methods may be used to profile complex models, uncover biases, and potentially assess compliance with regulatory requirements. Below are some examples of global model-agnostic methods and their potential applications.

Partial Dependence Plots (PDPs)

PDPs illustrate the relationship between an input feature and the model’s predicted outcome, averaging out the effects of other input features. In credit risk analysis, for example, PDPs could reveal how certain borrower characteristics affect loan default predictions and, in turn, loan eligibility. Suppose a lender is developing a machine learning model to predict the likelihood of loan default for loan applicants. The model uses various borrower characteristics as input features including net worth, income, employment status, credit history, and postcode. The lender may use PDPs to understand how these features influence the model’s predictions and uncover any potential biases, particularly in relation to geographic location. To this end, a PDP may be generated as follows (a short code sketch follows the list):

  1. Choose a feature to analyse in a tabular dataset
    e.g. Select postcode as the borrower characteristic to analyse.
  2. For each unique value of the selected feature:
    e.g. For each UK postcode:
    - Set the feature’s values in the dataset to this unique value
      e.g. Set each postcode value in the dataset to this postcode, leaving all other borrower characteristics unchanged. This step can be problematic if borrower characteristics are correlated, as it may not reflect realistic borrowers.
    - Obtain the model’s prediction for this modified dataset
    e.g. Use the model to generate a loan default prediction for each entry in the dataset.
    - Average the predictions over all instances in the dataset
    e.g. Calculate the average loan default probability for the dataset.
  3. Plot the average prediction against the feature values
    e.g. Plot the average loan default probability against all UK postcodes to visualise how a borrower’s postcode affects the model’s loan default probability predictions, holding other borrower characteristics constant.
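
As a rough illustration, here is a minimal Python sketch of the procedure above. It assumes a hypothetical fitted binary classifier `model` exposing predict_proba (with any preprocessing, such as postcode encoding, handled inside the model pipeline) and a pandas DataFrame `X` of borrower characteristics; all names are placeholders rather than a reference implementation.

```python
# Minimal PDP sketch mirroring the steps above (assumed `model` and `X`).
import pandas as pd
import matplotlib.pyplot as plt

def partial_dependence(model, X: pd.DataFrame, feature: str) -> pd.Series:
    """Average predicted default probability for each unique value of `feature`."""
    averages = {}
    for value in sorted(X[feature].unique()):           # step 1: chosen feature's values
        X_modified = X.copy()
        X_modified[feature] = value                      # step 2: overwrite the feature everywhere
        probs = model.predict_proba(X_modified)[:, 1]    # P(default) for every row
        averages[value] = probs.mean()                   # step 2: average over the dataset
    return pd.Series(averages, name="avg_default_probability")

pdp = partial_dependence(model, X, feature="postcode")
pdp.plot(kind="bar")                                     # step 3: plot average prediction per value
plt.ylabel("Average predicted default probability")
plt.show()
```

In practice, scikit-learn’s sklearn.inspection.PartialDependenceDisplay provides a ready-made implementation of partial dependence plots.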

If the PDP shows significantly higher default probabilities for applicants from deprived areas, for example, this could indicate a bias in the model. Further analysis can be conducted to determine whether this is due to the model over-relying on geographic location to predict outcomes. If biases are detected, the lender can take steps to mitigate them, such as: adjusting or adding input features to better capture the intended risk factors; retraining the model on a different training dataset or using techniques like reweighting to offset any undue influence of geography; or implementing policies to ensure fair treatment of applicants from deprived areas.

Accumulated Local Effects (ALE) plots

ALE plots address a key limitation of PDPs: rather than averaging predictions over potentially unrealistic combinations of correlated features, they average changes in predictions within intervals of the feature of interest, using only the instances whose values actually fall in each interval. In credit risk analysis, ALE plots could help profile how different borrower characteristics interact to affect creditworthiness. Returning to the loan default prediction model example, an ALE plot may be generated as follows (a code sketch follows the list):

  1. Divide the feature’s range into intervals
    e.g. Group UK postcodes by geographic region.
  2. Calculate the change in model predictions within each interval
    e.g. For every entry in the dataset, set the postcode value to another postcode from the same geographic region and calculate the difference in the model’s loan default prediction between the original and modified postcodes.
  3. Accumulate these changes across intervals
    e.g. Sum up the local effects of postcode changes on loan default probability across regions.
  4. Plot the accumulated changes
    e.g. Visualise how an applicant’s postcode impacts the model’s predictions while accounting for interactions with other borrower characteristics.
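
The sketch below outlines the same idea for a numeric feature such as income, since a categorical feature like postcode would first need an ordering (for example by region). It assumes the same hypothetical `model` and DataFrame `X` as before and is only a simplified approximation of the full ALE procedure; libraries such as alibi and PyALE provide more complete implementations.

```python
# Simplified ALE sketch for a numeric feature (assumed `model` and `X`).
import numpy as np
import pandas as pd

def ale_curve(model, X: pd.DataFrame, feature: str, n_intervals: int = 10):
    # 1. Divide the feature's range into intervals using quantile-based edges
    edges = np.unique(np.quantile(X[feature], np.linspace(0, 1, n_intervals + 1)))
    effects, counts = [], []
    for i in range(len(edges) - 1):
        lower, upper = edges[i], edges[i + 1]
        mask = (X[feature] > lower) & (X[feature] <= upper)
        if i == 0:
            mask |= X[feature] == lower     # keep the minimum value in the first interval
        in_bin = X[mask]
        if in_bin.empty:
            effects.append(0.0)
            counts.append(0)
            continue
        # 2. Change in predicted default probability when the feature moves across
        #    the interval, using only instances that actually fall in the interval
        X_lo, X_hi = in_bin.copy(), in_bin.copy()
        X_lo[feature] = lower
        X_hi[feature] = upper
        diff = model.predict_proba(X_hi)[:, 1] - model.predict_proba(X_lo)[:, 1]
        effects.append(diff.mean())
        counts.append(len(in_bin))
    # 3. Accumulate the local effects and centre the curve around zero
    ale = np.cumsum(effects)
    ale = ale - np.average(ale, weights=np.maximum(counts, 1))
    return edges[1:], ale                   # 4. plot, e.g. plt.plot(edges[1:], ale)
```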

Permutation feature importance

This method measures the importance of each feature by evaluating the decrease in model performance when the feature’s values are randomly shuffled. In fraud detection, permutation feature importance could identify key indicators of fraudulent activity, helping to prioritise features for closer monitoring. In credit risk analysis, it could highlight borrower characteristics that are most critical for accurate loan default predictions. Still using the same example, here’s how the process might work (a code sketch using scikit-learn follows the list):

  1. Evaluate the model’s baseline performance against ground truth
    e.g. Assess the model’s predictions using a validation dataset made up of past borrowers who have already repaid the loan in full and past borrowers who have defaulted.
  2. Randomly shuffle the values of a specific feature
    e.g. Shuffle the income values in the validation dataset.
  3. Measure the model’s performance on the shuffled data
    e.g. Evaluate the model's accuracy with shuffled income values.
  4. Calculate the performance drop as the difference between the baseline performance and the performance after shuffling
    e.g. This indicates the importance of income in the model’s prediction accuracy.
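
scikit-learn ships a ready-made implementation of this procedure. The sketch below assumes a hypothetical fitted `model` and a labelled validation set (`X_val` as a DataFrame, `y_val` with 1 indicating default); the scoring metric is an illustrative choice.

```python
# Permutation feature importance via scikit-learn (assumed `model`, `X_val`, `y_val`).
from sklearn.inspection import permutation_importance

result = permutation_importance(
    model, X_val, y_val,
    scoring="roc_auc",      # baseline metric evaluated on the validation set
    n_repeats=10,           # shuffle each feature several times to reduce noise
    random_state=0,
)

# Mean drop in ROC AUC when each borrower characteristic is shuffled
for name, drop in sorted(zip(X_val.columns, result.importances_mean),
                         key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {drop:.4f}")
```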

Global surrogate models

These are interpretable models (e.g. decision trees) trained to approximate the predictions of a more complex model (such as a deep neural network). For algorithmic trading, global surrogate models could make trading strategies and heuristics more transparent. In credit risk analysis, they could provide a simpler, interpretable approximation of a complex risk assessment model, making it easier to understand and validate. Referring again to the example, a global surrogate model could be generated and used as follows (a code sketch follows the list):

  1. Generate predictions using the complex model
    e.g. Use the loan default prediction model to generate loan default probabilities for a dataset.
  2. Train an interpretable surrogate model on these predictions
    e.g. Train a decision tree on the dataset and the generated predictions.
  3. Assess the surrogate model’s performance
    e.g. Check how well the trained decision tree approximates the original model’s behaviour.
  4. Interpret the surrogate model
    e.g. Analyse the internal structure of the decision tree to understand how borrower characteristics (e.g. income, postcode) influence the model’s predictions.
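
A minimal sketch of this procedure, assuming a hypothetical fitted `model` with predict_proba and a numerically encoded DataFrame `X` of borrower characteristics; the tree depth and fidelity metric are illustrative choices.

```python
# Global surrogate sketch: a shallow decision tree fitted to the complex model's outputs.
from sklearn.tree import DecisionTreeRegressor, export_text
from sklearn.metrics import r2_score

# 1. Generate predictions with the complex model
y_complex = model.predict_proba(X)[:, 1]

# 2. Train an interpretable surrogate on those predictions
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, y_complex)

# 3. Check how faithfully the surrogate mimics the complex model (R^2 against its predictions)
print("Fidelity (R^2):", r2_score(y_complex, surrogate.predict(X)))

# 4. Inspect the tree's decision rules
print(export_text(surrogate, feature_names=list(X.columns)))
```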

Prototypes and criticisms

Prototypes are typical or representative examples of a certain class or characteristic, while criticisms are examples that are not well represented by the prototypes. In fraud detection, prototypes may represent typical fraudulent and non-fraudulent transactions, while criticisms may be edge cases. For algorithmic trading, prototypes may illustrate typical market conditions for different trading strategies, and criticisms may highlight atypical conditions. In credit risk analysis, prototypes could help understand typical profiles of creditworthy and non-creditworthy borrowers, while criticisms can highlight unusual cases. For example, creditworthy prototypes may be identified using clustering algorithms to find typical combinations of borrower characteristics found in low default probability borrowers. Criticisms may instead be identified using methods like anomaly detection to find low default probability borrowers that least resemble the prototypes in terms of borrower characteristics.
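
One possible sketch of this idea uses k-means centroids to pick prototypes among borrowers the model considers low risk and an isolation forest to flag criticisms. It is a simple stand-in for dedicated techniques such as MMD-critic; the hypothetical `model`, numerically encoded DataFrame `X`, risk threshold, and counts are all assumptions.

```python
# Prototypes via clustering, criticisms via anomaly detection (assumed `model` and `X`).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest
from sklearn.metrics import pairwise_distances_argmin

low_risk = X[model.predict_proba(X)[:, 1] < 0.1]    # borrowers the model deems low risk

# Prototypes: the real borrowers closest to each cluster centroid
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(low_risk)
prototype_idx = pairwise_distances_argmin(kmeans.cluster_centers_, low_risk)
prototypes = low_risk.iloc[prototype_idx]

# Criticisms: low-risk borrowers the anomaly detector finds most unusual
iso = IsolationForest(random_state=0).fit(low_risk)
scores = iso.score_samples(low_risk)                # lower score = more anomalous
criticisms = low_risk.iloc[np.argsort(scores)[:5]]
```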

Local model-agnostic methods

Local model-agnostic methods focus on explaining an individual result or prediction by providing a rationale for the specific instance. Here are some examples of local model-agnostic methods and their potential applications.

Local Interpretable Model-agnostic Explanations (LIME)

LIME approximates a model locally with an interpretable model to explain individual predictions. In fraud detection, LIME can explain why a specific transaction was flagged as suspicious. In credit risk analysis, it can provide insights into a loan default prediction for a specific borrower or prospective borrower. For example, LIME may be applied as follows (a code sketch follows the list):

  1. Select the instance for explanation
    e.g. Choose a loan default model’s prediction for a prospective borrower.
  2. Perturb the instance
    e.g. Slightly modify the prospective borrower’s net worth, income, and/or other borrower characteristics to create synthetic data points.
  3. Get predictions for synthetic data points
    e.g. Use the loan default prediction model to generate predictions for the synthetic data points.
  4. Train a simple interpretable model
    e.g. Fit a linear regression model to the synthetic data and respective predictions. This is a simple, interpretable model that approximates the loan default prediction model’s behaviour at the local level.
  5. Analyse the simple model’s coefficients
    e.g. Analyse the linear regression model to determine which borrower characteristics (e.g., net worth, income) were most influential in the loan default model’s prediction for the prospective borrower.
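
A sketch using the open-source lime package, assuming numerically encoded borrower features, a hypothetical fitted classifier `model`, its training data `X_train` (a DataFrame), and a single applicant row `applicant`.

```python
# LIME sketch for one applicant (assumed `model`, `X_train`, `applicant`).
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.asarray(X_train),
    feature_names=list(X_train.columns),
    class_names=["repaid", "defaulted"],
    mode="classification",
)

# Perturb the applicant's characteristics, query the model, and fit a local linear model
explanation = explainer.explain_instance(
    data_row=np.asarray(applicant),
    predict_fn=model.predict_proba,
    num_features=5,
)
print(explanation.as_list())   # weighted feature conditions behind this prediction
```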

SHAP (SHapley Additive exPlanation)

SHAP, based on the game theory concept of Shapley values, can offer a theoretically sound explanation of how each input feature contributes to a model’s output. SHAP has broadly similar use cases to LIME. Although SHAP is more computationally intensive, the intuition behind it can be illustrated for a specific loan default prediction in our example scenario as follows (a code sketch follows the list):

  1. Treat each input feature as a “player”
    e.g. Consider each borrower characteristic (e.g., income, postcode) as an individual contributor to the loan default prediction made by the model.
  2. For each feature, compute the average marginal contribution across all possible feature combinations
    e.g. Calculate the average marginal contribution of each borrower characteristic to the model’s prediction across all possible combinations of borrower characteristics.
  3. Interpret the resulting Shapley values
    e.g. Each borrower characteristic’s average marginal contribution is its Shapley value, quantifying its individual impact on the loan default model’s prediction; the Shapley values sum to the difference between the model’s prediction for the applicant and the average prediction.
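
A sketch using the shap package. Because computing exact Shapley values over every feature combination grows exponentially with the number of features, shap estimates them; the hypothetical `model`, background data `X_background`, and rows to explain `X_explain` are assumptions.

```python
# SHAP sketch explaining predicted default probabilities (assumed names throughout).
import shap

def predict_default(data):
    # Explain the probability of the "default" class only
    return model.predict_proba(data)[:, 1]

explainer = shap.Explainer(predict_default, X_background)   # model-agnostic explainer
shap_values = explainer(X_explain)

# Each applicant's Shapley values sum to the difference between their predicted
# default probability and the average prediction over the background data
shap.plots.waterfall(shap_values[0])
```

For tree-based models, shap.TreeExplainer can compute Shapley values far more efficiently than model-agnostic estimation.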

Counterfactual explanations

These explanations show the smallest changes to input features that would alter a model’s output. In algorithmic trading, counterfactual explanations could help traders understand how slight changes in market conditions could impact whether or not a trade is executed at a given point in time. In credit risk analysis, counterfactual explanations may provide a loan applicant who is refused a loan with suggested actions that would make them eligible for the loan. For example, suppose that the loan applicant is refused the loan due to the model predicting a loan default probability that exceeds an accepted risk threshold for loan eligibility. In this scenario, counterfactual explanations could be provided as follows (a code sketch follows the list):

  1. Start with the original instance’s values
    e.g. Identify the applicant’s borrower characteristics provided as input to the loan default prediction model.
  2. Search for alternative values
    e.g. Use optimisation algorithms to find the smallest possible changes to one or more of the applicant’s borrower characteristics that would bring the model’s prediction below the accepted risk threshold for loan eligibility.
  3. Present the counterfactual
    e.g. Explain to the applicant which changes (e.g. increase income by 10%, reduce debt by 40%) would make them eligible for the requested loan.
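
A deliberately naive sketch of the search step: it scans a small grid of adjustments to two hypothetical features, income and debt, and keeps the smallest change that brings the predicted default probability under an assumed threshold. Dedicated libraries such as DiCE or alibi implement far more principled search strategies; `model`, `applicant`, the threshold, and the feature names are placeholders, and features are assumed to be numerically encoded.

```python
# Brute-force counterfactual search sketch (assumed `model` and `applicant` Series).
import itertools

THRESHOLD = 0.2   # assumed risk threshold for loan eligibility

def find_counterfactual(model, applicant, income_steps, debt_steps):
    best = None
    for inc_mult, debt_mult in itertools.product(income_steps, debt_steps):
        candidate = applicant.copy()
        candidate["income"] *= inc_mult
        candidate["debt"] *= debt_mult
        prob = model.predict_proba(candidate.to_frame().T)[0, 1]
        change = abs(inc_mult - 1) + abs(debt_mult - 1)   # crude "size of change" measure
        if prob < THRESHOLD and (best is None or change < best[0]):
            best = (change, inc_mult, debt_mult, prob)
    return best

result = find_counterfactual(model, applicant,
                             income_steps=[1.0, 1.1, 1.2, 1.3],
                             debt_steps=[1.0, 0.8, 0.6, 0.4])
if result is not None:
    _, inc, debt, prob = result
    print(f"Increase income by {inc - 1:.0%} and reduce debt by {1 - debt:.0%} "
          f"to bring the predicted default probability to {prob:.2f}")
```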

Individual Conditional Expectation (ICE) plots

ICE plots show the relationship between a feature and the predicted outcome for individual instances. For algorithmic trading, ICE plots could illustrate how market indicators influence trading decisions for individual trades. In credit risk analysis, ICE plots could provide personalised insights into how a loan applicant’s borrower characteristics affect their loan eligibility. An ICE plot for a loan applicant, for example, may be generated as follows (a code sketch follows the list):

  1. Vary one feature over a range; fix all other features to their original values
    e.g. Vary the applicant’s income over an income range while keeping other borrower characteristics unchanged, creating a new synthetic data point for each value in the income range.
  2. Compute predictions for each synthetic data point
    e.g. Use the model to generate loan default predictions for each synthetic data point.
  3. Plot the feature’s values against predictions
    e.g. Visualise how changes in income would affect loan eligibility for the applicant.
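
A minimal sketch for a single applicant, assuming a hypothetical fitted `model` with predict_proba, an applicant row `applicant` (a pandas Series of numerically encoded features), and an illustrative income grid.

```python
# ICE sketch: vary income for one applicant, holding everything else fixed.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

income_range = np.linspace(20_000, 120_000, 50)            # hypothetical income grid

# 1-2. Build one synthetic row per income value and score each with the model
synthetic = pd.DataFrame([applicant] * len(income_range)).reset_index(drop=True)
synthetic["income"] = income_range
default_probs = model.predict_proba(synthetic)[:, 1]

# 3. Plot income against the predicted default probability for this applicant
plt.plot(income_range, default_probs)
plt.xlabel("Income (£)")
plt.ylabel("Predicted default probability")
plt.show()
```

scikit-learn can also produce ICE curves directly via PartialDependenceDisplay.from_estimator with kind='individual'.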

Anchors

Anchors provide rules-based explanations for individual predictions. For algorithmic trading, anchors can explain specific trading decisions by articulating implicit rules or heuristics. In credit risk analysis, anchors can offer rules-based explanations for an applicant’s loan approval or denial. At a high level, anchors may be generated from the following steps (potentially iterating over steps 1-2 and using search techniques such as multi-armed bandits to generate improved candidates at each new iteration; a simplified code sketch follows the list):

  1. Generate candidate rules
    e.g. Generate a set of if-then rules based on key borrower characteristics (e.g. income, credit history) that are most likely to explain the model’s prediction for the applicant.
  2. Evaluate the accuracy of each rule
    e.g. Test each candidate rule on perturbed applicants that satisfy the rule, measuring how often the model’s prediction stays the same (the rule’s precision).
  3. Select and present the most accurate rule set
    e.g. Choose the set of rules that best explains the model’s prediction and apply the rules to generate a suitable explanation, such as: “You meet the basic income and credit history requirements. However, as you are self-employed, you must also meet additional income requirements for loan eligibility. You do not meet the additional income requirements at present, so you are not currently eligible for this loan.”
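
A simplified sketch of the evaluation step (steps 2-3), scoring candidate if-then rules by their precision, i.e. how often the model gives the same decision on perturbed applicants that satisfy the rule. The rules, thresholds, and the `perturbed_applicants` DataFrame are hypothetical, features are assumed to be numerically encoded, and real implementations, such as the one in the alibi library, also optimise rule coverage and search candidates adaptively.

```python
# Anchor-style rule evaluation sketch (assumed `model`, `applicant`, `perturbed_applicants`).
import numpy as np
import pandas as pd

def precision(rule, model, perturbed: pd.DataFrame, original_decision: int) -> float:
    """Share of perturbed applicants satisfying `rule` that keep the original decision."""
    covered = perturbed[perturbed.apply(rule, axis=1)]
    if covered.empty:
        return 0.0
    return float(np.mean(model.predict(covered) == original_decision))

# 1. Candidate rules based on key borrower characteristics (hypothetical thresholds)
candidate_rules = {
    "income >= 30k": lambda row: row["income"] >= 30_000,
    "income >= 30k and credit_score >= 650":
        lambda row: row["income"] >= 30_000 and row["credit_score"] >= 650,
}

# 2-3. Score each rule on perturbed copies of the applicant and keep the most precise
original_decision = model.predict(applicant.to_frame().T)[0]
scores = {name: precision(rule, model, perturbed_applicants, original_decision)
          for name, rule in candidate_rules.items()}
print(max(scores, key=scores.get), scores)
```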

Challenges and current research

Many of these interpretability methods are computationally intensive, making them difficult to apply to large datasets without relying on high-performance hardware. Current research has been testing high-performance solutions that could be rolled out across the financial services sector. For example, a recent case study from Nvidia showed promising results using graphics processing units (GPUs) to accelerate SHAP for risk management, assessment and scoring of credit portfolios in traditional banks, as well as in fintech platforms for peer-to-peer (P2P) lending and crowdfunding. This was a major undertaking that saw Nvidia collaborate with Hewlett Packard Enterprise, more than 20 universities, and the European supervisory and financial services community, including the Gaia-X Financial AI Cluster (FAIC) project.

Given the current trajectory, there seems to be no reason why XAI should not be adopted at scale in the financial services sector in the not-so-distant future.

 

For a deeper dive into machine learning interpretability methods from a purely technical perspective, I highly recommend Christoph Molnar's Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed., 2022), which I also referenced while writing this post.