In this blog I’ll demonstrate how to explain black-box ML models trained on numeric (tabular) data using LIME. Most of the examples I found for this Python package deal with image data, which is why I wanted to share my journey toward explaining tabular data with LIME.
What is in this tutorial
- Background about LIME
- Training and using black-box KNN and XGBoost models for credit card fraud detection.
- Using LIME to explain decisions made by the KNN and XGBoost models.
What is LIME
According to the authors of the paper, LIME is
a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
This video presentation helped me a lot in understanding LIME.
After reading the paper and watching the presentation, I wanted to experiment with LIME on the Credit Card Fraud Detection dataset. I used the KNN and XGBoost algorithms to classify transactions into the “Fraud” and “Non-Fraud” classes.
Training The Model
Since the classification part of my code mostly follows this tutorial on the fraud detection dataset, I won’t focus much on it. The code I used to train and evaluate the models is given below:
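The training step can be sketched as follows. This is a minimal, self-contained version: synthetic data from `make_classification` stands in for the Kaggle `creditcard.csv` file (30 numeric features, heavily imbalanced classes), and scikit-learn’s `GradientBoostingClassifier` substitutes for XGBoost when the `xgboost` package isn’t installed.

```python
# Sketch of training KNN and a boosted-tree model on fraud-like data.
# NOTE: synthetic data stands in for creditcard.csv; GradientBoostingClassifier
# is a stand-in when xgboost is unavailable.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

try:
    from xgboost import XGBClassifier as BoostedClassifier
except ImportError:
    from sklearn.ensemble import GradientBoostingClassifier as BoostedClassifier

# Stand-in for loading creditcard.csv: 30 features, ~5% fraud
X, y = make_classification(n_samples=2000, n_features=30,
                           weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Fit both black-box models
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
xgb = BoostedClassifier().fit(X_train, y_train)

# Evaluate each model on the held-out split
for name, model in [("KNN", knn), ("XGBoost", xgb)]:
    print(name)
    print(classification_report(y_test, model.predict(X_test),
                                target_names=["Non-Fraud", "Fraud"]))
```

With the real dataset you would replace the `make_classification` call with `pd.read_csv("creditcard.csv")` and split off the `Class` column as the label.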
The output of the above code looks like this:
Explaining the model using LIME
First, let’s list all the attributes of the dataset so they can be labeled properly in the explanations:
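The Credit Card Fraud Detection dataset has a fixed, well-known layout: a `Time` column, the anonymized PCA components `V1`–`V28`, and `Amount` (with `Class` as the label). Building the feature-name list is a one-liner:

```python
# Feature names of the credit card fraud dataset ("Class" is the label,
# so it is excluded from the feature list)
feature_names = ["Time"] + ["V%d" % i for i in range(1, 29)] + ["Amount"]
print(feature_names)
```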
Now, let’s create an explainer from this feature set and dataset, and then pick a random instance that we want to explain using LIME.
Now we can explain the i-th instance of the dataset using LIME for both models:
The output would be something like this: