Reactor Example
An example to showcase the rich explanations from the Diveplane client.
Diveplane Reactor gives Python users access to Diveplane's full functionality through a user-friendly API.
Below is an example use of the diveplane.reactor.Trainee class. The example covers the basic steps in a typical Diveplane workflow:
Creation of a trainee.
Training.
Analysis of the trainee.
Reaction by the trainee to new data, which produces predicted values for the action features.
Extraction of audit data that explains the prediction.
Reading breast cancer data set.
Training on a random selection of 80% of the data.
Number of trained cases: 546
Analyzing the trainee.
Reacting to 20% reserve test data.
Prediction stats:
Accuracy: 95.8%
Mean Absolute Error: 0.05342250465543175
Printing details for most similar cases from the first prediction:
[{'.distance': 15.432275391989362,
'.session': 'c61b1454-f694-4d3e-b9e5-9242ff2e8172',
'.session_training_index': 120,
'x1': 5,
'x2': 1,
'x3': 3,
'x4': 1,
'x5': 2,
'x6': 1,
'x7': 2,
'x8': 1,
'x9': 1,
'y': 0},
{'.distance': 15.432275391989362,
'.session': 'c61b1454-f694-4d3e-b9e5-9242ff2e8172',
'.session_training_index': 73,
'x1': 5,
'x2': 1,
'x3': 3,
'x4': 1,
'x5': 2,
'x6': 1,
'x7': 2,
'x8': 1,
'x9': 1,
'y': 0},
{'.distance': 15.432275391989362,
'.session': 'c61b1454-f694-4d3e-b9e5-9242ff2e8172',
'.session_training_index': 461,
'x1': 5,
'x2': 1,
'x3': 3,
'x4': 1,
'x5': 2,
'x6': 1,
'x7': 1,
'x8': 1,
'x9': 1,
'y': 0}]
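The three most similar cases above are nearly identical, so it can help to see which features actually differ. The following is a minimal sketch in plain Python, not part of the Diveplane API: `differing_features` is a hypothetical helper, and treating keys that start with a dot (such as `.distance`) as metadata is an assumption based on the printed output.

```python
def differing_features(cases):
    """Return {feature: [values...]} for features whose values are not all equal.

    Keys beginning with '.' are treated as metadata and skipped. This is an
    assumption based on the printed output, not documented Diveplane behavior.
    """
    feature_names = [key for key in cases[0] if not key.startswith(".")]
    differences = {}
    for name in feature_names:
        values = [case[name] for case in cases]
        if len(set(values)) > 1:
            differences[name] = values
    return differences


# The three cases from the output above, rebuilt from their shared values.
shared = {"x1": 5, "x2": 1, "x3": 3, "x4": 1, "x5": 2, "x6": 1,
          "x8": 1, "x9": 1, "y": 0}
most_similar = [
    {".distance": 15.432275391989362, ".session_training_index": 120, "x7": 2, **shared},
    {".distance": 15.432275391989362, ".session_training_index": 73, "x7": 2, **shared},
    {".distance": 15.432275391989362, ".session_training_index": 461, "x7": 1, **shared},
]

print(differing_features(most_similar))  # {'x7': [2, 2, 1]}
```

For the output shown, only `x7` varies across the three cases, and all three share the same action value `y`.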
import os
from diveplane import reactor
from diveplane.utilities import infer_feature_attributes
import pandas as pd
from pprint import pprint
# sphinx_gallery_thumbnail_path = '_static/gallery/diveplane-ai.png'
# Path to the breast cancer data. As written, this is resolved relative to the
# current working directory.
data_path = os.path.join("breast_cancer.csv")
# Read in the breast cancer data.
print("Reading breast cancer data set.")
df = pd.read_csv(data_path)
# Define features for the trainee.
features = infer_feature_attributes(df)
feature_names = df.columns.tolist()
action_features = feature_names[-1:]
context_features = feature_names[:-1]
# Shuffle the data.
df = df.sample(frac=1).reset_index(drop=True)
# Split the data into an 80% training set and 20% test set.
test_percent = 0.2
# Use a single split index so that every row lands in exactly one of the sets.
split_index = int(len(df) * (1 - test_percent))
data_train = df[:split_index]
data_test = df[split_index:]
# Remove the target column from the test set
data_test_no_target = data_test.drop(action_features, axis=1)
# Create the trainee, using a context manager so resources are released
# once complete.
with reactor.Trainee(features=features) as t:
    # Train the cases into the trainee.
    print("Training on a random selection of 80% of the data.")
    t.train(data_train)
    print(f"Number of trained cases: {t.get_num_training_cases()}")
    # Run analysis on the trainee.
    print("Analyzing the trainee.")
    t.analyze(context_features, action_features)
    # React to the trainee with the context feature values.
    print("Reacting to 20% reserve test data.")
    details = {'feature_mda': True,
               'feature_residuals': True,
               'influential_cases': True,
               'num_most_similar_cases': 3,
               'num_boundary_cases': 3,
               'case_feature_residuals': True}
    result = t.react(
        data_test_no_target,
        action_features=action_features,
        context_features=context_features,
        details=details)
    # Retrieve the prediction stats from the trainee
    t.react_into_trainee(residuals=True)
    stats = t.get_prediction_stats(stats=['accuracy', 'mae'])
    accuracy = stats[action_features[0]]['accuracy']
    mae = stats[action_features[0]]['mae']
    # Print the accuracy of the reaction
    print("Prediction stats:")
    print(f"Accuracy: {accuracy:.1%}")
    print(f"Mean Absolute Error: {mae}")
    # Print a detailed result from audit details.
    print("Printing details for most similar cases from the first prediction:")
    pprint(result['explanation']['most_similar_cases'][0])
Total running time of the script: (0 minutes 22.731 seconds)
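The script reports two prediction stats, accuracy and mean absolute error. As a rough illustration of what these two metrics measure in general (not of how Diveplane computes them internally), here is a minimal sketch in plain Python. The helper functions and the four-element label lists are hypothetical and are not taken from the breast cancer data.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that exactly match the actual values."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical labels for illustration only.
actual = [0, 1, 1, 0]
predicted = [0, 1, 0, 0]

print(f"Accuracy: {accuracy(actual, predicted):.1%}")  # Accuracy: 75.0%
print(f"Mean Absolute Error: {mean_absolute_error(actual, predicted)}")  # 0.25
```

On hard 0/1 labels like these, the mean absolute error is simply one minus the accuracy. The stats printed by the script do not follow that relationship exactly, which suggests the library derives them differently; treat this sketch as a definition of the metrics only.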