Validation Report

Validation reports are generated to evaluate a model's quality. Rather than writing your own evaluation functions, you can use this report to quickly understand how your models perform.

A validation report is a markdown report made up of several phases. Each phase examines a different aspect of the model, and together they let you infer its real-world performance.

How to access the feature?

from sanatio.validations.routines import ValidationRoutine

val_object = ValidationRoutine(...parameters...)

Parameters required for initialization

Not all parameters are required to create an object. Pass only the parameters needed by the routine you intend to run.

All parameters default to None. If a parameter that a specific routine needs is missing, you will receive an error message indicating which parameter is missing.
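The behaviour described above (everything defaults to None, and a routine complains only about the parameters it needs) can be illustrated with a minimal sketch. The class RoutineSketch, its _require helper, and the ValueError it raises are hypothetical stand-ins written for this illustration; they are not Sanatio's actual implementation, and the real error type and message may differ.

```python
class RoutineSketch:
    """Hypothetical stand-in for illustration; NOT Sanatio's ValidationRoutine."""

    def __init__(self, predicted=None, actual=None, data=None):
        # Every parameter is optional at construction time.
        self.predicted = predicted
        self.actual = actual
        self.data = data

    def _require(self, *names):
        # Collect the names of required parameters that are still None.
        missing = [name for name in names if getattr(self, name) is None]
        if missing:
            raise ValueError("Missing parameter(s): " + ", ".join(missing))

    def demo_routine(self):
        # This made-up routine needs only `predicted` and `actual`.
        self._require("predicted", "actual")
        return "ok"


# `data` was never passed, but demo_routine does not need it.
print(RoutineSketch(predicted=[1, 0], actual=[1, 1]).demo_routine())
```

Checking at call time rather than in the constructor is what lets one class serve several routines with different parameter needs.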

Routines

Routines generate different validation report structures depending on the type of model you use.

The routines available in Sanatio, with their respective function calls, are:

  • Binary logistic regression routine

    • binary_logistic_regression_routine()

  • Linear regression routine

    • linear_regression_routine()

  • Tree based classification routine

    • tree_based_classification_routine()
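If the model type is only known at run time, the routine can be looked up by name. The sketch below is one possible way to do that: the dictionary keys and the run_routine helper are hypothetical names chosen for this example, while the method names are the three listed above. It works with any object that exposes those methods, so it is shown here with a small fake object rather than a real ValidationRoutine.

```python
# Hypothetical model-kind labels mapped to the routine methods listed above.
ROUTINE_BY_MODEL = {
    "binary_logistic_regression": "binary_logistic_regression_routine",
    "linear_regression": "linear_regression_routine",
    "tree_based_classification": "tree_based_classification_routine",
}


def run_routine(val_object, model_kind):
    """Call the routine on val_object that matches model_kind."""
    try:
        method_name = ROUTINE_BY_MODEL[model_kind]
    except KeyError:
        raise ValueError(f"No routine for model kind {model_kind!r}") from None
    return getattr(val_object, method_name)()


class FakeValidation:
    """Stand-in object used only to demonstrate the dispatch."""

    def linear_regression_routine(self):
        return "linear report"


print(run_routine(FakeValidation(), "linear_regression"))
```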

How to validate the model?

  1. Import the ValidationRoutine class from sanatio.validations.routines.

  2. Create the validation object and initialize it with the parameters required for your model.

  3. Call the specific routine function you need.

from sanatio.validations.routines import ValidationRoutine

val_object = ValidationRoutine(...parameters...)
val_object.binary_logistic_regression_routine()

Example

import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from sanatio.validations.routines import ValidationRoutine

# Load the dataset once; feature_names is reused for the DataFrame columns
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit the model to be validated
lr = LogisticRegression()
lr.fit(X_train, y_train)

# Collect the inputs the routine needs
weights = lr.coef_
prediction = lr.predict(X_test)
two_class = lr.predict_proba(X_test)  # probabilities for both classes
one_class = two_class[:, 1]           # probability of the positive class only

obj = ValidationRoutine(predicted=prediction, actual=y_test,
                        weight=weights.transpose(),
                        data=pd.DataFrame(X_test, columns=data.feature_names),
                        predicted_probability=one_class,
                        two_class_probability=two_class,
                        pearson_threshold=0.4, vif_factor=5, cat_columns=None)

# Generate the validation report for a binary logistic regression model
obj.binary_logistic_regression_routine()
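This page does not describe what the routine does with pearson_threshold. Assuming it is used to flag pairs of features whose absolute Pearson correlation exceeds the threshold, a check of that kind could be sketched as follows. The functions pearson and correlated_pairs are hypothetical helpers written for this illustration and are not part of Sanatio; the toy columns are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant column would make this denominator zero.
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy data: "b" is exactly twice "a"; "c" is uncorrelated with both.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, -1, 1]}
for name_a, name_b, r in correlated_pairs(toy, threshold=0.4):
    print(name_a, name_b, round(r, 3))
```

Under this assumption, a lower threshold flags more feature pairs and a higher one flags fewer.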
