
Deepchecks Open Source
Deepchecks Open Source is a Python package for comprehensively validating your machine-learning models and data with minimal effort.
Key Capabilities of Deepchecks Open Source
Data Integrity
When you have a fresh dataset and want to validate your data’s correctness and uncover inconsistencies such as conflicting labels or data duplicates.
Model Evaluation
When you have a trained model and want to examine performance metrics, compare it to various benchmarks, and build a clear, granular picture of the model’s behavior (for example, whether there are segments where it underperforms).
Train-Test Validation
When you have separate datasets (such as train and test or training data collected at different times) and want to validate that they are representative of each other and don’t have issues such as drift or leakage.
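To make these issues concrete, here is a minimal plain-Python sketch of three of them: duplicate rows, conflicting labels, and train-test leakage in the form of samples shared between the two sets. It is an illustration only and does not use Deepchecks; the function names and the toy data format (feature tuples plus a label list) are assumptions made for this example.

```python
from collections import defaultdict


def find_duplicates_and_conflicts(rows, labels):
    """Group identical feature rows; report duplicates and label conflicts.

    rows   -- list of tuples of feature values (hypothetical toy format)
    labels -- list of labels, one per row
    """
    labels_by_row = defaultdict(list)
    for features, label in zip(rows, labels):
        labels_by_row[features].append(label)

    # Same features and same label, seen more than once: a plain duplicate.
    duplicates = {f for f, ls in labels_by_row.items()
                  if len(ls) > 1 and len(set(ls)) == 1}
    # Same features carrying different labels: a conflict.
    conflicts = {f for f, ls in labels_by_row.items() if len(set(ls)) > 1}
    return duplicates, conflicts


def find_leaked_rows(train_rows, test_rows):
    """Return feature rows that appear in both the train and the test set."""
    return set(train_rows) & set(test_rows)


rows = [("red", 1), ("red", 1), ("blue", 2), ("blue", 2), ("green", 3)]
labels = ["yes", "yes", "yes", "no", "no"]
duplicates, conflicts = find_duplicates_and_conflicts(rows, labels)
leaked = find_leaked_rows(rows, [("green", 3), ("black", 9)])

print("duplicates:", duplicates)
print("conflicting labels:", conflicts)
print("rows shared by train and test:", leaked)
```

Exact matching like this only catches the simplest cases; it says nothing about near-duplicates or distribution drift.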
Deepchecks Open Source: For ML Practitioners During the Research Phase
Deepchecks Open Source is a Python package for comprehensively validating your machine-learning models and data with minimal effort. It includes checks related to various issues, such as model performance, data integrity, distribution mismatches, and more. Model and data validation is one of the most important processes that data scientists and ML engineers deal with while scaling up from the “laboratory phase” to ML systems that provide continuous value. Whether your main interest is testing, CI/CD, or model auditing, we recommend “kicking the tires” with Deepchecks Open Source as a first step.
How Does It Work?
Suites are composed of checks. Each check contains outputs displayed in a notebook and/or conditions with a pass/fail output.
Conditions can be added or removed from a check;
Checks can be edited, added to a suite, or removed from it;
Suites can be created from scratch or forked from an existing suite.
The “Report” begins with a list of the Conditions (passed or not) followed by the Display Data (from the various checks).
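The relationship between suites, checks, conditions, and the report can be sketched with a toy model in plain Python. The classes `ToyCheck` and `ToySuite` are hypothetical and written only for this illustration; they are not Deepchecks classes and do not mirror its API. The sketch shows a check with a condition, a check that only produces display data, and a report that lists condition outcomes before the display data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class ToyCheck:
    """A check computes a value and evaluates optional pass/fail conditions."""
    name: str
    compute: Callable[[List[Any]], Any]
    conditions: List[Tuple[str, Callable[[Any], bool]]] = field(default_factory=list)

    def add_condition(self, description, predicate):
        self.conditions.append((description, predicate))
        return self  # returning self allows chained calls

    def run(self, data):
        value = self.compute(data)
        outcomes = [(desc, pred(value)) for desc, pred in self.conditions]
        return {"check": self.name, "value": value, "conditions": outcomes}


@dataclass
class ToySuite:
    """A suite is an ordered collection of checks run on the same data."""
    name: str
    checks: List[ToyCheck] = field(default_factory=list)

    def run(self, data):
        results = [check.run(data) for check in self.checks]
        # Report layout: condition outcomes first, then each check's display data.
        lines = [f"Suite: {self.name}", "Conditions:"]
        for r in results:
            for desc, passed in r["conditions"]:
                status = "PASS" if passed else "FAIL"
                lines.append(f"  [{status}] {r['check']}: {desc}")
        lines.append("Display data:")
        for r in results:
            lines.append(f"  {r['check']}: {r['value']}")
        return results, "\n".join(lines)


null_ratio = ToyCheck("null ratio", lambda xs: sum(x is None for x in xs) / len(xs))
null_ratio.add_condition("ratio is at most 0.1", lambda v: v <= 0.1)
unique_count = ToyCheck("unique values", lambda xs: len(set(xs)))  # display only

suite = ToySuite("toy integrity suite", [null_ratio, unique_count])
results, report = suite.run([1, 2, None, 2, 5])
print(report)
```

Keeping conditions separate from the computed value is what lets the same check serve both as an exploratory display and as a pass/fail gate.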
Key Features & Checks
Data Integrity
from deepchecks.tabular.checks import StringMismatch
# ds is assumed to be a deepchecks.tabular.Dataset wrapping your data
StringMismatch().run(ds)
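As a rough idea of what a string-mismatch check looks for, the sketch below groups the raw strings in a column by a normalized base form and reports base forms that have more than one spelling. The normalization rule (lowercase, non-alphanumeric characters dropped) and the helper name `string_variants` are assumptions for this example, not a description of how Deepchecks implements the check.

```python
import re
from collections import defaultdict


def string_variants(values):
    """Map each normalized base form to the distinct raw strings sharing it."""
    groups = defaultdict(set)
    for value in values:
        # Assumed normalization: lowercase, then drop non-alphanumeric characters.
        base = re.sub(r"[^a-z0-9]", "", value.lower())
        groups[base].add(value)
    # Only base forms with more than one raw spelling count as a mismatch.
    return {base: sorted(raw) for base, raw in groups.items() if len(raw) > 1}


column = ["Deep", "deep", "deep!!!", "earth", "foo", "bar", "foo?"]
variants = string_variants(column)
print(variants)
```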
Methodology Issues

from deepchecks.tabular.checks import BoostingOverfit
# train_ds and validation_ds are assumed to be Dataset objects; clf is a fitted boosting model
BoostingOverfit().run(train_ds, validation_ds, clf)
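The idea behind a boosting-overfit check can be illustrated without any library: if the validation score peaks at some boosting iteration and then declines, the later iterations are likely fitting noise. The sketch below takes a hypothetical list of per-iteration validation scores (made-up numbers) and reports the best iteration and the relative decline from that peak to the final iteration. The function name and the decline formula are assumptions for this illustration.

```python
def validation_decline(val_scores):
    """Return (best_iteration, relative decline from best to final score).

    val_scores -- validation score after each boosting iteration,
                  assumed positive and higher-is-better.
    """
    best = max(val_scores)
    best_iteration = val_scores.index(best) + 1  # iterations counted from 1
    decline = (best - val_scores[-1]) / best
    return best_iteration, decline


# Hypothetical scores: validation accuracy improves, then degrades.
val_scores = [0.70, 0.78, 0.82, 0.80, 0.76, 0.74]
best_iteration, decline = validation_decline(val_scores)
print(f"best iteration: {best_iteration}, decline since best: {decline:.1%}")
```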

Distribution Checks

from deepchecks.tabular.checks import FeatureDrift
# train_dataset and test_dataset are assumed to be Dataset objects; model is a fitted model
FeatureDrift().run(train_dataset, test_dataset, model)
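For intuition about what a drift score measures, the sketch below computes the Population Stability Index (PSI), one common drift measure for a single numeric feature. It is chosen here only because it is easy to write in plain Python; it is not necessarily the measure that `FeatureDrift` uses. The bin count, the equal-width binning over the training range, and the small floor that avoids `log(0)` are all assumptions of this example.

```python
import math


def population_stability_index(train_values, test_values, bins=4):
    """PSI between two samples, using equal-width bins over the train range."""
    lo, hi = min(train_values), max(train_values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the train range
            counts[idx] += 1
        # Floor each proportion so that empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(train_values)
    q = proportions(test_values)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


train = list(range(100))
shifted = [v + 30 for v in train]

psi_same = population_stability_index(train, train)
psi_shifted = population_stability_index(train, shifted)
print("PSI, identical samples:", psi_same)
print("PSI, shifted sample:", psi_shifted)
```

Every term of the sum is non-negative, so the score is zero only when the binned proportions match and grows as they diverge.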

Performance Checks

from deepchecks.tabular.checks import WeakSegmentsPerformance
WeakSegmentsPerformance().run(validation_ds, model)
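The question a weak-segment check answers is whether the model under-performs on some slice of the data. The sketch below shows the simplest version of that idea: accuracy computed per value of one hand-picked segment column, with the lowest-scoring segment reported. The data is made up, the segment column is chosen manually, and the function name is hypothetical; how the Deepchecks check selects its segments is not covered here.

```python
from collections import defaultdict


def accuracy_by_segment(segments, y_true, y_pred):
    """Return {segment: accuracy} for predictions grouped by segment value."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for seg, truth, pred in zip(segments, y_true, y_pred):
        totals[seg] += 1
        correct[seg] += int(truth == pred)
    return {seg: correct[seg] / totals[seg] for seg in totals}


# Hypothetical validation data with one segmenting feature.
segments = ["young", "young", "young", "old", "old", "old", "old"]
y_true = [1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1]

scores = accuracy_by_segment(segments, y_true, y_pred)
weakest = min(scores, key=scores.get)
print("accuracy per segment:", scores)
print("weakest segment:", weakest)
```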


Suites of Checks

from deepchecks.tabular.suites import train_test_validation
train_test_validation().run(ds_train, ds_test)

Open Source & Community
Deepchecks is committed to keeping the ML validation package open-source and community-focused.