ML Model Testing
An open-source solution for comprehensively validating your machine learning models and data with minimal effort, in both the research and the production phases.
Key Capabilities of ML Testing
When you have a fresh dataset and want to validate your data's correctness and uncover inconsistencies such as conflicting labels or data duplicates.
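The two inconsistencies named above can be illustrated with a minimal, library-free sketch (the toy dataset and helper names here are hypothetical, not the Deepchecks implementation): it flags exact duplicate samples and samples whose identical features carry conflicting labels.

```python
# Minimal sketch (not the Deepchecks implementation): flag exact
# duplicates and conflicting labels in a tiny tabular dataset.
from collections import defaultdict

samples = [  # (features, label) pairs; hypothetical data
    (("red", "small"), "apple"),
    (("red", "small"), "apple"),       # exact duplicate
    (("yellow", "long"), "banana"),
    (("yellow", "long"), "plantain"),  # same features, different label
]

def find_duplicates(rows):
    """Return (features, label) pairs that appear more than once."""
    counts = defaultdict(int)
    for features, label in rows:
        counts[(features, label)] += 1
    return [key for key, n in counts.items() if n > 1]

def find_conflicting_labels(rows):
    """Return feature tuples that are mapped to more than one label."""
    labels = defaultdict(set)
    for features, label in rows:
        labels[features].add(label)
    return [f for f, ls in labels.items() if len(ls) > 1]

print(find_duplicates(samples))          # the duplicated sample
print(find_conflicting_labels(samples))  # features with conflicting labels
```

A real integrity suite runs many such checks at once; this only shows the kind of inconsistency they surface.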
When you have a trained model and want to examine performance metrics, compare it to various benchmarks, and create a clear and granular picture for validating the model's behavior (e.g. are there segments where it underperforms?).
When you have separate datasets (such as train and test, or training data collected at different times) and want to validate that they are representative of each other and don't have issues such as drift or leakage.
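As a rough illustration of those two issues (a crude sketch, not the algorithms Deepchecks uses): leakage can show up as test samples that also appear verbatim in train, and drift as a shift in a feature's distribution between the two splits.

```python
# Crude sketch of two train/test issues (not the Deepchecks algorithms):
# leakage as sample overlap, drift as a shift in a feature's mean.

def leakage_ratio(train_rows, test_rows):
    """Fraction of test samples that also appear verbatim in train."""
    train_set = set(train_rows)
    overlap = sum(1 for row in test_rows if row in train_set)
    return overlap / len(test_rows)

def mean_shift(train_values, test_values):
    """Absolute difference between train and test means of one feature."""
    mean = lambda xs: sum(xs) / len(xs)
    return abs(mean(train_values) - mean(test_values))

train = [(1.0, "a"), (2.0, "b"), (3.0, "c"), (4.0, "d")]
test = [(3.0, "c"), (5.0, "e")]

print(leakage_ratio(train, test))  # 0.5: one of two test rows leaked
print(mean_shift([r[0] for r in train], [r[0] for r in test]))  # 1.5
```

Production-grade drift detection uses proper statistical distances rather than a mean difference; the point is only what "representative of each other" means in practice.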
ML Validation Continuity from Research to Production
You can use the exact set (or a subset) of the checks that were used during research for CI/CD and production monitoring. That ensures that the deep knowledge your data science team has will be used by the ML engineers in later model/data lifecycle phases.
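In a CI/CD pipeline this continuity typically boils down to running the same suite and gating the step on whether all conditions passed. The sketch below stubs out the suite result object (the class and method names here are illustrative, not the Deepchecks API) to show just the gating logic:

```python
# Sketch of gating a CI step on suite conditions (stubbed result object;
# a real pipeline would run a suite and inspect its actual result).

class StubConditionResult:
    def __init__(self, name, passed):
        self.name = name
        self.passed = passed

class StubSuiteResult:
    """Stand-in for a suite result: a list of named pass/fail conditions."""
    def __init__(self, conditions):
        self.conditions = conditions

    def passed(self):
        return all(c.passed for c in self.conditions)

    def failures(self):
        return [c.name for c in self.conditions if not c.passed]

def ci_gate(suite_result):
    """Return an (exit_code, message) pair for the CI step."""
    if suite_result.passed():
        return 0, "all checks passed"
    return 1, "failed: " + ", ".join(suite_result.failures())

result = StubSuiteResult([
    StubConditionResult("label drift below threshold", True),
    StubConditionResult("no new categories in test", False),
])
print(ci_gate(result))  # (1, 'failed: no new categories in test')
```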
Code-Level Root Cause Analysis
You can segment the data to get to the area where the model/data seem to fail, and then hand that over to the data science team for code-level analysis. This means quicker root cause analysis cycles (up to 70% of the time is usually spent on the initial analysis).
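The segmentation step can be pictured with a small sketch (hypothetical data, not the Deepchecks WeakSegmentsPerformance algorithm): compute the model's accuracy per value of one feature and surface the weakest segment.

```python
# Sketch of segment-level performance analysis: accuracy per value of
# one feature, to locate the area where the model seems to fail.
from collections import defaultdict

rows = [  # (segment_value, label, prediction) triples; hypothetical data
    ("US", 1, 1), ("US", 0, 0), ("US", 1, 1),
    ("EU", 1, 0), ("EU", 0, 1), ("EU", 1, 1),
]

def accuracy_by_segment(data):
    """Map each segment value to the model's accuracy within it."""
    hits, totals = defaultdict(int), defaultdict(int)
    for segment, label, pred in data:
        totals[segment] += 1
        hits[segment] += int(label == pred)
    return {s: hits[s] / totals[s] for s in totals}

def weakest_segment(data):
    """Return the segment value with the lowest accuracy."""
    scores = accuracy_by_segment(data)
    return min(scores, key=scores.get)

print(accuracy_by_segment(rows))  # per-segment accuracy
print(weakest_segment(rows))      # the segment to hand to the DS team
```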
Deepchecks Open Source: For ML Practitioners From Research to Production
How Does It Work?
Suites are composed of checks. Each check contains outputs displayed in a notebook and/or conditions with a pass/fail output.
Conditions can be added to or removed from a check.
Checks can be edited, added to, or removed from a suite.
Suites can be created from scratch or forked from an existing suite.
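The suite/check/condition relationship described above can be sketched in plain Python (a conceptual model only, not the Deepchecks class hierarchy):

```python
# Conceptual sketch of suites, checks, and conditions (not Deepchecks code).

class Check:
    """A check computes an output; optional conditions turn it into pass/fail."""
    def __init__(self, name, compute):
        self.name = name
        self.compute = compute   # dataset -> raw output
        self.conditions = []     # list of (name, output -> bool)

    def add_condition(self, cond_name, predicate):
        self.conditions.append((cond_name, predicate))
        return self              # allow chaining

    def run(self, dataset):
        output = self.compute(dataset)
        verdicts = {c: pred(output) for c, pred in self.conditions}
        return {"check": self.name, "output": output, "conditions": verdicts}

class Suite:
    """A suite is just an editable collection of checks."""
    def __init__(self, *checks):
        self.checks = list(checks)   # checks can be added or removed

    def run(self, dataset):
        return [check.run(dataset) for check in self.checks]

# Usage: one check that measures the duplicate ratio, with one condition.
dup_check = Check("duplicate_ratio",
                  lambda ds: 1 - len(set(ds)) / len(ds))
dup_check.add_condition("ratio below 10%", lambda out: out < 0.10)

suite = Suite(dup_check)
print(suite.run([1, 2, 3, 3]))  # duplicate ratio 0.25 -> condition fails
```

Forking a suite then amounts to copying its check list and editing it, which is why the same checks can move unchanged from research notebooks into CI.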
Testing: Key Features & Checks
Data Integrity
from deepchecks.tabular.suites import data_integrity
suite = data_integrity()
suite_result = suite.run(train_dataset)
from deepchecks.tabular.checks import StringMismatch
check = StringMismatch()
result = check.run(dataset)
Train Test Validation
from deepchecks.tabular.suites import train_test_validation
suite = train_test_validation()
suite_result = suite.run(train_dataset, test_dataset)
from deepchecks.tabular.checks import PredictionDrift
check = PredictionDrift()
result = check.run(train_dataset, test_dataset)
Model Evaluation
from deepchecks.tabular.suites import model_evaluation
suite = model_evaluation()
suite_result = suite.run(train_dataset, test_dataset, model)
from deepchecks.tabular.checks import WeakSegmentsPerformance
check = WeakSegmentsPerformance()
result = check.run(test_dataset, model)
Checks for Unstructured Data
pip install -U "deepchecks[nlp]"
pip install -U "deepchecks[nlp-properties]"
pip install -U "deepchecks[vision]"