Explainable Data Drift for NLP
Detecting data drift has become a common way to monitor ML models in production, although the problem is far from solved even for tabular data. For Natural Language Processing (NLP), on the other hand, the question remains mostly open.
In this session, we will present and compare two approaches. In the first, we will demonstrate how extracting a wide range of explainable properties per document, such as topic, language, sentiment, named entities, keywords, and more, allows us to explore potential sources of drift.
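As a rough illustration of per-document property extraction, the sketch below computes a few simple properties with pure-Python stand-ins (a toy sentiment lexicon, frequency-based keywords); a real pipeline would use dedicated NLP models for language identification, NER, and topic assignment. All names here are hypothetical, not from the session itself.

```python
from collections import Counter
import re

# Toy sentiment lexicon (assumption: a real system would use a trained model).
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def extract_properties(text: str) -> dict:
    """Extract a few explainable, trackable properties from one document."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "n_tokens": len(tokens),
        "avg_token_len": sum(map(len, tokens)) / len(tokens) if tokens else 0.0,
        # Crude lexicon-based sentiment score in [-1, 1].
        "sentiment": (pos - neg) / max(pos + neg, 1),
        # Top tokens by frequency, as a stand-in for keyword extraction.
        "keywords": [w for w, _ in Counter(tokens).most_common(3)],
    }

props = extract_properties("The service was great, really great support.")
```

Each document in production would be mapped to such a property vector, which is what gets monitored over time rather than the raw text.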
We will show how these properties can be tracked consistently over time, how they can be used to detect meaningful data drift as soon as it occurs, and how they can help explain and fix the root cause.
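One way such tracked properties might be compared across time windows is the Population Stability Index (PSI), a common distribution-shift measure. The sketch below is an illustrative assumption, not the method from the session; the 0.2 alert threshold is a widely used rule of thumb, not a universal constant.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of a numeric property.
    Rule of thumb (assumption): PSI > 0.2 suggests meaningful drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0
    def hist(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        n = len(sample)
        # Small floor avoids log(0) for empty bins.
        return [max(c / n, 1e-6) for c in counts]
    e, a = hist(expected), hist(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Reference window vs. a shifted production window of one tracked property
# (e.g. a per-document sentiment or length score).
baseline = [0.1 * i for i in range(100)]
shifted = [0.1 * i + 5.0 for i in range(100)]  # distribution moved upward
drift_score = psi(baseline, shifted)
```

Running such a check per property per time window is what turns raw drift alerts into explainable ones: the property whose PSI spikes points at the likely root cause.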