Introduction
The use cases for Computer Vision (CV) models range from boosting company effectiveness to automating crucial decision-making processes. However, finding or creating a suitable dataset and cleaning it takes considerable time before the data can be used for training and testing. Moreover, it takes more than a single person to handle tasks such as data curation, data cleaning, modeling, and testing. Having put in all that time and those resources to develop a model, a failure to exhibit anticipated behavior in production can result in major setbacks and wasted time and resources. To avoid that disaster, it is important to employ a set of key practices that ensure a model functions smoothly in production.
In this blog, we explore the different methods and practices that should be employed for the simplified and effective deployment of Computer Vision models. Let’s start by understanding different deployment strategies.
CV Model Deployment Strategies
1. CV Deployment via REST APIs
An API (Application Programming Interface) enables the simultaneous use of your model by numerous applications written in any language or framework.
REST (Representational State Transfer) is an architectural style for client-server communication that allows sharing data across multiple nodes. REST APIs are APIs that follow the rules set by the REST architecture. When a client makes a request, a RESTful API transmits a description of the resource’s state back to the requester. This data, or representation, is transmitted via HTTP in one of several formats, such as JSON or XML. Flask-RESTful can be used to build and deploy a REST API for your model. You can read more about it here.
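To make the idea concrete, here is a minimal sketch of a RESTful prediction endpoint. Flask-RESTful would be the natural choice in practice; to keep this sketch dependency-free, it uses Python's standard library instead, and the `classify_image` function is a hypothetical stand-in for a real CV model:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Stand-in for a real CV model; in practice this would run inference.
def classify_image(pixels):
    # Toy rule: bright images are "day", dark images are "night".
    avg = sum(pixels) / len(pixels)
    return "day" if avg > 127 else "night"

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The client sends a JSON representation of the image data...
        body = self.rfile.read(int(self.headers["Content-Length"]))
        pixels = json.loads(body)["pixels"]
        # ...and the API replies with a JSON representation of the result.
        response = json.dumps({"label": classify_image(pixels)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)

    def log_message(self, *args):  # silence per-request logging
        pass

def serve_once(port=8080):
    # Handle a single request on a background thread, for demo purposes.
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.handle_request, daemon=True).start()
    return server

if __name__ == "__main__":
    serve_once()
    req = Request("http://127.0.0.1:8080/predict",
                  data=json.dumps({"pixels": [200, 210, 190]}).encode(),
                  headers={"Content-Type": "application/json"})
    print(json.loads(urlopen(req).read()))  # {'label': 'day'}
```

Because the contract is just HTTP plus JSON, any client, in any language, can consume the same model endpoint.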
2. CV API Tools
There are plenty of tools on the market that offer computer vision APIs that can be incorporated directly into your program. It’s important to note that with those APIs, you have the choice of creating your own UI or using the standard UIs provided by the service provider.
- AWS Rekognition
This is a component of the larger family of AWS Cloud offerings. With this tool, you can automatically extract data from images, videos, and other types of media. Rekognition is a good choice for a beginner since it doesn’t require a thorough understanding of Computer Vision theory. Content filtering, facial recognition, video analytics, word recognition, and other pre-built options are available here.
Amazon Rekognition provides two types of APIs: Amazon Rekognition Image and Amazon Rekognition Video, which analyze photos and videos, respectively. The necessary data is uploaded to an S3 bucket (a storage service from Amazon) and is made accessible using the AWS CLI and SDKs. These APIs can be customized according to the use case or used directly as-is in your model. You can use them as an internal API through REST endpoints.
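As a sketch of how this fits together, the helper below builds the request payload for Rekognition's `DetectLabels` operation, pointing at an image already uploaded to S3. The bucket and key names are placeholders, and the actual boto3 call (which needs AWS credentials) is shown commented out:

```python
# Builds the request payload for Amazon Rekognition's DetectLabels API,
# referencing an image already stored in an S3 bucket. Bucket and key
# names here are placeholders for illustration.
def build_detect_labels_request(bucket, key, max_labels=10, min_confidence=80.0):
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }

request = build_detect_labels_request("my-cv-bucket", "photos/cat.jpg")

# With boto3 installed and AWS credentials configured, the call would be:
# import boto3
# client = boto3.client("rekognition")
# response = client.detect_labels(**request)
# for label in response["Labels"]:
#     print(label["Name"], label["Confidence"])
```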
3. Edge Deployment for Computer Vision
Edge Devices are often compact, lightweight gadgets where a Computer Vision application can be installed and run. The use of a wider variety of models and more complicated applications is made possible by the Graphical Processing Units (GPU) or Visual Processing Units (VPU) that are present in many modern edge devices. An Edge Device is a device like Raspberry Pi, an NVIDIA Jetson device like the Jetson Nano or TX2, or a variety of Internet of Things (IoT) devices.
Although the application can be deployed using an active network, being on an Edge Device allows it to operate without cloud access. Another key advantage of Edge Deployment is that no network round trip is required for inference, since the model is embedded within the device itself.
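Edge devices have tight memory and compute budgets, so models are usually compressed before deployment. One common step is quantizing weights from 32-bit floats to 8-bit integers; frameworks such as TensorFlow Lite or TensorRT do this far more thoroughly, but the core idea can be sketched in a few lines (the weight values below are made up for illustration):

```python
# Minimal sketch of symmetric 8-bit quantization, a common step when
# shrinking a CV model's weights for edge deployment.
def quantize(weights):
    # Scale so the largest magnitude maps to the int8 limit of 127.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]      # toy example weights
q, scale = quantize(weights)           # q = [50, -127, 2, 100]
restored = dequantize(q, scale)
# Each value now needs 1 byte instead of 4, at a small accuracy cost.
```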
CV Model Deployment Practices
1. Unit Testing and Data Checks for CV models
Unit testing is crucial for the creation of reliable software. Unit tests check individual functions (which make up the majority of the source code) to ensure they produce the intended outcomes.
Unit tests are lightweight and don’t need a lot of computing power. There are certain Python libraries that can assist you in configuring frameworks that help with unit testing. Asserting the dimensions of an image returned by a particular loader function is an example of unit testing in CV modeling.
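That loader example can be sketched with the standard `unittest` module. The `load_image` function here is a hypothetical stand-in that returns a dummy 32x32 RGB image as nested lists, so the sketch stays dependency-free:

```python
import unittest

# Hypothetical loader a CV pipeline might expose; returns a dummy
# 32x32 RGB image as nested lists (height x width x channels).
def load_image(path, size=(32, 32)):
    h, w = size
    return [[[0, 0, 0] for _ in range(w)] for _ in range(h)]

class TestLoader(unittest.TestCase):
    def test_image_dimensions(self):
        img = load_image("example.jpg")
        self.assertEqual(len(img), 32)       # height
        self.assertEqual(len(img[0]), 32)    # width
        self.assertEqual(len(img[0][0]), 3)  # RGB channels

if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLoader)
    unittest.TextTestRunner().run(suite)
```

A real loader would read actual image files, but the shape assertions would look the same.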
When it comes to CV modeling, Deepchecks can be used. Suppose we need to check whether the color averages are consistent between the training and the testing data. This can be done with the following block of code using Deepchecks (here, `ColorAveragesCheck` is a custom check built following Deepchecks’ custom-check guide, assumed to be defined earlier):

```python
from deepchecks.vision.datasets.detection.coco import load_dataset

train_ds = load_dataset(train=True, object_type='VisionData')
test_ds = load_dataset(train=False, object_type='VisionData')

# ColorAveragesCheck is a custom check; see the custom-checks guide below.
result = ColorAveragesCheck().run(train_ds, test_ds)
result
```

Python code to check consistency between color averages across train and test data
Read more about Computer Vision custom checks here.
2. Test Driven Deployment (TDD)
TDD ensures that individual parts of the code are tried and tested before the entire code base is planned out and developed, rather than building everything first and testing afterwards. Reduced effort, fewer repeated mistakes, and timely feedback are some of the key benefits of TDD.
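A concrete CV-flavored sketch of the cycle: the test for a bounding-box intersection-over-union (IoU) helper is written first, and the implementation is then written to make it pass. The `(x1, y1, x2, y2)` box format is an assumption for this example:

```python
# Step 1 (TDD): write the test before any implementation exists.
def test_iou():
    assert iou((0, 0, 2, 2), (0, 0, 2, 2)) == 1.0   # identical boxes
    assert iou((0, 0, 1, 1), (2, 2, 3, 3)) == 0.0   # disjoint boxes
    assert abs(iou((0, 0, 2, 2), (1, 0, 3, 2)) - 1 / 3) < 1e-9  # partial overlap

# Step 2: implement just enough to make the failing test pass.
def iou(a, b):
    # Overlap along each axis, clamped at zero for disjoint boxes.
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

test_iou()  # passes silently once the implementation is correct
```

Writing the assertions first pins down the expected behavior (including the disjoint edge case) before a single line of the implementation exists.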
3. Model Monitoring
After the model has been deployed, it is crucial to keep track of its performance to determine whether or not it is operating as anticipated. Once launched, a lot of things could go wrong (data drift, production failure, etc.). Monitoring the model helps detect any change in the data or in the relationship between the input and the target variable. Deepchecks has created a comprehensive list of checks and has offered related tests that can help us identify the things that could go wrong with our model.
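A minimal sketch of what such monitoring can look like in code, assuming labeled feedback eventually arrives from production: track rolling accuracy over the last N predictions and flag the model when it drops below a threshold (the class and threshold here are illustrative, not from any particular library):

```python
from collections import deque

# Minimal production monitor: rolling accuracy over the last `window`
# predictions, flagged when it falls below `threshold`.
class AccuracyMonitor:
    def __init__(self, window=100, threshold=0.9):
        self.results = deque(maxlen=window)  # True/False per prediction
        self.threshold = threshold

    def record(self, prediction, ground_truth):
        self.results.append(prediction == ground_truth)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else 1.0

    def is_degraded(self):
        return self.accuracy() < self.threshold

monitor = AccuracyMonitor(window=5, threshold=0.8)
for pred, truth in [("cat", "cat"), ("dog", "dog"), ("cat", "dog"),
                    ("cat", "dog"), ("dog", "dog")]:
    monitor.record(pred, truth)
# 3 of the last 5 predictions were correct -> accuracy 0.6, below 0.8
```

In practice this signal would feed an alerting system so the team hears about degradation before users do.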
4. Addressing Data Drifts
Data drift is very common for Computer Vision models. It refers to a change in the data observed in production compared to the data the model was trained and tested on. Let’s take an example: a model trained on pictures of apples is fed pictures of oranges at inference time. With the apple images and their annotations, the model learned to recognize apples in future fruit images. As long as the incoming images still show apples, the results should be consistent, and a well-trained model should continue to perform well. But once the input shifts from apples to oranges, the model no longer interprets the data correctly and makes incorrect inferences. That is a massive data drift.
Tools like Arthur are a brilliant way of addressing data drift in a model. They both help detect data drift and provide solutions to resolve the issue.
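The basic mechanics of drift detection can be sketched without any tooling. One simple approach (a toy version of what dedicated tools do far more robustly) compares a distributional property of images, such as mean brightness, between the training set and a production batch, and flags drift when the production mean falls too many standard deviations away; all numbers below are made up for illustration:

```python
import statistics

# Toy drift check on an image property: flag drift when the production
# batch's mean brightness is more than k standard deviations away from
# the training distribution's mean.
def brightness_drift(train_means, prod_means, k=3.0):
    mu = statistics.mean(train_means)
    sigma = statistics.stdev(train_means)
    shift = abs(statistics.mean(prod_means) - mu)
    return shift > k * sigma

train = [120, 125, 118, 122, 121, 119, 124, 123]  # per-image mean brightness
same = [121, 120, 123, 119]      # similar images: no drift
shifted = [60, 55, 62, 58]       # much darker images: drift

# brightness_drift(train, same)    -> False
# brightness_drift(train, shifted) -> True
```

Real drift detectors monitor many properties at once (color histograms, embedding distributions, label frequencies) and use proper statistical tests, but the principle is the same: compare production data against a reference distribution.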
5. Economics and Cost Management for CV models
When forming the workforce or selecting rollout technologies, a cost-benefit analysis makes a lot of sense. Each tool has its advantages and disadvantages. A pricey cloud-native integrated platform could significantly reduce the expenses associated with training and resources, but at a costly membership fee. In contrast to NLP or traditional Machine Learning model advancements, a computer vision model could cause additional tool overheads such as specialized annotation tools.
Just as more than 85% of the ML models that are developed never make it to production, the same is true for CV models. It is important to correctly identify tools and resources that fall within the practical boundaries and budgetary constraints of the project.
Factors like testing, test-driven deployment, model monitoring, data drifts, and cost management make up the best practices in the deployment of Computer Vision models. When employed properly, these help us avoid code repetition and wasted efforts. It becomes clear why a number of factors like deployment strategy and development tools need to be considered and tried for smooth execution of the model in production.
To explore Deepchecks’ open-source library, go try it out yourself! Don’t forget to star their GitHub repo; it’s really a big deal for open-source-led companies like Deepchecks.