DEEPCHECKS GLOSSARY

AWS SageMaker

Cloud-based platforms have become a practical choice for machine learning development. They offer scalable, flexible, and cost-effective infrastructure that covers the ML workflow end to end.

Amazon SageMaker, a service from Amazon Web Services (AWS), is one of the most widely used cloud platforms for machine learning. It lets companies build and maintain customized ML and AI solutions. Let’s look at how SageMaker can help automate and improve the ML pipeline.

What is Amazon SageMaker?

Amazon SageMaker is a cloud-based ML development platform whose notebook workflow resembles a local Jupyter Notebook environment. It provides a managed environment to develop, train, fine-tune, and deploy ML models into production.

Key features and benefits of SageMaker

  • Web-based IDE (Tools for data preparation, model building, and fine-tuning in one place)
  • Simplified training process (Fully managed and scalable infrastructure)
  • Automated hyperparameter tuning
  • Numerous deployment possibilities
  • Built-in tools for monitoring and management
  • Human-in-the-loop capabilities
  • High data security throughout the workflow

Components of SageMaker

  • SageMaker Studio: IDE environment for ML workflow. SageMaker Studio includes features like code notebooks, data preprocessing tools, and integrated tools supporting the development, tuning, and deployment of AI/ML models.
  • SageMaker Ground Truth: Data labeling service that combines human labelers with automated labeling to build high-quality datasets.
  • SageMaker Data Wrangler: Visual interface with features for data exploration, data cleaning, and feature engineering.
  • SageMaker Experiments: A tool for tracking and comparing ML experiments, giving a complete view of each run.
  • SageMaker Autopilot: An AutoML (no-code or low-code) service that builds classification and regression models with little effort and selects the best-performing one. It also generates notebooks that document the model creation process in detail.
  • SageMaker Debugger: Monitors metrics and parameters during training, detecting anomalies and providing real-time alerts and detailed analysis reports to improve model performance.
  • SageMaker Model Monitor: Monitors the performance of deployed models, detecting model degradation, data drift, and other errors in real time.
  • SageMaker Neo: Optimizes and compiles machine learning models into efficient formats that run faster and use less memory on cloud instances, edge devices, and other hardware environments.
  • SageMaker Clarify: Detects bias in data and models and generates explainability reports for model predictions, helping teams work toward fairness, transparency, and ethical and regulatory requirements.
  • SageMaker Edge Manager: Enables a simplified process for deploying, monitoring, and managing ML models on edge devices.

SageMaker offers several other features for the ML model pipeline. For more information, refer to AWS’s official documentation.

How Does SageMaker Work?

Let’s walk through an example of how SageMaker supports ML model development.

Example: Protective wear detection in a warehouse

Steps:

  1. Data preparation
  2. Model development and training
  3. Model deployment

SageMaker Studio provides an integrated development environment with all the components mentioned above.

1. Data Preparation: First, we must annotate the dataset, labeling the people and the safety jackets and other safety gear they wear. The data could consist of images and videos of workers inside a warehouse, which you can store in an Amazon S3 bucket. For this task, you can use SageMaker Ground Truth to import the data from S3 and label it. Then, use SageMaker Data Wrangler to analyze the dataset and store the results back in the S3 bucket.
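As a minimal sketch of the first part of this step, the snippet below builds an input manifest that a Ground Truth labeling job can read from S3: a JSON Lines file with one "source-ref" entry per image. The bucket name and object keys are hypothetical placeholders, and the upload helper assumes boto3 is installed and AWS credentials are configured.

```python
import json
from typing import Sequence


def build_manifest(bucket: str, image_keys: Sequence[str]) -> str:
    """Return a JSON Lines manifest with one {"source-ref": ...} object per image."""
    return "".join(
        json.dumps({"source-ref": f"s3://{bucket}/{key}"}) + "\n"
        for key in image_keys
    )


def upload_manifest(bucket: str, manifest_key: str, manifest: str) -> None:
    """Upload the manifest to S3. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so build_manifest works without AWS access

    boto3.client("s3").put_object(
        Bucket=bucket, Key=manifest_key, Body=manifest.encode("utf-8")
    )


# Hypothetical bucket and keys, used only for illustration.
manifest = build_manifest(
    "warehouse-ppe-data",
    ["raw/cam1/frame_0001.jpg", "raw/cam1/frame_0002.jpg"],
)
print(manifest, end="")
```

Calling upload_manifest("warehouse-ppe-data", "manifests/input.manifest", manifest) would then place the file where a labeling job can reference it.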

2. Model development and training: SageMaker provides a Jupyter Notebook environment in which developers can write code and share it with team members. There are two ways to work with notebooks in SageMaker: 1) create a SageMaker notebook instance, which runs on a managed Amazon EC2 instance, or 2) use SageMaker Studio as a web-based IDE.

With the first option, Amazon SageMaker creates and manages an Amazon EC2 (Elastic Compute Cloud) instance for model development.

  • You can create a Jupyter Notebook instance and develop the object detection model for safety wear detection. Use built-in common deep-learning libraries, drivers, packages, and frameworks for development.
  • Integrate the S3 bucket with your dataset and import it to your notebook environment for model training.
  • Use SageMaker Experiments, SageMaker Clarify, and SageMaker Debugger to track, inspect, and fine-tune the model during development, and SageMaker Model Monitor to watch the model once it is deployed.
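The training part of this step can also be run as a managed training job instead of inside the notebook. The sketch below assembles the request for the boto3 SageMaker client's create_training_job call. The job name, role ARN, container image URI, S3 paths, and instance settings are all hypothetical placeholders, and the choice of a single training channel is an assumption made for illustration.

```python
def training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.p3.2xlarge",
    max_runtime_s: int = 3600,
) -> dict:
    """Build the keyword arguments for SageMaker's CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # container holding the training code
            "TrainingInputMode": "File",  # copy the S3 data to the instance first
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
    }


def start_training(request: dict) -> dict:
    """Submit the job. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builder runs offline

    return boto3.client("sagemaker").create_training_job(**request)


# Hypothetical values, used only for illustration.
request = training_job_request(
    job_name="ppe-detector-001",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/ppe-train:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://warehouse-ppe-data/labeled/",
    output_s3_uri="s3://warehouse-ppe-data/models/",
)
print(request["TrainingJobName"], request["ResourceConfig"]["InstanceType"])
```

Keeping the request builder separate from the submission call makes the configuration easy to review or unit test before any AWS resources are started.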

3. Model Deployment: After training and testing the detection model, we can use SageMaker Neo to optimize and compile it for the specific edge device that will run safety wear detection in the warehouse. In our case, that device is an NVIDIA Jetson Nano.

  • Use SageMaker Edge Manager and create an edge packaging job to package and prepare the compiled model for deployment. Then, register the model with Edge Manager, which provides a solution for deploying and managing it.
  • Set up AWS IoT Greengrass and create a Greengrass component for the model packaged by Edge Manager. Then, set up AWS IoT Core to connect the device to the cloud securely.
  • Deploy the Greengrass component to the Jetson device.
  • As the final step, use Edge Manager to monitor the performance of the deployed model on the edge devices.
  • Through SageMaker, you can make changes to the model and automate the update process, monitor metrics, and manage edge devices remotely.
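The compile-and-package part of this step can be sketched with the boto3 SageMaker client, as below. The job names, role ARN, S3 paths, model name, and input shape are hypothetical placeholders, and the PyTorch framework setting is an assumption, since the article does not say which framework the detector uses. Service availability may also have changed since the article was written, so check the current AWS documentation before relying on these calls.

```python
import json
import time


def neo_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                            input_shape, framework="PYTORCH",
                            target_device="jetson_nano"):
    """Build the keyword arguments for SageMaker's CreateCompilationJob API."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,                      # trained model archive
            "DataInputConfig": json.dumps(input_shape),  # input name -> shape
            "Framework": framework,
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


def edge_packaging_request(job_name, compilation_job_name, model_name,
                           model_version, role_arn, output_s3_uri):
    """Build the keyword arguments for SageMaker's CreateEdgePackagingJob API."""
    return {
        "EdgePackagingJobName": job_name,
        "CompilationJobName": compilation_job_name,
        "ModelName": model_name,
        "ModelVersion": model_version,
        "RoleArn": role_arn,
        "OutputConfig": {"S3OutputLocation": output_s3_uri},
    }


def compile_then_package(compile_req, package_req, poll_seconds=30):
    """Run both jobs in order. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builders run offline

    sm = boto3.client("sagemaker")
    sm.create_compilation_job(**compile_req)
    name = compile_req["CompilationJobName"]
    # Packaging needs a finished compilation job, so poll until it settles.
    while True:
        status = sm.describe_compilation_job(
            CompilationJobName=name)["CompilationJobStatus"]
        if status in ("COMPLETED", "FAILED", "STOPPED"):
            break
        time.sleep(poll_seconds)
    if status != "COMPLETED":
        raise RuntimeError(f"Compilation job {name} ended with status {status}")
    sm.create_edge_packaging_job(**package_req)


# Hypothetical values, used only for illustration.
ROLE = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"
compile_req = neo_compilation_request(
    "ppe-neo-001", ROLE,
    "s3://warehouse-ppe-data/models/model.tar.gz",
    "s3://warehouse-ppe-data/compiled/",
    {"input0": [1, 3, 224, 224]},
)
package_req = edge_packaging_request(
    "ppe-package-001", "ppe-neo-001", "ppe-detector", "1.0", ROLE,
    "s3://warehouse-ppe-data/packaged/",
)
print(compile_req["OutputConfig"]["TargetDevice"])
```

The packaged artifact written to the output S3 location is what the Greengrass component described above would then reference.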

Pricing

Amazon SageMaker offers a free tier for up to two months, with a limited number of usage hours for each component.

SageMaker lets you pay for what you use, with two payment options:

  1. On-demand pricing: No minimum fees and no upfront payments. You pay per hour for each component you use in SageMaker. This is ideal for users with varying workloads.
  2. Savings Plan: This plan offers flexible, usage-based pricing in exchange for committing to a consistent amount of usage, measured in $/hour, over a fixed term.
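The arithmetic behind the two options can be sketched as follows. Every rate, usage figure, and commitment amount here is a hypothetical placeholder rather than an actual AWS price, and the model is deliberately simplified: it ignores discounted rates and any usage beyond the commitment.

```python
HOURS_PER_YEAR = 24 * 365


def on_demand_cost(usage_hours: dict, hourly_rates: dict) -> float:
    """On-demand: pay each component's hourly rate for the hours actually used."""
    return sum(hours * hourly_rates[name] for name, hours in usage_hours.items())


def savings_plan_commitment(commit_per_hour: float, term_years: int = 1) -> float:
    """Savings Plan: the $/hour commitment is owed for every hour of the term."""
    return commit_per_hour * HOURS_PER_YEAR * term_years


# Hypothetical rates ($/hour) and monthly usage (hours), for illustration only.
rates = {"notebook": 0.05, "training": 3.00}
usage = {"notebook": 200, "training": 40}

monthly_on_demand = on_demand_cost(usage, rates)
yearly_commitment = savings_plan_commitment(commit_per_hour=1.00, term_years=1)

print(f"On-demand, one month: ${monthly_on_demand:.2f}")
print(f"Savings Plan commitment, one year: ${yearly_commitment:.2f}")
```

Comparing twelve months of expected on-demand spend against the yearly commitment gives a rough first check on whether a steady workload justifies a Savings Plan.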