Setting up a Machine Learning Experiment Tracking Service for your Team with AWS and MLflow | by Eduardo Alvarez | Sep, 2022

September 7, 2022


Data Science teams are all too familiar with the pain of tracking the hundreds of model iterations that can be generated through a single experiment/project. One of the common ways people manage this today is to have shared file systems where teams can save their models with an agreed-upon nomenclature like:

ProjectID_ModelType_DatasetIteration_ImageID_NumLayers_Neurons.model

This is neither efficient nor practical as a way for teams to manage their machine learning model life cycles. Follow this methodology for long and you will soon find yourself tearing your hair out as you try to remember the who, when, where, and how of each model's creation.
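To see why, consider the string surgery every downstream script ends up doing to recover metadata from a convention-named file; a minimal sketch, with made-up field values:

```python
# Recover model metadata from a convention-named file (values are made up).
# One renamed file, one reordered field, or one stray underscore inside a
# field silently corrupts everything parsed after it.
name = "P42_CNN_v3_IMG7_50_128.model"
fields = ["project_id", "model_type", "dataset_iteration",
          "image_id", "num_layers", "neurons"]
meta = dict(zip(fields, name[: -len(".model")].split("_")))
print(meta["model_type"], meta["num_layers"])  # → CNN 50
```

A tracking server replaces this guesswork with structured, queryable metadata logged at training time.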

Proposed Solution

All teams, big and small, should employ a cloud-hosted MLflow remote server. Team members should log code versions, data versions, model metadata, and tags along with each model trained as part of a particular experiment/project.

MLflow is an open-source machine learning lifecycle management platform. It works with any ML library, runs in any cloud, and scales to big data workflows. MLflow is composed of four components:

  • MLflow Tracking: lets you record and query code, data, configurations, and results of the model training process.
  • MLflow Projects: lets you package and redeploy your data science code to enable reproducible runs on any platform.
  • MLflow Models: an API for deploying models into a variety of environments.
  • MLflow Model Registry: lets you store, annotate, and manage models in a central repository.

This article will explore how you can set up a password-protected MLflow server for your entire team. The main focus will be model tracking and the model registry, the two most critical components for making this a helpful environment for your team.

Everything in this tutorial will be done using AWS’ free tier services, but your team might require a more robust setup. At the end of the article, I’ll leave a set of recommended cloud configurations you can use to set up a production-ready version of this demo.

MLflow Server — Solution Architecture

[Architecture diagram of a sample MLflow remote service, designed by Eduardo Alvarez using draw.io]

This tutorial will be broken down into five key components:

  • Standing up the required AWS services: EC2, S3, and RDS
  • Configuring the AWS CLI and SSH-ing into EC2
  • Installing all dependencies on the instance
  • Configuring your remote server
  • Launching the MLflow UI from a browser

Setting up Host Machine with AWS EC2

Go to your AWS management console and launch a new EC2 instance.

1. In this tutorial, we will select Amazon Linux as the OS, but you're welcome to use another Linux distribution or Windows from the list.

[Image: AWS EC2 launch wizard]

2. Create a new Key Pair if you don’t already have one. You will need this to SSH into your instance and configure various aspects of your server.

[Image: RSA key pair configuration]

3. Create a new security group and allow SSH, HTTPS, and HTTP traffic from Anywhere (0.0.0.0/0). WARNING: for maximum security, it is recommended that you safelist the specific IPs that need access to the server instead of leaving the connection open to the entire internet. For this tutorial, feel free to leave it open to traffic from anywhere.

[Image: security group selection]

4. The EC2 storage volume doesn't need to be any bigger than 8–16 GB unless you intend to use this instance for other purposes. My recommendation is to leave this instance as a dedicated MLflow server.

[Image: block storage settings for the EC2 instance]

Setting up S3 Bucket — Object Store

Your S3 bucket can be used to store model artifacts, environment configurations, code versions, and data versions. It contains all the vital organs of your MLflow machine learning management pipeline.

1. Create a new bucket from the AWS management console.

2. Enable ACLs and leave all public access blocked. ACLs will allow other AWS accounts with proper credentials to access the objects in the bucket.

Setting up an AWS RDS Postgres Database

This component of the MLflow workflow will be in charge of storing runs, parameters, metrics, tags, notes, paths to object stores, and other metadata.

1. Create a database from the AWS management console. We will be working with a PostgreSQL database.

2. Select the Free Tier

3. Give your DB a name, assign a master username, and provide a password for this account. You will use this information to launch your MLflow server.

4. Select an instance type. Depending on how much traffic you intend to feed through the MLflow server, i.e., how often you will be querying models and data, you might want to provision a beefier instance.

You will also need to specify the storage. Again, this will depend on how much metadata you need to track. Remember, models and artifacts will not be stored here, so don't expect to need excessive space.

5. Public access is essential if users outside your VPC need to read from and write to the DB. In the next step, you can safelist specific IPs using the security group.

6. You will need to create a security group with an inbound rule that allows all TCP traffic from anywhere (0.0.0.0/0) or specific IPs.

7. Launch RDS

Install AWS CLI and Access Instance with SSH

AWS CLI is a tool that allows you to control your AWS resources from the command line. You can install the CLI by following the instructions on the AWS website.

Once the AWS CLI is installed, configure your AWS credentials on your machine so that you can write to the S3 object store. You will need this when incorporating MLflow into scripts, notebooks, or APIs.
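Credentials are typically set with the aws configure command, which stores them in ~/.aws/credentials; a minimal credentials file looks like this (both key values are placeholders):

```
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```

boto3 and the MLflow client will pick these up automatically when writing artifacts to S3.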

1. From your EC2 instance dashboard, select your MLflow EC2 instance and click "connect."

2. Navigate to the “SSH client” tab and copy the SSH example to your clipboard from the bottom of the prompt.

3. Paste the SSH command into your prompt (I’m using Git Bash here from the folder where I have stored my .pem file)

[Image: accessing EC2 from the terminal using SSH]

4. Install the following dependencies:

sudo pip3 install 'mlflow[extras]' psycopg2-binary boto3

sudo yum install httpd-tools

sudo amazon-linux-extras install nginx1

5. Create an nginx user and set a password:

sudo htpasswd -c /etc/nginx/.htpasswd testuser

[Image: setting nginx access credentials]

6. Next, we will configure nginx as a reverse proxy to port 5000:

sudo nano /etc/nginx/nginx.conf

Using nano, we can edit the nginx config file to add the following reverse proxy block inside the server section:

location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}

Your final config file should look something like this, with the new location block added inside the server section.

[Image: final nginx config file, with the added block marked by a red bracket]

Start nginx and the MLflow Server

Now you can start your nginx reverse proxy and run your MLflow remote server. This part requires you to retrieve some information from the AWS management console.

1. Start your nginx server.

sudo service nginx start

2. Start your MLflow server, pointing it at your object storage (S3) and backend storage (RDS):

mlflow server --backend-store-uri postgresql://MASTERUSERNAME:YOURPASSWORD@YOUR-DATABASE-ENDPOINT:5432/postgres --default-artifact-root s3://BUCKETNAME --host 0.0.0.0

Where can you find this info?

  • MASTERUSERNAME — the user name that you set for your PostgreSQL RDS DB
  • YOURPASSWORD — the password you set for your PostgreSQL RDS DB
  • YOUR-DATABASE-ENDPOINT — this can be found in your RDS DB information within the AWS management console
  • BUCKETNAME — the name of your S3 bucket
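Since a typo in any of these values only surfaces as a confusing failure at launch time, it can help to assemble the command programmatically before running it; a minimal sketch with placeholder values:

```python
# Build the MLflow server launch command from its parts.
# All four values below are placeholders; substitute your own.
db_user = "MASTERUSERNAME"
db_password = "YOURPASSWORD"
db_endpoint = "YOUR-DATABASE-ENDPOINT"
bucket = "BUCKETNAME"

backend_uri = f"postgresql://{db_user}:{db_password}@{db_endpoint}:5432/postgres"
command = (
    f"mlflow server --backend-store-uri {backend_uri} "
    f"--default-artifact-root s3://{bucket} --host 0.0.0.0"
)
print(command)
```

Printing the command first lets you eyeball the endpoint and bucket name before handing it to the shell.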

Once you execute this command, you should get a massive printout with information about your workers and PIDs.

[Image: MLflow server running]

Shutting down your MLflow Server or Running in Background

If you would like to shut down your MLflow server at any time, you can kill the process bound to the server's port (5000 by default): sudo fuser -k 5000/tcp

If you want to run it in the background so that you can close the terminal and continue running the server, you should add nohup and & to your MLflow server launch command.

nohup mlflow server --backend-store-uri postgresql://MASTERUSERNAME:YOURPASSWORD@YOUR-DATABASE-ENDPOINT:5432/postgres --default-artifact-root s3://BUCKETNAME --host 0.0.0.0 &

Accessing the MLflow UI from your Browser

Now that you have correctly set up your entire MLflow server infrastructure, you can start interacting with it from the command line, scripts, notebooks, and the browser.
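When logging from scripts, the MLflow client reads the tracking server address and the nginx basic-auth credentials from environment variables; a minimal sketch (the address, username, and password are placeholders):

```python
import os

# Point MLflow clients at the remote server and supply the nginx
# basic-auth credentials (all three values are placeholders).
os.environ["MLFLOW_TRACKING_URI"] = "http://YOUR-EC2-PUBLIC-IP"
os.environ["MLFLOW_TRACKING_USERNAME"] = "testuser"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "YOUR-NGINX-PASSWORD"

# With these set, training scripts log runs as usual, e.g.:
#   import mlflow
#   with mlflow.start_run():
#       mlflow.log_metric("val_accuracy", 0.91)
```

Passing the credentials via MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD keeps them out of your code and notebooks.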

To access the MLflow UI, you need your EC2 instance's public IPv4 address.

  1. Copy and paste your public IPv4 address into your browser (it can be found inside your EC2 management console). Ensure that you add http:// before the IP address.
  2. You will be prompted to enter your nginx username and password.
[Image: nginx username and password prompt]

3. Voilà! You can now access and manage your experiments and models from the MLflow UI.

[Image: MLflow UI]

Conclusion

This article presented a detailed guide for setting up your team's MLflow remote server using AWS infrastructure. This tool will help your team stay more organized throughout your machine learning lifecycle and ensure that you put the highest-quality data science products into production. In future articles, I will cover some MLflow best practices such as data and code versioning, containerization, and model management.

If you have questions or would like to talk about anything data science, connect with me on LinkedIn.


