Data science teams are all too familiar with the pain of tracking the hundreds of model iterations that a single experiment or project can generate. One common way people manage this today is a shared file system where teams save their models under an agreed-upon nomenclature like:
ProjectID_ModelType_DatasetIteration_ImageID_NumLayers_Neurons.model
This is neither efficient nor practical, and it makes it hard for teams to manage their machine learning model life cycles in a sane fashion. Follow this methodology and you will soon find yourself tearing your hair out as you try to remember the who, when, where, and how of the model creation process.
Proposed Solution
All teams, big and small, should employ a cloud-hosted MLflow remote server. Team members should log code versions, data versions, model metadata, and tags along with each model trained as part of a particular experiment or project.

MLflow is an open-source machine learning lifecycle management platform. MLflow is library-agnostic, runs in any cloud, and scales to big data workflows. It is composed of four components:
- MLflow Tracking: Lets you record and query code, data, configurations, and results of the model training process.
- MLflow Projects: Lets you package and redeploy your data science code for reproducible runs on any platform.
- MLflow Models: An API for deploying models to a variety of serving environments.
- MLflow Model Registry: Lets you store, annotate, and manage models in a central repository.
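Before diving into the setup, it helps to see what these pieces look like in code. Below is a minimal Python sketch of MLflow Tracking and the Model Registry, using scikit-learn purely for illustration; the experiment and model names are hypothetical, and it assumes a tracking server is already configured (which the rest of this article builds).

import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumes the tracking URI and credentials are already configured
# (covered at the end of this article).
mlflow.set_experiment("demo-experiment")  # hypothetical experiment name

X, y = make_classification(n_samples=200, random_state=42)

with mlflow.start_run():
    # MLflow Tracking: record configuration and results
    mlflow.log_param("n_estimators", 100)
    model = RandomForestClassifier(n_estimators=100).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # MLflow Model Registry: store and version the model centrally
    mlflow.sklearn.log_model(model, "model", registered_model_name="demo-model")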
This article will explore how you can set up a password-protected MLflow server for your entire team. The main focus will be model tracking and the model registry, the two components most critical to making this a helpful environment for your team.
Everything in this tutorial will be done using AWS’ free tier services, but your team might require a more robust setup. At the end of the article, I’ll leave a set of recommended cloud configurations you can use to set up a production-ready version of this demo.
MLflow Server — Solution Architecture

This tutorial will be broken down into five key components:
- Standing up the required AWS Services: EC2, S3, and RDS.
- Configuring the AWS CLI and SSHing into EC2
- Installing all dependencies on the instance
- Configuring your Remote Server
- Launching the MLflow UI from Browser
Setting up Host Machine with AWS EC2
Go to your AWS management console and launch a new EC2 instance.
1. In this tutorial, we will select Amazon Linux as the OS, but you’re welcome to choose another Linux distribution or Windows from the list.

2. Create a new Key Pair if you don’t already have one. You will need this to SSH into your instance and configure various aspects of your server.

3. Create a new security group and allow SSH, HTTPS, and HTTP traffic from Anywhere (0.0.0.0/0) — WARNING: For maximum security, it is recommended that you safelist specific IPs that should have access to the server instead of having a connection that is open to the entire internet. For this tutorial, feel free to leave it open to traffic from anywhere.

4. The EC2 storage volume doesn’t need to be any bigger than 8–16GB unless you intend to use this instance for other purposes. My recommendation would be to keep this instance as a dedicated MLflow server.

Setting up S3 Bucket — Object Store
Your S3 bucket can be used to store model artifacts, environment configurations, code versions, and data versions. It contains all the vital organs of your MLflow machine learning management pipeline.
1. Create a new bucket from the AWS management console.

2. Enable ACLs and leave all public access blocked. ACLs will allow other AWS accounts with proper credentials to access the objects in the bucket.

Setting up an AWS RDS PostgreSQL Database
This component of the MLflow workflow will be in charge of storing runs, parameters, metrics, tags, notes, paths to object stores, and other metadata.
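As a hypothetical illustration of what this metadata store serves once everything is wired up, MLflow’s search API queries runs directly from this database:

import mlflow

# search_runs hits the backend store (the RDS Postgres database),
# not the S3 artifact store; it returns a pandas DataFrame.
runs = mlflow.search_runs(experiment_names=["demo-experiment"])  # hypothetical name
print(runs[["run_id", "params.n_estimators", "metrics.train_accuracy"]])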
1. Create a database from the AWS management console. We will be working with a PostgreSQL database.

2. Select the Free Tier

3. Give your DB a name, assign a master username, and provide a password for this account. You will use this information to launch your MLflow server.

4. Select an instance type. Depending on how much traffic you intend to feed through the MLflow server, i.e., how often you will be querying models and data, you might want to provision a beefier instance.
You will also need to specify the storage. Again, this will depend on how much metadata you need to track. Remember, models and artifacts will not be stored here, so don’t expect to need excessive space.

5. Public access is required to allow machines outside your VPC to read from and write to the DB. In the next step, you can restrict access to safelisted IPs using the security group.

6. You will need to create a security group with an inbound rule that allows all TCP traffic from anywhere (0.0.0.0/0) or specific IPs.
7. Launch RDS
Install AWS CLI and Access Instance with SSH
The AWS CLI is a tool that allows you to control your AWS resources from the command line. You can install it by following the instructions on the AWS website.
Once the AWS CLI is installed, configure your AWS credentials on your machine to permit writing to the S3 object store. You will need this whenever you incorporate MLflow into scripts, notebooks, or APIs.
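If you want to sanity-check that your credentials were picked up, a short boto3 sketch like the one below will do; the bucket name is a placeholder for your own.

import boto3

# Credentials are read from ~/.aws/credentials (written by `aws configure`)
s3 = boto3.client("s3")

# Write and list a test object in your artifact bucket (hypothetical name)
s3.put_object(Bucket="your-mlflow-bucket", Key="connectivity-test.txt", Body=b"hello")
print(s3.list_objects_v2(Bucket="your-mlflow-bucket").get("Contents"))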
1. From your EC2 dashboard, select your MLflow EC2 instance and click “Connect.”
2. Navigate to the “SSH client” tab and copy the example SSH command at the bottom of the dialog to your clipboard.
3. Paste the SSH command into your prompt (I’m using Git Bash here, run from the folder where I have stored my .pem file).

4. Install the following dependencies:
sudo pip3 install mlflow[extras] psycopg2-binary boto3
sudo yum install httpd-tools
sudo amazon-linux-extras install nginx1
5. Create an nginx user and set a password:
sudo htpasswd -c /etc/nginx/.htpasswd testuser

6. Next, we will configure nginx as a reverse proxy to port 5000:
sudo nano /etc/nginx/nginx.conf
Using nano, edit the nginx config file and add the following reverse proxy directives:
location / {
    proxy_pass http://localhost:5000/;
    auth_basic "Restricted Content";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
Your final nginx.conf should look something like this, with the location block above nested inside the server block.

Start nginx and the MLflow Server
Now you can start your nginx reverse proxy and run your MLflow remote server. This part requires you to retrieve some information from the AWS management console.
1. Start your nginx server:
sudo service nginx start
2. Start your MLflow server, pointing it at your object store (S3) and backend store (RDS):
mlflow server --backend-store-uri postgresql://MASTERUSERNAME:YOURPASSWORD@YOUR-DATABASE-ENDPOINT:5432/postgres --default-artifact-root s3://BUCKETNAME --host 0.0.0.0
Where can you find this info?
- MASTERUSERNAME — the user name that you set for your PostgreSQL RDS DB
- YOURPASSWORD — the password you set for your PostgreSQL RDS DB
- YOUR-DATABASE-ENDPOINT — this can be found in your RDS DB information within the AWS management console
- BUCKETNAME — the name of your S3 bucket
Once you execute this command, you should see a long printout with information about your workers and their process IDs (PIDs).

Shutting down your MLflow Server or Running in Background
If you would like to shut down your MLflow server at any time, you can kill the process listening on its port (5000 by default): sudo fuser -k 5000/tcp
If you want to run the server in the background so that it keeps running after you close the terminal, add nohup and & to your MLflow server launch command; the output will be written to nohup.out.
nohup mlflow server --backend-store-uri postgresql://MASTERUSERNAME:YOURPASSWORD@YOUR-DATABASE-ENDPOINT:5432/postgres --default-artifact-root s3://BUCKETNAME --host 0.0.0.0 &
Accessing the MLflow UI from your Browser
Now that you have set up your entire MLflow server infrastructure correctly, you can start interacting with it from the command line, scripts, and notebooks (see the Python sketch after the steps below).
To access the MLflow UI, you need your EC2 instance’s public IPv4 address.
1. Copy and paste your public IPv4 address into your browser (it can be found in your EC2 management console). Ensure that you add http:// before the IP address.
2. You will be prompted to enter your nginx username and password.

3. Voilà! You can now access and manage your experiments and models from the MLflow UI.
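To reach the same server from code rather than the browser, MLflow’s tracking client honors HTTP basic auth through environment variables. Here is a minimal connection sketch, with a placeholder IP address and credentials standing in for your own:

import os
import mlflow

# nginx basic-auth credentials, read by MLflow's tracking client
os.environ["MLFLOW_TRACKING_USERNAME"] = "testuser"      # the htpasswd user created earlier
os.environ["MLFLOW_TRACKING_PASSWORD"] = "yourpassword"  # placeholder

# Your EC2 instance's public IPv4 address (placeholder)
mlflow.set_tracking_uri("http://12.34.56.78")

mlflow.set_experiment("team-experiment")  # hypothetical experiment name
with mlflow.start_run():
    mlflow.log_param("example_param", 1)
    mlflow.log_metric("example_metric", 0.99)

Note that logging artifacts from a client also requires the AWS credentials configured earlier, since the client writes artifacts directly to S3.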

Conclusion
This article presented a detailed guide for setting up your team’s MLflow remote server using AWS infrastructure. This tool will help your team stay more organized throughout your machine learning lifecycle and ensure that you put the highest quality data science products into production. In future articles, I will cover some MLflow best practices such as data and code versioning, containerization, and model management.
If you have questions or would like to talk about anything data science, connect with me on LinkedIn