a fantastic Streamlit app, and now it's time to let the world see and use it.
What options do you have?
The easiest way is to use the Streamlit Community Cloud service. That method lets anyone online access your Streamlit app, provided they have the required URL. It's a relatively simple process, but it's a publicly available endpoint and, due to potential security issues and limited scalability options, it isn't an option for many organisations.
Since Streamlit was acquired by Snowflake, deploying to that platform is now a viable option as well.
The third option is to deploy to one of the many cloud providers, such as Heroku, Google Cloud, or Azure.
As an AWS user, I wanted to see how easy it would be to deploy a Streamlit app to AWS, and that is what this article is about. If you refer to the official Streamlit documentation online (link at the end of the article), you'll find that there is no information or guidance on how to do this. So this is the "missing manual".
The deployment process is relatively straightforward. The tricky part is ensuring that the AWS networking configuration is set up correctly. By that, I mean your VPC, security groups, subnets, route tables, subnet associations, NAT gateways, Elastic IPs, and so on.
Because every organisation's networking setup is different, I'll assume that you or someone in your organisation can sort out this aspect. However, I include some troubleshooting tips at the end of the article covering the most common causes of deployment issues. If you follow my steps to the letter, you should have a working, deployed app by the end of it.
In my sample deployment, I'll be using a VPC with a public subnet and an Internet gateway. By contrast, in real-life scenarios, you'll probably want to use some combination of elastic load balancers, private subnets, NAT gateways and Cognito for user authentication and enhanced security. Later on, I'll discuss some options for securing your app.
The app we'll deploy is the dashboard I wrote using Streamlit. TDS published that article a while back, and you can find a link to it at the end of this article. In that case, I retrieved my dashboard data from a PostgreSQL database running locally. However, to avoid the costs and hassle of setting up an RDS Postgres database on AWS, I'll convert my dashboard code to retrieve its data from a CSV file on S3 — Amazon's mass storage service.
Once that's done, it's only a matter of copying a CSV over to AWS S3 storage, and the dashboard should work just as it did when running locally using Postgres.
I assume you have an AWS account with access to the AWS console. Additionally, if you're choosing the S3 route as your data source, you'll need to set up AWS credentials. Once you have them, either create an .aws/credentials file in your HOME directory (as I've done), or you can pass your credential key information directly in the code.
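For reference, a minimal .aws/credentials file is just an INI-style snippet like the one below; the key values are placeholders, and real keys should never be committed to source control. Alternatively, the keys can be passed directly to boto3.client(), but the shared credentials file keeps them out of your code:
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY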
Assuming all these prerequisites are met, we can look at the deployment using AWS's Elastic Beanstalk service.
What’s AWS Elastic Beanstalk (EB)?
AWS Elastic Beanstalk (EB) is a fully managed service that simplifies the deployment, scaling, and administration of applications in the AWS Cloud. It lets you upload your application code in popular languages like Python, Java, .NET, Node.js, and more. It automatically handles the provisioning of the underlying infrastructure, such as servers, load balancers, and networking. With Elastic Beanstalk, you can focus on writing and maintaining your application rather than configuring servers or managing capacity, because the service seamlessly scales resources as your application's traffic fluctuates.
In addition to provisioning your EC2 servers and so on, EB will install any required external libraries on your behalf, depending on the deployment type. It can also be configured to run OS commands on server startup.
The code
Before deploying, let's review the changes I made to my original code to accommodate the change in data source from Postgres to S3. It boils down to replacing calls that read a Postgres table with calls that read an S3 object to feed data into the dashboard. I also put the main graphical component creation and display inside a main() function, which I call at the end of the code. Here is a full listing.
import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import boto3
from io import StringIO

#########################################
# 1. Load Data from S3
#########################################
@st.cache_data
def load_data_from_s3(bucket_name, object_key):
    """
    Reads a CSV file from S3 into a Pandas DataFrame.
    Make sure your AWS credentials are properly configured.
    """
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=bucket_name, Key=object_key)
    df = pd.read_csv(obj['Body'])
    # Convert order_date to datetime if needed
    df['order_date'] = pd.to_datetime(df['order_date'], format='%d/%m/%Y')
    return df

#########################################
# 2. Helper Functions (Pandas-based)
#########################################
def get_date_range(df):
    """Return min and max dates in the dataset."""
    min_date = df['order_date'].min()
    max_date = df['order_date'].max()
    return min_date, max_date

def get_unique_categories(df):
    """
    Return a sorted list of unique categories (capitalized).
    """
    categories = df['categories'].dropna().unique()
    categories = sorted([cat.capitalize() for cat in categories])
    return categories

def filter_dataframe(df, start_date, end_date, category):
    """
    Filter the dataframe by date range and optionally by a single category.
    """
    # Ensure start/end_date are converted to datetime just in case
    start_date = pd.to_datetime(start_date)
    end_date = pd.to_datetime(end_date)

    mask = (df['order_date'] >= start_date) & (df['order_date'] <= end_date)
    filtered = df.loc[mask].copy()

    # If not "All Categories," filter further by category
    if category != "All Categories":
        # Categories in the CSV might be lowercase, uppercase, etc.
        # Adjust as needed to match your data
        filtered = filtered[filtered['categories'].str.lower() == category.lower()]
    return filtered

def get_dashboard_stats(df, start_date, end_date, category):
    """
    Calculate total revenue, total orders, average order value, and top category.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return 0, 0, 0, "N/A"

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    total_revenue = filtered_df['revenue'].sum()
    total_orders = filtered_df['order_id'].nunique()
    avg_order_value = total_revenue / total_orders if total_orders > 0 else 0

    # Determine top category by total revenue
    cat_revenue = filtered_df.groupby('categories')['revenue'].sum().sort_values(ascending=False)
    top_cat = cat_revenue.index[0].capitalize() if not cat_revenue.empty else "N/A"

    return total_revenue, total_orders, avg_order_value, top_cat

def get_plot_data(df, start_date, end_date, category):
    """
    For 'Revenue Over Time', group by date and sum revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['date', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    plot_df = (
        filtered_df.groupby(filtered_df['order_date'].dt.date)['revenue']
        .sum()
        .reset_index()
        .rename(columns={'order_date': 'date'})
        .sort_values('date')
    )
    return plot_df

def get_revenue_by_category(df, start_date, end_date, category):
    """
    For 'Revenue by Category', group by category and sum revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['categories', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    rev_cat_df = (
        filtered_df.groupby('categories')['revenue']
        .sum()
        .reset_index()
        .sort_values('revenue', ascending=False)
    )
    rev_cat_df['categories'] = rev_cat_df['categories'].str.capitalize()
    return rev_cat_df

def get_top_products(df, start_date, end_date, category, top_n=10):
    """
    For 'Top Products', return the top N products by revenue.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame(columns=['product_names', 'revenue'])

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    top_products_df = (
        filtered_df.groupby('product_names')['revenue']
        .sum()
        .reset_index()
        .sort_values('revenue', ascending=False)
        .head(top_n)
    )
    return top_products_df

def get_raw_data(df, start_date, end_date, category):
    """
    Return the raw (filtered) data with a revenue column.
    """
    filtered_df = filter_dataframe(df, start_date, end_date, category)
    if filtered_df.empty:
        return pd.DataFrame()

    filtered_df['revenue'] = filtered_df['price'] * filtered_df['quantity']
    filtered_df = filtered_df.sort_values(by=['order_date', 'order_id'])
    return filtered_df

def plot_data(data, x_col, y_col, title, xlabel, ylabel, orientation='v'):
    fig, ax = plt.subplots(figsize=(10, 6))
    if not data.empty:
        if orientation == 'v':
            ax.bar(data[x_col], data[y_col])
            plt.xticks(rotation=45)
        else:
            ax.barh(data[x_col], data[y_col])
        ax.set_title(title)
        ax.set_xlabel(xlabel)
        ax.set_ylabel(ylabel)
    else:
        ax.text(0.5, 0.5, "No data available", ha='center', va='center')
    return fig

#########################################
# 3. Streamlit Application
#########################################
def main():
    # Title
    st.title("Sales Performance Dashboard")

    # Load your data from S3
    # Replace these with your actual bucket name and object key
    bucket_name = "your_s3_bucket_name"
    object_key = "your_object_name"
    df = load_data_from_s3(bucket_name, object_key)

    # Get min and max date for the default range
    min_date, max_date = get_date_range(df)

    # Create UI for date and category filters
    with st.container():
        col1, col2, col3 = st.columns([1, 1, 2])
        start_date = col1.date_input("Start Date", min_date)
        end_date = col2.date_input("End Date", max_date)
        categories = get_unique_categories(df)
        category = col3.selectbox("Category", ["All Categories"] + categories)

    # Custom CSS for metrics
    st.markdown("""
    """, unsafe_allow_html=True)

    # Fetch stats
    total_revenue, total_orders, avg_order_value, top_category = get_dashboard_stats(df, start_date, end_date, category)

    # Display key metrics
    metrics_html = f"""
    Total Revenue
    ${total_revenue:,.2f}
    Total Orders
    {total_orders:,}
    Average Order Value
    ${avg_order_value:,.2f}
    Top Category
    {top_category}
    """
    st.markdown(metrics_html, unsafe_allow_html=True)

    # Visualization Tabs
    st.header("Visualizations")
    tabs = st.tabs(["Revenue Over Time", "Revenue by Category", "Top Products"])

    # Revenue Over Time Tab
    with tabs[0]:
        st.subheader("Revenue Over Time")
        revenue_data = get_plot_data(df, start_date, end_date, category)
        st.pyplot(plot_data(revenue_data, 'date', 'revenue', "Revenue Over Time", "Date", "Revenue"))

    # Revenue by Category Tab
    with tabs[1]:
        st.subheader("Revenue by Category")
        category_data = get_revenue_by_category(df, start_date, end_date, category)
        st.pyplot(plot_data(category_data, 'categories', 'revenue', "Revenue by Category", "Category", "Revenue"))

    # Top Products Tab
    with tabs[2]:
        st.subheader("Top Products")
        top_products_data = get_top_products(df, start_date, end_date, category)
        st.pyplot(plot_data(top_products_data, 'product_names', 'revenue', "Top Products", "Revenue", "Product Name", orientation='h'))

    # Raw Data
    st.header("Raw Data")
    raw_data = get_raw_data(df, start_date, end_date, category)
    raw_data = raw_data.reset_index(drop=True)
    st.dataframe(raw_data, hide_index=True)

if __name__ == '__main__':
    main()
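For reference, the listing above assumes the CSV sitting in S3 contains at least the columns order_id, order_date (in DD/MM/YYYY format), categories, product_names, price and quantity. A purely illustrative header row and record might look like this:
order_id,order_date,categories,product_names,price,quantity
1001,15/03/2024,electronics,Wireless Mouse,24.99,2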
Though it’s a reasonably chunky piece of code, I received’t clarify precisely what it does, as I’ve already coated that in some element in my beforehand referenced TDS article. I’ve included a hyperlink to the article on the finish of this one for many who wish to be taught extra.
So, assuming you may have a working Streamlit app that runs domestically with out points, listed here are the steps it’s good to take to deploy it to AWS.
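A quick way to double-check that is to run the app from your project folder and confirm the dashboard renders in your browser before you start packaging anything:
streamlit run app.py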
Preparing our code for deployment
1/ Create a new folder on your local system to hold your code.
2/ In that folder, you'll need three files and a sub-folder containing two more files
- File 1 is app.py — this is your main Streamlit code file
- File 2 is requirements.txt — this lists all the external libraries your code needs to function. Depending on what your code does, it'll have at least one line referencing the Streamlit library. For my code, the file contained this,
streamlit
boto3
matplotlib
pandas
- File 3 is called Procfile — this tells EB how to run your code. Its contents should look like this
web: streamlit run app.py --server.port 8000 --server.enableCORS false
- .ebextensions — this is a subfolder which holds additional configuration files (see below)
3/ The .ebextensions subfolder contains two configuration files.
They should have this content (one option_settings block per file):
option_settings:
  aws:elasticbeanstalk:environment:proxy:
    ProxyServer: nginx

option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: app:main
Be aware, though I didn’t want it for what I used to be doing, for completenes, you may optionally add a number of packages.config recordsdata beneath the .ebextensions subfolder that may comprise working system instructions which are run when the EC2 server begins up. For instance,
#
# 01_packages.config
#
packages:
  yum:
    amazon-linux-extras: []
commands:
  01_postgres_activate:
    command: sudo amazon-linux-extras enable postgresql10
  02_postgres_install:
    command: sudo yum install -y python3-pip
  03_postgres_install:
    command: sudo pip3 install psycopg2
Once you have all the required files, the next step is to zip them into an archive, preserving the folder and subfolder structure. You can use any tool you like; I use 7-Zip.
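If you prefer the command line to 7-Zip, something like the command below (run from inside your project folder) produces a suitable archive; the archive name is arbitrary. Note that .ebextensions is a hidden folder, so double-check it actually made it into the zip:
zip -r app-deploy.zip app.py requirements.txt Procfile .ebextensions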
Deploying our code
Deployment is a multi-stage process. First, log in to the AWS console, search for "Elastic Beanstalk" in the services search bar, and click on the link. From there, you can click the large orange "Create Application" button. You'll see the first of around six screens for which you must fill in the details. In the following sections, I'll describe the fields you must complete. Leave everything else as it is.
1/ Creating the application
- This is easy: fill in the name of your application and, optionally, its description.
2/ Configure Environment
- The environment tier should be set to Web Server.
- Fill in the application name.
- For Platform type, choose Managed; for Platform, choose Python, then decide which version of Python you want to use. I used Python version 3.11.
- In the Application Code section, click the Upload your code option and follow the instructions. Type in a version label, then click 'Local File' or 'S3 Upload', depending on where your source files are located. You should upload the single zip file we created earlier.
- In the Presets section, choose your environment preset. I went for Single instance (free tier eligible). Then hit the Next button.
3/ Configure Service Access
- For the Service role, you can use an existing one if you have it, or AWS will create one for you.
- For the instance profile role, you'll probably need to create this. It just needs to have the AWSElasticBeanstalkWebTier and AmazonS3ReadOnlyAccess policies attached (if you prefer to script this, see the sketch after this list). Hit the Next button.
- I would also advise setting up an EC2 key pair at this stage, as you'll need it to log in to the EC2 server that EB creates on your behalf. This can be invaluable for investigating potential server issues.
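If you would rather attach the policies from the command line than click through the IAM console, the sketch below shows the idea. The role name aws-elasticbeanstalk-ec2-role is the default EB instance profile role; substitute your own role name if it differs:
aws iam attach-role-policy --role-name aws-elasticbeanstalk-ec2-role --policy-arn arn:aws:iam::aws:policy/AWSElasticBeanstalkWebTier
aws iam attach-role-policy --role-name aws-elasticbeanstalk-ec2-role --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess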
4/ Set up networking, database and tags
- Choose your VPC. I had just one default VPC set up. You also have the option to create one here if you don't already have one. Make sure your VPC has at least one public subnet.
- In Instance Settings, I checked the Public IP address option and chose to use my public subnets. Click the Next button.
5/ Configure the instance and scaling
- Under the EC2 Security Groups section, I chose my default security group. Under Instance Type, I opted for t3.micro. Hit the Next button.
6/ Monitoring
- Select basic system health monitoring
- Uncheck the Managed Updates checkbox
- Click Next
7/ Review
- Click Create if all is OK
After this, you should see a screen like this,

Keep an eye on the Events tab, as this will notify you if any issues arise. If you encounter problems, you can use the Logs tab to retrieve either a full set of logs or the last 100 lines of the deployment log, which can help you debug any issues.
After a few minutes, if all has gone well, the Health label will switch from grey to green and your screen will look something like this:

Now, it’s best to have the ability to click on on the Area URL (circled in purple above), and your dashboard ought to seem.

Troubleshooting
The first thing to check if you encounter problems when running your dashboard is that your source data is in the correct location and is referenced correctly in your Streamlit app source code file. If you rule that out as an issue, then you'll more than likely have hit a networking setup problem, and you'll probably see a screen like this.

If that’s the case, right here are some things you may try. Chances are you’ll have to log in to your EC2 occasion and evaluation the logs. In my case, I encountered a difficulty with my pip set up command, which ran out of house to put in all the required packages. To unravel that, I had so as to add additional Elastic Block storage to my occasion.
The extra possible trigger might be a networking situation. In that case, strive some or the entire options under.
VPC Configuration
- Ensure your Elastic Beanstalk environment is deployed in a VPC with at least one public subnet.
- Verify that the VPC has an Internet Gateway attached.
Subnet Configuration
- Confirm that the subnet used by your Elastic Beanstalk environment is public.
- Check that the "Auto-assign public IPv4 address" setting is enabled for this subnet.
Route Table
- Verify that the route table associated with your public subnet has a route to the Internet Gateway (0.0.0.0/0 -> igw-xxxxxxxx).
Security Group
- Review the inbound rules of the security group attached to your Elastic Beanstalk instances.
- Ensure it allows incoming traffic on port 80 (HTTP) and/or 443 (HTTPS) from the appropriate sources.
- Check that outbound rules allow the necessary outgoing traffic.
Network Access Control Lists (NACLs)
- Review the Network ACLs associated with your subnet.
- Ensure they allow both inbound and outbound traffic on the required ports.
Elastic Beanstalk Environment Configuration
- Verify that your environment is using the correct VPC and public subnet in the Elastic Beanstalk console.
EC2 Instance Configuration
- Verify that the EC2 instances launched by Elastic Beanstalk have public IP addresses assigned.
Load Balancer Configuration (if applicable)
- If you use a load balancer, ensure it's configured correctly in the public subnet.
- Check that the load balancer's security group allows incoming traffic and can communicate with the EC2 instances.
Securing your app
As it stands, your deployed app is visible to anyone on the internet who knows your deployed EB domain name. That is probably not what you want. So, what are your options for securing your app on AWS infrastructure?
1/ Lock the security group down to trusted CIDRs
In the console, find the security group associated with your EB deployment and click on it. It should look like this,

Ensure you’re on the Inbound Guidelines TAB, select Edit Inbound Guidelines, and alter the supply IP ranges to your company IP ranges or one other set of IP addresses.
2/ Use private subnets, internal load balancers and NAT gateways
This is a more challenging option to implement and will likely require the expertise of your AWS network administrator or deployment specialist.
3/ Use AWS Cognito and an application load balancer
Again, this is a more complex setup that you'll probably need assistance with if you're not an AWS network guru, but it's perhaps the most robust of them all. The flow is this:
A user navigates to your public Streamlit URL.
The ALB intercepts the request and sees that the user is not logged in or not authenticated.
The ALB automatically redirects the user to Cognito to sign in or create an account. Upon successful login, Cognito redirects the user back to your application URL. The ALB now recognises a valid session and allows the request to proceed to your Streamlit app.
Your Streamlit app only ever receives traffic from authenticated users.
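As a very rough sketch of what the ALB side of this looks like with the AWS CLI (every ARN, the Cognito domain and the target group below are placeholders, and in practice you would more likely set this up through the console or infrastructure-as-code), the listener rule pairs an authenticate-cognito action with a forward action:
aws elbv2 create-rule \
    --listener-arn <your-https-listener-arn> \
    --priority 10 \
    --conditions Field=path-pattern,Values='/*' \
    --actions \
      'Type=authenticate-cognito,Order=1,AuthenticateCognitoConfig={UserPoolArn=<your-user-pool-arn>,UserPoolClientId=<your-app-client-id>,UserPoolDomain=<your-cognito-domain>}' \
      'Type=forward,Order=2,TargetGroupArn=<your-target-group-arn>'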
Summary
In this article, I discussed deploying a Streamlit dashboard application I had previously written to AWS. The original app used PostgreSQL as its data source, and I demonstrated how to switch to using AWS S3 in preparation for deploying the app to AWS.
I discussed deploying the app to AWS using their Elastic Beanstalk service. I described and explained all the extra files required before deployment, including the need for them to be contained in a zip archive.
I then briefly explained the Elastic Beanstalk service and described the detailed steps required to use it to deploy our Streamlit app to AWS infrastructure. I described the multiple input screens that need to be navigated and showed what inputs to use at the various stages.
I highlighted some troubleshooting techniques to use if the app deployment doesn't go as expected.
Finally, I offered some suggestions on how to protect your app from unauthorised access.
For more information on Streamlit, check out their online documentation using the link below.
To find out more about developing with Streamlit, I show how to develop a modern data dashboard with it in the article linked below.