
    Method of Moments Estimation with Python Code | by Mahmoud Abdelaziz, PhD | Jan, 2025

    By Team_AIBS News | January 9, 2025 | 10 min read


    How to understand and implement the estimator from scratch

    Towards Data Science

    Photograph by Petr Macháček on Unsplash

    Let’s say you are in a customer care center, and you would like to know the probability distribution of the number of calls per minute, or in other words, you want to answer the question: what is the probability of receiving zero, one, two, … etc., calls per minute? You need this distribution in order to predict the probability of receiving different numbers of calls, based on which you can plan how many employees are needed, whether or not an expansion is required, etc.

    In order to make our decision ‘data informed’, we start by collecting data from which we try to infer this distribution, or in other words, we want to generalize from the sample data to the unseen data, which is also known as the population in statistical terms. This is the essence of statistical inference.

    From the collected data we can compute the relative frequency of each value of calls per minute. For example, suppose the collected data over time looks something like this: 2, 2, 3, 5, 4, 5, 5, 3, 6, 3, 4, … etc. This data is obtained by counting the number of calls received every minute. To compute the relative frequency of each value, count the number of occurrences of that value and divide by the total number of observations. This way you will end up with something like the grey curve in the figure below, which is equivalent to the histogram of the data in this example.

    Image generated by the Author
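    As a minimal sketch of the relative-frequency computation described above, the snippet below uses only the eleven call counts quoted in the text (the full data set is not given in the article). The variable names are illustrative.

```python
import numpy as np

# The first few per-minute call counts quoted in the text
calls = np.array([2, 2, 3, 5, 4, 5, 5, 3, 6, 3, 4])

# Distinct values and the number of times each one occurs
values, counts = np.unique(calls, return_counts=True)

# Relative frequency = occurrences of a value / total number of observations
rel_freq = counts / calls.size

for v, f in zip(values, rel_freq):
    print(f"{v} calls/min: relative frequency {f:.3f}")
```

    Each value needs its own estimated frequency, so the number of quantities to estimate grows with the number of distinct values observed.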

    Another option is to assume that each data point is a realization of a random variable (X) that follows a certain probability distribution. This probability distribution represents all the possible values that would be generated if we were to collect this data long into the future, or in other words, it represents the population from which our sample data was collected. Furthermore, we can assume that all the data points come from the same probability distribution, i.e., the data points are identically distributed. Moreover, we assume that the data points are independent, i.e., the value of one data point in the sample is not affected by the values of the other data points. The independent and identically distributed (iid) assumption on the sample data points allows us to proceed mathematically with our statistical inference problem in a systematic and straightforward way. In more formal terms, we assume that a generative probabilistic model is responsible for generating the iid data, as shown below.

    Image generated by the Author

    In this particular example, a Poisson distribution with mean value λ = 5 is assumed to have generated the data, as shown in the blue curve in the figure below. In other words, we assume here that we know the true value of λ, which is generally not known and needs to be estimated from the data.

    Image generated by the Author

    As opposed to the previous method, in which we had to compute the relative frequency of each value of calls per minute (e.g., 12 values to be estimated in this example, as shown in the grey figure above), we now have only one parameter to find, which is λ. Another advantage of this generative model approach is that it is better in terms of generalization from sample to population. The assumed probability distribution can be said to have summarized the data in an elegant way that follows the Occam’s razor principle.

    Before proceeding further into how we find this parameter λ, let’s first show the Python code that was used to generate the above figure.

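    # Import the Python libraries that we will need in this article
    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns
    import math
    from scipy import stats

    # Plot colors used in the code below. The article does not define these
    # constants, so the hex values here are assumed.
    BLUE2 = '#4A81BF'
    GRAY9 = '#BFBEBE'
    ORANGE1 = '#F79747'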

    # Poisson distribution example
    lambda_ = 5
    sample_size = 1000
    data_poisson = stats.poisson.rvs(lambda_, size=sample_size)  # generate data

    # Plot the data histogram vs the PMF
    # Include the maximum observed value (np.arange excludes its stop value)
    x1 = np.arange(data_poisson.min(), data_poisson.max() + 1)
    fig1, ax = plt.subplots()
    ax.bar(x1, stats.poisson.pmf(x1, lambda_),
           label="Poisson distribution (PMF)", color=BLUE2, linewidth=3.0, width=0.3, zorder=2)
    # One unit-width bin per integer; align='left' centers each bar on its integer value
    ax.hist(data_poisson, bins=np.arange(x1[0], x1[-1] + 2), density=True,
            label="Data histogram", color=GRAY9, zorder=1, align='left')

    ax.set_title("Data histogram vs. Poisson true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability')
    ax.legend()
    plt.savefig("Possion_hist_PMF.png", format="png", dpi=800)

    Our problem now is to estimate the value of the unknown parameter λ using the data we collected. This is where we will use the method of moments (MoM) approach that appears in the title of this article.

    First, we need to define what is meant by the moment of a random variable. Mathematically, the kth moment of a discrete random variable (X) is defined as follows:

    $$E(X^k) = \sum_{x} x^k \, P(X = x)$$

    Take the first moment E(X) as an example, which is also the mean μ of the random variable, and assume that our collected data is modeled as N iid realizations x_1, …, x_N of the random variable X. A reasonable estimate of μ is the sample mean, which is defined as follows:

    $$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

    Thus, in order to obtain a MoM estimate of a model parameter that parametrizes the probability distribution of the random variable X, we first write the unknown parameter as a function of one or more of the kth moments of the random variable, then we replace each kth moment with its sample estimate. The more unknown parameters we have in our model, the more moments we need.
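    The "replace the moment with its sample estimate" step can be sketched as follows. The helper `sample_moment` is a hypothetical name introduced here for illustration and is not part of the article's code; the data are the call counts quoted earlier in the text.

```python
import numpy as np

def sample_moment(data, k):
    """Return the k-th sample moment, (1/N) * sum(x_i ** k)."""
    x = np.asarray(data, dtype=float)
    return np.mean(x ** k)

# Call counts quoted earlier in the text
calls = [2, 2, 3, 5, 4, 5, 5, 3, 6, 3, 4]

m1 = sample_moment(calls, 1)  # estimate of E(X)
m2 = sample_moment(calls, 2)  # estimate of E(X^2)

print(f"first sample moment:  {m1:.4f}")
print(f"second sample moment: {m2:.4f}")
```

    A model with one unknown parameter typically needs only `m1`, while a two-parameter model would also use `m2`.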

    In our Poisson model example, this is very simple. The mean of a Poisson random variable equals its parameter, so

    $$\lambda = E(X) \quad \Rightarrow \quad \hat{\lambda} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

    In the next part, we test our MoM estimator on the simulated data we had earlier. The Python code for obtaining the estimator and plotting the corresponding probability distribution using the estimated parameter is shown below.

    # Method of moments estimator using the data (Poisson Dist)
    lambda_hat = sum(data_poisson) / len(data_poisson)

    # Plot the MoM estimated PMF vs the true PMF
    # Include the maximum observed value (np.arange excludes its stop value)
    x1 = np.arange(data_poisson.min(), data_poisson.max() + 1)
    fig2, ax = plt.subplots()
    ax.bar(x1, stats.poisson.pmf(x1, lambda_hat),
           label="Estimated PMF", color=ORANGE1, linewidth=3.0, width=0.3)
    # Shift the true PMF bars by one bar width so the two sets sit side by side
    ax.bar(x1 + 0.3, stats.poisson.pmf(x1, lambda_),
           label="True PMF", color=BLUE2, linewidth=3.0, width=0.3)

    ax.set_title("Estimated Poisson distribution vs. true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability')
    ax.legend()
    #ax.grid()
    plt.savefig("Possion_true_vs_est.png", format="png", dpi=800)

    The figure below shows the estimated distribution versus the true distribution. The distributions are quite close, indicating that the MoM estimator is a reasonable estimator for our problem. In fact, replacing expectations with averages in the MoM estimator implies that the estimator is consistent by the law of large numbers, which is a good justification for using such an estimator.

    Image generated by the Author
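    The consistency argument can be illustrated with a small simulation. This is only a sketch: the seed, the list of sample sizes, and the use of NumPy's `default_rng` generator are choices made here, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
true_lambda = 5.0
sample_sizes = [10, 100, 1_000, 10_000, 100_000]

errors = []
for n in sample_sizes:
    data = rng.poisson(true_lambda, size=n)
    lambda_hat = data.mean()  # MoM estimate: the first sample moment
    errors.append(abs(lambda_hat - true_lambda))
    print(f"N = {n:>6}: lambda_hat = {lambda_hat:.4f}, |error| = {errors[-1]:.4f}")
```

    The error is not guaranteed to shrink at every step for a single run, but it tends toward zero as N grows, which is what consistency means.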

    Another MoM estimation example is shown below, assuming the iid data is generated by a normal distribution with mean μ and variance σ².

    Image generated by the Author

    In this particular example, a Gaussian (normal) distribution with mean μ = 10 and standard deviation σ = 2 is assumed to have generated the data. The histogram of the generated data sample (sample size = 1000) is shown in grey in the figure below, while the true distribution is shown in the blue curve.

    Image generated by the Author

    The Python code that was used to generate the above figure is shown below.

    # Normal distribution example
    mu = 10
    sigma = 2
    sample_size = 1000
    data_normal = stats.norm.rvs(loc=mu, scale=sigma, size=sample_size)  # generate data

    # Plot the data histogram vs the PDF
    x2 = np.linspace(data_normal.min(), data_normal.max(), sample_size)
    fig3, ax = plt.subplots()
    ax.hist(data_normal, bins=50, density=True, label="Data histogram", color=GRAY9)
    ax.plot(x2, stats.norm(loc=mu, scale=sigma).pdf(x2),
            label="Normal distribution (PDF)", color=BLUE2, linewidth=3.0)

    ax.set_title("Data histogram vs. true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability density')
    ax.legend()
    ax.grid()

    plt.savefig("Normal_hist_PMF.png", format="png", dpi=800)

    Now, we would like to use the MoM estimator to find an estimate of the model parameters, i.e., μ and σ², as shown below.

    $$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu})^2$$

    In order to test this estimator using our sample data, we plot the distribution with the estimated parameters (orange) in the figure below, versus the true distribution (blue). Again, it can be seen that the distributions are quite close. Of course, in order to quantify the quality of this estimator, we need to test it on multiple realizations of the data and observe properties such as bias, variance, etc. Such important aspects have been discussed in an earlier article: Bias Variance Tradeoff in Parameter Estimation with Python Code | by Mahmoud Abdelaziz, PhD | Medium

    Image generated by the Author
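    As a rough sketch of what testing on multiple realizations could look like, the simulation below draws many small normal samples and averages the MoM estimates. The seed, the sample size of 10, and the number of trials are assumptions made here for illustration. It relies on the known result that the divide-by-N variance estimator has expected value σ²(N − 1)/N, i.e. a bias of −σ²/N.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is reproducible
mu, sigma = 10.0, 2.0
n, n_trials = 10, 20_000        # small samples make the bias visible

# Each row is one realization of the data set
samples = rng.normal(mu, sigma, size=(n_trials, n))

mu_hats = samples.mean(axis=1)  # MoM mean estimate per realization
var_hats = samples.var(axis=1)  # ddof=0 divides by N: the MoM variance estimate

bias_mu = mu_hats.mean() - mu
bias_var = var_hats.mean() - sigma**2
expected_bias_var = -sigma**2 / n

print(f"empirical bias of mu_hat:  {bias_mu:+.4f}")
print(f"empirical bias of var_hat: {bias_var:+.4f} (theory: {expected_bias_var:+.4f})")
print(f"empirical variance of mu_hat: {mu_hats.var():.4f} (theory: {sigma**2 / n:.4f})")
```

    With 1000 data points, as in the article's example, the bias of the variance estimate is only σ²/1000, which is why it is not visible in the plot.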

    The Python code that was used to estimate the model parameters using MoM, and to plot the above figure, is shown below.

    # Method of moments estimator using the data (Normal Dist)
    mu_hat = sum(data_normal) / len(data_normal)  # MoM mean estimator
    var_hat = sum(pow(x - mu_hat, 2) for x in data_normal) / len(data_normal)  # MoM variance estimator
    sigma_hat = math.sqrt(var_hat)  # MoM standard deviation estimator

    # Plot the MoM estimated PDF vs the true PDF
    x2 = np.linspace(data_normal.min(), data_normal.max(), sample_size)
    fig4, ax = plt.subplots()
    ax.plot(x2, stats.norm(loc=mu_hat, scale=sigma_hat).pdf(x2),
            label="Estimated PDF", color=ORANGE1, linewidth=3.0)
    ax.plot(x2, stats.norm(loc=mu, scale=sigma).pdf(x2),
            label="True PDF", color=BLUE2, linewidth=3.0)

    ax.set_title("Estimated Normal distribution vs. true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability density')
    ax.legend()
    ax.grid()
    plt.savefig("Normal_true_vs_est.png", format="png", dpi=800)

    Another useful probability distribution is the Gamma distribution. An example of the application of this distribution in real life was discussed in a previous article. However, in this article, we derive the MoM estimator of the Gamma distribution parameters α and β as shown below, assuming the data is iid.

    Image generated by the Author

    In this particular example, a Gamma distribution with α = 6 and β = 0.5 is assumed to have generated the data. The histogram of the generated data sample (sample size = 1000) is shown in grey in the figure below, while the true distribution is shown in the blue curve.

    Image generated by the Author

    The Python code that was used to generate the above figure is shown below.

    # Gamma distribution example
    alpha_ = 6  # shape parameter
    scale_ = 2  # scale parameter = 1/beta in the gamma distribution
    sample_size = 1000
    data_gamma = stats.gamma.rvs(alpha_, loc=0, scale=scale_, size=sample_size)  # generate data

    # Plot the data histogram vs the PDF
    x3 = np.linspace(data_gamma.min(), data_gamma.max(), sample_size)
    fig5, ax = plt.subplots()
    ax.hist(data_gamma, bins=50, density=True, label="Data histogram", color=GRAY9)
    ax.plot(x3, stats.gamma(alpha_, loc=0, scale=scale_).pdf(x3),
            label="Gamma distribution (PDF)", color=BLUE2, linewidth=3.0)

    ax.set_title("Data histogram vs. true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability density')
    ax.legend()
    ax.grid()
    plt.savefig("Gamma_hist_PMF.png", format="png", dpi=800)

    Now, we would like to use the MoM estimator to find an estimate of the model parameters, i.e., α and β, as shown below. With shape α and rate β, the mean and variance of the Gamma distribution are α/β and α/β², which gives

    $$E(X) = \frac{\alpha}{\beta}, \quad \mathrm{Var}(X) = \frac{\alpha}{\beta^2} \quad \Rightarrow \quad \hat{\alpha} = \frac{\hat{\mu}^2}{\hat{\sigma}^2}, \quad \hat{\beta} = \frac{\hat{\mu}}{\hat{\sigma}^2}$$

    In order to test this estimator using our sample data, we plot the distribution with the estimated parameters (orange) in the figure below, versus the true distribution (blue). Again, it can be seen that the distributions are quite close.

    Image generated by the Author

    The Python code that was used to estimate the model parameters using MoM, and to plot the above figure, is shown below.

    # Method of moments estimator using the data (Gamma Dist)
    sample_mean = data_gamma.mean()
    sample_var = data_gamma.var()  # ddof=0: divides by N
    scale_hat = sample_var / sample_mean  # scale is equal to 1/beta in the gamma distribution
    alpha_hat = sample_mean**2 / sample_var

    # Plot the MoM estimated PDF vs the true PDF
    x4 = np.linspace(data_gamma.min(), data_gamma.max(), sample_size)
    fig6, ax = plt.subplots()

    ax.plot(x4, stats.gamma(alpha_hat, loc=0, scale=scale_hat).pdf(x4),
            label="Estimated PDF", color=ORANGE1, linewidth=3.0)
    ax.plot(x4, stats.gamma(alpha_, loc=0, scale=scale_).pdf(x4),
            label="True PDF", color=BLUE2, linewidth=3.0)

    ax.set_title("Estimated Gamma distribution vs. true distribution", fontsize=14, loc='left')
    ax.set_xlabel('Data value')
    ax.set_ylabel('Probability density')
    ax.legend()
    ax.grid()
    plt.savefig("Gamma_true_vs_est.png", format="png", dpi=800)
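    The text states the Gamma parameters as α and β, while SciPy and NumPy parametrize the distribution by shape and scale = 1/β. The sketch below wraps the same MoM formulas in a hypothetical helper, `gamma_mom` (a name introduced here, not from the article), that returns the rate β directly. The seed and the use of NumPy's generator instead of `scipy.stats` are assumptions made for a self-contained example.

```python
import numpy as np

def gamma_mom(data):
    """MoM estimates (alpha_hat, beta_hat) for a Gamma(shape=alpha, rate=beta) model."""
    x = np.asarray(data, dtype=float)
    m = x.mean()
    v = x.var()           # ddof=0: divides by N
    alpha_hat = m**2 / v  # shape
    beta_hat = m / v      # rate, i.e. 1 / scale
    return alpha_hat, beta_hat

rng = np.random.default_rng(2)  # fixed seed so the run is reproducible
sample = rng.gamma(shape=6.0, scale=2.0, size=1000)  # alpha = 6, beta = 1/2

alpha_hat, beta_hat = gamma_mom(sample)
print(f"alpha_hat = {alpha_hat:.3f} (true 6), beta_hat = {beta_hat:.3f} (true 0.5)")
```

    By construction, `alpha_hat / beta_hat` reproduces the sample mean exactly, which is a quick sanity check on the formulas.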

    Note that we used the following equivalent ways of writing the variance when deriving the estimators in the cases of the Gaussian and Gamma distributions:

    $$\mathrm{Var}(X) = E\big[(X - \mu)^2\big] = E(X^2) - \mu^2$$
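    The two ways of writing the variance, the mean squared deviation from the mean and the second moment minus the squared first moment, also agree for sample moments. A quick numerical check, using an arbitrary seeded normal sample chosen here for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)  # fixed seed so the run is reproducible
x = rng.normal(10.0, 2.0, size=1000)

m1 = x.mean()        # first sample moment
m2 = np.mean(x**2)   # second sample moment

var_centered = np.mean((x - m1) ** 2)  # (1/N) * sum((x_i - m1)^2)
var_moments = m2 - m1**2               # second moment minus squared first moment

print(f"centered form: {var_centered:.6f}")
print(f"moment form:   {var_moments:.6f}")
```

    The two values match up to floating-point rounding, and both equal `x.var()` with its default `ddof=0`.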

    In this article, we explored various examples of the method of moments estimator and its applications to different problems in data science. Moreover, detailed Python code that was used to implement the estimators from scratch, as well as to plot the different figures, is also shown. I hope that you will find this article helpful.


