MLE of Normal parameters¶
[1]:
# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
cmd = "pip install --upgrade watermark"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process.communicate()
data_path = "https://s3.amazonaws.com/bebi103.caltech.edu/data/"
else:
data_path = "../data/"
# ------------------------------
import numpy as np
import pandas as pd
Recall our first model for spindle length.
\begin{align} l_i \sim \text{Norm}(\phi, \sigma) \;\;\forall i. \end{align}
For this model, we have already worked out analytically that the maximum likelihood estimates for the parameters \(\phi\) and \(\sigma\) are given by their respective plug-in estimates.
\begin{align} &\phi^* = \hat{\phi}\\[1em] &\sigma^* = \hat{\sigma}. \end{align}
We can directly compute these and use parametric bootstrap to get their confidence intervals. We start by loading in the data and computing the MLEs for \(\phi\) and \(\sigma\).
[2]:
df = pd.read_csv(os.path.join(data_path, 'good_invitro_droplet_data.csv'), comment='#')
spindle_length = df['Spindle Length (um)'].values
phi_mle = np.mean(spindle_length)
sigma_mle = np.std(spindle_length)
print('φ MLE:', phi_mle, 'µm')
print('σ MLE:', sigma_mle, 'µm')
φ MLE: 32.86402985074626 µm
σ MLE: 4.784665043782949 µm
Now we use parametric bootstrap to compute the confidence intervals. First, we’ll write functions to perform the bootstrapping.
[3]:
rg = np.random.default_rng()
def draw_parametric_bs_reps(data, phi, sigma, size=1):
"""Parametric bootstrap replicates of parameters of
Normal distribution."""
bs_reps_phi = np.empty(size)
bs_reps_sigma = np.empty(size)
for i in range(size):
bs_sample = np.random.normal(phi, sigma, size=len(data))
bs_reps_phi[i] = np.mean(bs_sample)
bs_reps_sigma[i] = np.std(bs_sample)
return bs_reps_phi, bs_reps_sigma
Now we can get the bootstrap replicates.
[4]:
bs_reps_phi, bs_reps_sigma = draw_parametric_bs_reps(
spindle_length, phi_mle, sigma_mle, size=100000
)
With the bootstrap replicates in hand, we can compute the 95% confidence intervals using percentiles.
[5]:
print('φ conf int:', np.percentile(bs_reps_phi, [2.5, 97.5]))
print('σ conf int:', np.percentile(bs_reps_sigma, [2.5, 97.5]))
φ conf int: [32.50257673 33.22919193]
σ conf int: [4.5262453 5.03866745]
So, we have estimated the parameters under this model.
Computing environment¶
[6]:
%load_ext watermark
%watermark -v -p numpy,pandas,jupyterlab
CPython 3.8.5
IPython 7.19.0
numpy 1.19.2
pandas 1.1.3
jupyterlab 2.2.6