(c) 2017 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.
This tutorial exercise was generated from a Jupyter notebook. You can download the notebook here.
What is maximum likelihood estimation in the Bayesian context? In what way is it "just a word"?
When performing parameter estimation by optimization, after we find the MAP, why do we locally approximate the posterior near the MAP as Gaussian?
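To make the idea concrete, here is a minimal sketch of that local Gaussian approximation under assumed toy settings: a one-parameter Gaussian likelihood with known σ and a uniform prior, fit to made-up data (the data set, seed, and function names are illustrative, not part of the tutorial). The covariance of the approximate Gaussian posterior is the inverse Hessian of the negative log posterior at the MAP, which the default BFGS minimizer reports as `res.hess_inv`.

```python
import numpy as np
import scipy.optimize
import scipy.stats

# Hypothetical one-parameter model: Gaussian likelihood with known sigma and
# a uniform prior, so the negative log posterior equals the negative log
# likelihood up to an additive constant.
def neg_log_posterior(params, data, sigma=1.0):
    mu = params[0]
    return -np.sum(scipy.stats.norm.logpdf(data, loc=mu, scale=sigma))

# Made-up data standing in for a real data set.
rng = np.random.default_rng(seed=3252)
data = rng.normal(loc=10.0, scale=1.0, size=50)

# Find the MAP by minimizing the negative log posterior (BFGS by default).
res = scipy.optimize.minimize(
    neg_log_posterior, x0=np.array([5.0]), args=(data,)
)
mu_map = res.x[0]

# Local Gaussian approximation: the covariance is the inverse Hessian of the
# negative log posterior at the MAP; BFGS returns an estimate as res.hess_inv.
cov_approx = res.hess_inv
print(mu_map, np.sqrt(cov_approx[0, 0]))
```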
Why couldn't we use scipy.optimize.leastsq() on the Singer data from Tutorial 4?
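As a reminder of the interface, here is a minimal sketch of scipy.optimize.leastsq() on made-up linear data (the model, the data, and names like `resid` are illustrative assumptions, not the Singer data set). It requires a function that returns an array of residuals, which is worth keeping in mind when thinking about the question above.

```python
import numpy as np
import scipy.optimize

# Hypothetical linear model; leastsq needs a function returning residuals.
def resid(params, x, y):
    a, b = params
    return y - (a * x + b)

# Made-up data for illustration only.
rng = np.random.default_rng(seed=3252)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=len(x))

# Least squares fit; with default options, leastsq returns (params, ier).
popt, _ = scipy.optimize.leastsq(resid, x0=np.array([1.0, 0.0]), args=(x, y))
print(popt)
```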
You may have heard that curve fitting involves "minimizing the sum of the squares of the residuals." If indeed finding the MAP is equivalent to minimizing the sum of the squares of the residuals, what underlying assumptions are there in the statistical model?
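As a hint for unpacking that question, consider the special case of independent Gaussian residuals with known, constant variance σ² and uniform priors on the parameters θ (these are assumptions for the sketch, not a statement of the tutorial's model). Under those conditions the log posterior is, up to an additive constant,

$$\ln P(\theta \mid D) = \text{constant} - \frac{1}{2\sigma^2}\sum_i \bigl(y_i - f(x_i;\theta)\bigr)^2,$$

so maximizing the posterior amounts to minimizing the sum of the squares of the residuals.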