BE/Bi 103 b: Statistical Inference in the Biological Sciences
In the prequel to this course, we developed tools to build data analysis pipelines, including the organization, preservation, sharing, and display of quantitative data. We also learned basic techniques in statistical inference using resampling methods, taking a frequentist approach.
In this class, we go deeper into statistical modeling and inference, mostly taking a Bayesian approach. We discuss generative modeling, parameter estimation, model comparison, hierarchical modeling, Markov chain Monte Carlo, graphical display of inference results, and principled workflows. All of these topics are explored through analysis of real biological data sets.
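As a small taste of what is to come, here is a minimal, hypothetical sketch (not drawn from the course materials) of the kind of calculation behind the early lessons on plotting posteriors and numerical quadrature: evaluating and normalizing a posterior for a single parameter on a grid. All parameter values and variable names here are invented for illustration.

```python
# Hypothetical illustration of a simple Bayesian calculation; all
# parameter values and variable names are invented for this sketch.
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(3252)

# Generative model: measurements are Normal(mu, sigma) with sigma known
mu_true, sigma = 10.0, 2.0
data = rng.normal(mu_true, sigma, size=50)

# Evaluate the log posterior for mu on a grid, using a broad Normal prior
mu_grid = np.linspace(8.0, 12.0, 400)
log_prior = st.norm.logpdf(mu_grid, loc=10.0, scale=10.0)
log_like = np.array(
    [st.norm.logpdf(data, loc=mu, scale=sigma).sum() for mu in mu_grid]
)
log_post = log_prior + log_like

# Exponentiate (shifting by the max for numerical stability) and
# normalize by simple quadrature on the uniform grid
post = np.exp(log_post - log_post.max())
post /= post.sum() * (mu_grid[1] - mu_grid[0])
```

In the course itself, calculations like this are scaled up to multiple parameters (where grids give way to Markov chain Monte Carlo) and applied to real biological data sets.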
If you are enrolled in the course, please read the Course policies below. We will not go over them in detail in class, and it is your responsibility to understand them.
Useful links
Ed (used for course communications)
Course Zoom link (password protected)
“200 Broad” Gather link (password protected)
Video recordings (password protected)
Google doc for help queue (password protected)
Homework solutions (password protected)
During lab and homework help sessions, we will break out into different Zoom sessions headed by various course staff members. Their individual Zoom links are accessible below.
People
Instructor
Justin Bois (bois at caltech dot edu)
TAs
Rosita Fu (rfu at caltech dot edu)
Tom Röschinger (troeschi at caltech dot edu)
Ariana Tribby (atribby at caltech dot edu)
Julian Wagner (jwagner2 at caltech dot edu)
- 0. Preparing for the course
- 1. Probability and the logic of scientific reasoning
- 2. Plotting posteriors
- 3. Marginalization by numerical quadrature
- 4. Conjugacy
- E1. To be completed after lesson 4
- 5. Introduction to Bayesian modeling
- 6. Parameter estimation by optimization
- E2. To be completed after lesson 6
- 7. Introduction to Markov chain Monte Carlo
- 8. AWS setup and usage
- 9. Introduction to MCMC with Stan
- 10. Mixture models and label switching with MCMC
- 11. Regression with MCMC
- E3. To be completed after lesson 11
- 12. Display of MCMC results
- 13. Model building with prior predictive checks
- 14. Posterior predictive checks
- E4. To be completed after lesson 14
- 15. Collector’s box of distributions
- 16. MCMC diagnostics
- 17. A diagnostics case study: Artificial funnel of hell
- E5. To be completed after lesson 17
- 18. Model comparison
- 19. Model comparison in practice
- E6. To be completed after lesson 19
- 20. Hierarchical models
- 21. Implementation of hierarchical models
- E7. To be completed after lesson 21
- 22. Principled analysis pipelines
- 23. Simulation-based calibration and related checks in practice
- E8. To be completed after lesson 23
- 24. Introduction to Gaussian processes
- 25. Implementation of Gaussian processes
- E9. To be completed after lesson 25
- 26. Variational Bayesian inference
- 27. Wrap-up
- R1. Review of MLE
- R2. Review of probability
- R3. Choosing priors
- R4. Stan installation and use of AWS
- R5. A Bayesian modeling case study: Ant traffic jams
- R6. Practice model building
- R7. Introduction to Hamiltonian Monte Carlo
- R8. Discussion of HW 10 project proposals
- R9. Sampling discrete parameters with Stan
- 0. Configuring your team
- 1. Intuitive generative modeling
- 2. Analytical and graphical methods for analysis of the posterior
- 3. Maximum a posteriori parameter estimation
- 4. Sampling with MCMC
- 5. Inference with Stan I
- 6. Practice building and assessing Bayesian models
- 7. Model comparison
- 8. Hierarchical models
- 9. Principled pipelines and hierarchical modeling of noise
- 10. The grand finale
- 11. Course feedback