Tutorial 3: exercise

(c) 2018 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This document was prepared at Caltech with financial support from the Donna and Benjamin M. Rosen Bioengineering Center.

This tutorial exercise was generated from a Jupyter notebook. You can download the notebook here. Use the downloaded notebook to fill out your responses.

Exercise 1

What is an ROI, and why is it an important concept in digital image processing?


Exercise 2

How would you expect each of the following to be distributed?

a) The amount of time between repressor-operator binding events.

b) The number of times a repressor binds its operator in a given hour.

c) The amount of time (in total minutes of baseball played) between no-hitters in Major League Baseball.

d) The number of no-hitters in a Major League Baseball season.

e) The winning times of the Belmont Stakes.

To answer this question, try to match these stories to the stories of named distributions. For those of you not familiar with baseball, a no-hitter is a game in which a team concedes no hits to the opposing team. There have only been a few hundred no-hitters in over 200,000 MLB games. The Belmont Stakes is a major horse race that has been run each year for over 150 years.
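If a story is hard to picture, one way to build intuition is to simulate it and look at the results before matching them to a named distribution. The following is a minimal sketch, not part of the original exercise: it assumes time is chopped into small steps and that a rare event occurs independently in each step with a fixed probability. The function name `simulate_rare_events` and the values of `p_event` and `window` are hypothetical, chosen only for illustration.

```python
import random


def simulate_rare_events(n_steps, p_event, window, seed=0):
    """Simulate independent rare events on a discrete time grid.

    Returns a tuple (waiting_times, counts_per_window), where
    waiting_times are the numbers of steps between successive events and
    counts_per_window are the numbers of events in each block of
    `window` consecutive steps.
    """
    rng = random.Random(seed)
    waiting_times = []
    counts = [0] * (n_steps // window)
    last_event = None

    for t in range(n_steps):
        if rng.random() < p_event:
            # Record the gap since the previous event, if there was one
            if last_event is not None:
                waiting_times.append(t - last_event)
            last_event = t

            # Tally the event in its window (ignore a trailing partial window)
            i = t // window
            if i < len(counts):
                counts[i] += 1

    return waiting_times, counts


# Illustrative values only: 1% chance of an event per step, windows of 1000 steps
waiting_times, counts = simulate_rare_events(100_000, 0.01, 1000)
print("mean waiting time (steps):", sum(waiting_times) / len(waiting_times))
print("mean events per window:", sum(counts) / len(counts))
```

Plotting histograms of `waiting_times` and `counts` and comparing their shapes with the candidate distributions is a reasonable next step.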


Exercise 3

Say I have three distributions:

  • Exponential, β=1
  • Gaussian, μ=1, σ=1
  • Cauchy, μ=1, σ=1

Say I draw numbers out of each of these distributions. Rank order the distributions, lowest to highest, in terms of how likely I am to draw a number greater than 10. You do not need to calculate anything to answer this question.
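If you want to check your ranking after you have reasoned it out, the three tail probabilities have closed forms. The following is a minimal sketch using only the Python standard library. It assumes that β is the rate of the Exponential (with β=1 the rate and scale parametrizations coincide) and that μ and σ are the location and scale of the Cauchy; the function names are illustrative and do not come from the exercise.

```python
import math


def exponential_tail(x, beta=1.0):
    """P(X > x) for an Exponential with rate beta (valid for x >= 0)."""
    return math.exp(-beta * x)


def gaussian_tail(x, mu=1.0, sigma=1.0):
    """P(X > x) for a Gaussian, via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def cauchy_tail(x, mu=1.0, sigma=1.0):
    """P(X > x) for a Cauchy with location mu and scale sigma."""
    return 0.5 - math.atan((x - mu) / sigma) / math.pi


threshold = 10.0
tails = {
    "Exponential": exponential_tail(threshold),
    "Gaussian": gaussian_tail(threshold),
    "Cauchy": cauchy_tail(threshold),
}

# Print from least to most likely to exceed the threshold
for name, prob in sorted(tails.items(), key=lambda item: item[1]):
    print(f"{name}: {prob:.3e}")
```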


Exercise 4

This is not really an exercise, but a point of curiosity for me. Tell me of an instance, if any, where you were burned by a data set that was not what you thought it was, and where validating the data beforehand would have let you avoid the problem.