(c) 2017 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained therein is licensed under an MIT license.
This homework was generated from a Jupyter notebook. You can download the notebook here.
# Our numerical workhorses
import numpy as np
# Scikit-image submodules
import skimage.exposure
import skimage.filters
import skimage.io
import skimage.morphology
import bebi103
import bokeh.io
import bokeh.models
import bokeh.palettes
bokeh.io.output_notebook()
To answer this question, we will take a common-sense approach and load in what we commonly think of as an image, a TIFF (Tagged Image File Format) file. We will use scikit-image to load the image; we will cover this package in depth in the coming weeks.
# Load the image
im = skimage.io.imread('../data/HG104_bf_uneven_lighting.tif')
# What is it?
im
So, a digital image is an array of numbers. We have stored it as a NumPy array. What do these numbers represent?
A pixel is a count of photons that struck a particular portion of a charge coupled device (CCD).
A pixel is not a square. A pixel has no width or height. Yes, typically the region of the CCD for which a pixel is recorded is square, but a photon can strike anywhere within that square, and the count for that pixel is incremented. We do not know where on the region of the CCD the photons struck, so we define the location of the pixel as the centroid of the CCD region (or the center of the square).
So, a pixel has a corresponding location on the CCD that can be indexed. Usually when we say "pixel," we mean to specify the indexed location on the CCD and associated photon count.
The pixels are separated by a physical distance in the camera. Combined with the optics, this corresponds to an interpixel distance, which is the physical distance between pixels in a digital image. This is commonly referred to as pixel width, but there is no such thing as pixel width, so we will use the term interpixel distance. The interpixel distance for the image we just loaded, as calibrated for this microscope, is 63.6 nm, or 0.0636 µm.
interpixel_distance = 0.0636 #µm
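As a quick check of what this calibration means, we can convert pixel indices into physical positions. (The image shape below is hypothetical, just for illustration; for the real image, use im.shape.)

```python
import numpy as np

# Interpixel distance for this microscope (µm), from the calibration above
interpixel_distance = 0.0636

# A hypothetical image shape; use im.shape for the actual image
n_rows, n_cols = 512, 512

# Physical positions of the pixel centers along each axis (µm)
x = np.arange(n_cols) * interpixel_distance
y = np.arange(n_rows) * interpixel_distance

# Physical extent of the field of view (µm)
print(x[-1], y[-1])
```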
Importantly, from this look at pixels, we have learned what a digital image is. A digital image is data.
Notice that the image has dtype uint16. This means that it is an array of 16-bit integers. A bit is either a one or a zero, and the bit depth is the number of bits necessary to represent all of the integer photon counts that the CCD in your device can detect. For a 16-bit image, then, the total number of possible pixel values is $2^{16} = 65,536$. So, the pixel values range from 0 to 65,535.
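We can verify these ranges directly with NumPy's np.iinfo, which reports the limits of each integer dtype.

```python
import numpy as np

# The number of representable values doubles with each additional bit
for bits, dtype in [(8, np.uint8), (16, np.uint16)]:
    info = np.iinfo(dtype)
    # e.g., uint8: 2**8 = 256 values, ranging 0 to 255
    print(bits, 2**bits, info.min, info.max)
```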
Most cameras that you have used are 8-bit cameras. This is because the human eye cannot discern differences finer than about 200 intensity levels. Eight bits gives us 256 intensity levels.
However, images are data, so we need not be able to discern differences using our eyes.
The particular image we just loaded is actually not 16-bit. Let's look at the maximum value.
np.max(im)
The maximum value is 1023, which is the maximum value we can obtain with a 10-bit camera, since $2^{10} = 1024$. It is stored as a 16-bit integer because NumPy has no 10-bit integer type; 16 bits is the smallest standard integer size that holds 10-bit values.
An image is data, and that's how we should view it. Sometimes, it is useful to look at an image. This "looking" is different from looking at pictures of your family or the Grand Canyon. When we look at scientific digital images, we are plotting them. As with any other data, there are many ways to plot them. We'll start with the plot that is probably most familiar to you: just looking at the grayscale image.
# Set up color mapper to be gray scale
color_mapper = bokeh.models.LinearColorMapper(bokeh.palettes.gray(256))
# Create plot
p = bebi103.viz.imshow(im,
color_mapper=color_mapper,
interpixel_distance=interpixel_distance,
length_units='µm')
bokeh.io.show(p)
We recognize the image as bacteria along with some illumination and noise problems. But, this is not the only way to plot the image!
A lookup table is a specification of color for a given pixel value. In the first plot we made of the image, we used a lookup table where pixels with high intensity were displayed as white and those with low intensity as black. We can define whatever lookup table we like. For example, we could make high-intensity pixels yellow and low-intensity pixels blue, with intermediate ones green. This is accomplished with the excellent perceptual colormap Viridis, which is actually the default lookup table of the bebi103.viz.imshow() function.
p = bebi103.viz.imshow(im,
interpixel_distance=interpixel_distance,
length_units='µm')
bokeh.io.show(p)
Here, we see that most pixels are blue, and there are a few yellow ones. This is indicative of salt-and-pepper noise, in which certain pixels have very high intensities. This is more easily seen if we plot the image in yet another way.
The histogram of an image is just a histogram of pixel intensity counts. It is another useful way of looking at images.
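For integer-valued images, the idea is simple enough to sketch with NumPy alone: np.bincount tallies how many pixels take each value. (The tiny image below is made up for illustration; below we use skimage.exposure.histogram on the real image.)

```python
import numpy as np

# A tiny hypothetical integer image
im = np.array([[0, 1, 1],
               [2, 1, 0]], dtype=np.uint16)

# The histogram is just the count of each pixel value
counts = np.bincount(im.ravel())
print(counts)  # [2 3 1]: two 0s, three 1s, one 2
```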
# Get the histogram data
hist, bins = skimage.exposure.histogram(im)
# Show the plot as a fill-between
p = bebi103.viz.fill_between(bins, hist, [0, bins.max()], [0, 0], fill_alpha=0.3)
bokeh.io.show(p)
We see that most pixels lie below 400, and we have a few above 1000. Similarly, we have some pixels with values below 175.
We might suspect that the pixel values lying outside of the main peak of the histogram are due to salt-and-pepper noise. We can filter this out by applying a median filter. The concept is simple: for every 3$\times$3 square subregion of the image, replace the center pixel by the median of the subregion.
Remember, an image is data, and our goal is to interpret these data. By filtering, we are helping with this interpretation.
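To make the concept concrete, here is a naive pure-NumPy sketch of a 3$\times$3 median filter (looping over interior pixels only). In practice we use skimage.filters.median, as in the next cell; this loop is just for illustration.

```python
import numpy as np

def median_filter_3x3(im):
    """Naive 3x3 median filter, leaving the one-pixel border unchanged.
    A conceptual sketch only; use skimage.filters.median in practice."""
    out = im.copy()
    for i in range(1, im.shape[0] - 1):
        for j in range(1, im.shape[1] - 1):
            # Replace the center pixel with the median of its 3x3 neighborhood
            out[i, j] = np.median(im[i-1:i+2, j-1:j+2])
    return out

# A lone hot pixel (salt noise) in a flat image gets replaced by the median
im = np.full((5, 5), 100, dtype=np.uint16)
im[2, 2] = 1023
print(median_filter_3x3(im)[2, 2])  # 100: the hot pixel is removed
```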
# Make the structuring element
selem = skimage.morphology.square(3)
# Perform the median filter
im_filt = skimage.filters.median(im, selem)
# Show filtered image
p = bebi103.viz.imshow(im_filt,
interpixel_distance=interpixel_distance,
length_units='µm')
bokeh.io.show(p)