Introduction to images

Dataset download


Note: To run this lecture, you will need to have the most up-to-date version of the bebi103 package. To get it, do

pip install --upgrade bebi103

on the command line.


[1]:
import numpy as np

import skimage.exposure
import skimage.filters
import skimage.io
import skimage.morphology

import bebi103

import holoviews as hv
hv.extension('bokeh')

bebi103.hv.set_defaults()

import bokeh.io
bokeh.io.output_notebook()
Loading BokehJS ...

What is an image?

To answer this question, we will take a common sense approach and load in what we commonly think of as an image, a TIFF (Tagged Image File Format) file. We will use scikit-image to load the image, and we will cover this package in depth in the coming weeks.

[2]:
# Load the image
im = skimage.io.imread('../data/HG104_bf_uneven_lighting.tif')

# What is it?
im
[2]:
array([[389, 398, 419, ..., 362, 373, 390],
       [387, 401, 407, ..., 378, 377, 361],
       [392, 402, 404, ..., 368, 367, 372],
       ...,
       [366, 389, 380, ..., 353, 363, 348],
       [383, 382, 377, ..., 355, 352, 363],
       [376, 387, 378, ..., 358, 356, 372]], dtype=uint16)

So, a digital image is an array of numbers. We have stored it as a NumPy array. What do these numbers represent?

Pixels

A pixel is a count of photons that struck a particular portion of a charge coupled device (CCD).

A pixel is not a square. A pixel has no width or height. Yes, typically the region of the CCD for which a pixel is recorded is square, but a photon can strike anywhere within that square, and the count for that pixel is incremented. We do not know where on the region of the CCD the photons struck, so we define the location of the pixel as the centroid of the CCD region (or the center of the square).

So, a pixel has a corresponding location on the CCD that can be indexed. Usually when we say “pixel,” we mean to specify the indexed location on the CCD and associated photon count.

The pixels are separated by a physical distance in the camera. Combined with the optics, this corresponds to an interpixel distance, which is the physical distance between pixels in a digital image. This is commonly referred to as pixel width, but there is no such thing as pixel width, so we will use the term interpixel distance. The interpixel distance for the image we just loaded, as calibrated for this microscope, is 63.6 nm, or 0.0636 µm.

[3]:
ip_distance = 0.0636 # µm
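
With the interpixel distance in hand, we can compute the physical extent of the image from its shape. Here is a quick sketch using the array we loaded above.

# Physical height and width of the field of view in µm (rows and columns times interpixel distance)
height = im.shape[0] * ip_distance
width = im.shape[1] * ip_distance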

Importantly, from this look at pixels, we have learned what a digital image is. A digital image is data.

Bit depth

Notice that the image has dtype uint16. This means that it is an array of 16-bit integers. A bit is either a one or a zero, and the bit depth is the number of bits necessary to represent all of the integer photon counts that the CCD in your device can detect. For a 16-bit image, then, the total number of possible pixel values is \(2^{16} = 65,536\). So, the pixel values range from 0 to 65,535.
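
As a quick check on these numbers, NumPy's np.iinfo() reports the range of values an integer dtype can represent. This is a small sketch using the dtype of the image we loaded.

# Representable range of the image's dtype (0 to 65535 for uint16)
np.iinfo(im.dtype)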

Most cameras that you have used are 8-bit cameras. This is because the human eye cannot discern differences finer than about 200 intensity levels. Eight bits gives us 256 intensity levels.

However, images are data, so we need not be able to discern differences using our eyes.

The particular image we just loaded is actually not 16-bit. Let’s look at the maximum value.

[4]:
np.max(im)
[4]:
1023

The maximum value is 1023, which is the maximum value we can obtain with a 10-bit camera, since \(2^{10} = 1024\). It is stored as a 16-bit integer because pixel values are conventionally stored as 8-bit or 16-bit unsigned integers, so a 10-bit image gets placed in the next-largest standard type, uint16.
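
If we wanted to estimate the effective bit depth directly from the data, we could count the number of bits needed to hold the largest pixel value. This is a sketch that assumes the brightest pixel actually reaches the top of the camera's range.

# Smallest number of bits that can represent the maximum pixel value (10 for this image)
int(np.ceil(np.log2(np.max(im) + 1)))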

“Plotting” images

An image is data, and that's how we should view it. Sometimes, it is useful to look at an image. This "looking" is different from looking at pictures of your family or the Grand Canyon. When we look at scientific digital images, we are plotting them. As with any other data, there are many ways to plot them. We'll start with a plot that is probably most familiar to you: just looking at the grayscale image. We don't even need Python to do this; we can use Markdown.

[Image: HG105_bf_uneven_lighting, displayed directly in Markdown]

Wait a minute! That looks like blackness. That is because this is a 10-bit image stored in a 16-bit container. A viewer that assumes the full 16-bit range renders pixel values that top out at 1023 out of 65,535, so everything appears nearly black.
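
If we really did want a version of the image that displays sensibly outside of Python, one approach is to stretch the intensities to fill the full range and save it as an 8-bit image. This is a hedged sketch; the output file name is made up.

import skimage.util

# Stretch intensities to fill the full range, then convert to an 8-bit image
im_8bit = skimage.util.img_as_ubyte(skimage.exposure.rescale_intensity(im))

# Save the result (hypothetical file name)
skimage.io.imsave('rescaled_image.tif', im_8bit)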

In general, you need not really bother looking at images natively on your operating system, especially if they vary in bit depth. Instead, let’s use HoloViews to view the image.

[5]:
hv.Image(im)
[5]:

We recognize the image as bacteria along with some illumination and noise problems. This is better, but a bit confusing because of the axes and the color. With respect to the axes, HoloViews by default has the axes going from -1/2 to 1/2. We would rather have the axes marked in units of microns so we know the physical distances. We can specify that with the bounds kwarg for hv.Image(), which consists of a list of numerical values to use for the [left, bottom, right, top] of the image. We can make that for this image using our value of the interpixel distance.

[6]:
bounds = [0, 0, im.shape[1] * ip_distance, im.shape[0] * ip_distance]

hv.Image(
    data=im,
    bounds=bounds
).opts(
    xlabel='µm',
    ylabel='µm',
)
[6]:

Note that the units of the axes are now in physical space units. This is better than a scale bar. No matter how you zoom in on the image, axes are always present, so you always know the physical size of the image.

The default HoloViews aspect ratio does not result in square pixels. We can fix this by manually setting the aspect option to be the ratio of the width to height of the image in pixels.

[7]:
hv.Image(
    data=im,
    bounds=bounds
).opts(
    aspect=im.shape[1]/im.shape[0],
    xlabel='µm',
    ylabel='µm',
)
[7]:

Now we have a nicely plotted image. Next, let's discuss the coloring.

False coloring (lookup table)

A lookup table is a specification of color for a given pixel value. HoloViews by default uses a magma color map to connect pixel intensity values to color; here we use Viridis, which is the default we set in our call to bebi103.hv.set_defaults().

We can define whatever lookup table we like, but we should use perceptual color maps. We can even use gray scale if we like, though it is harder for the human eye to parse than color. Again, we are mapping color to quantitative data. So, there really is nothing false at all about “false” coloring.

[8]:
hv.Image(
    data=im,
    bounds=bounds
).opts(
    aspect=im.shape[1]/im.shape[0],
    cmap='gray',
    xlabel='µm',
    ylabel='µm',
)
[8]:

Quick image plots with bebi103

The bebi103.image module contains useful utilities for image processing, including quickly plotting images. We can make the same plot (with default Viridis lookup table) using that module.

[9]:
p = bebi103.image.imshow(
    im=im,
    interpixel_distance=ip_distance,
    length_units='µm')
bokeh.io.show(p)

Here, we see that most pixels are blue, and there are a few yellow ones. This is indicative of salt and pepper noise, where isolated pixels have anomalously high (or low) intensities. This is more easily seen if we plot the image in yet another way.

Histograms

The histogram of an image is just a histogram of pixel intensity counts. The counts are acquired using skimage.exposure.histogram(), and the result is conveniently displayed as spikes. It is another useful way of plotting image data.

[10]:
# Get the histogram data
counts, pixel_vals = skimage.exposure.histogram(im)

# Plot using spikes
hv.Spikes(
    data=(pixel_vals, counts),
    kdims=['pixel values'],
    vdims=['count'],
)
[10]:

We see that most pixels lie below 400, and we have a few above 1000. Similarly, we have some pixels with values below 175.
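
To put numbers on these observations, we can count the pixels in the tails of the histogram. This is a small sketch; the cutoffs of 175 and 1000 are simply read off the plot.

# Number of pixels in the bright and dark tails of the histogram
n_bright = np.sum(im > 1000)
n_dark = np.sum(im < 175)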

Basic image filtering

We might suspect that the pixel values lying outside of the main peak of the histogram are due to salt and pepper noise. We can filter this out by doing a median filter. The concept here is simple. For every \(3 \times 3\) square subregion of the image, replace the center pixel by the median of the subregion.

Remember, an image is data, and our goal is to interpret these data. By filtering, we are helping with this interpretation.
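
To make the idea concrete, here is a minimal pure-NumPy sketch of a \(3 \times 3\) median filter. It leaves the edge pixels untouched and is far slower than the scikit-image implementation we use below.

def median_filter_3x3(im):
    """Median filter with a 3×3 square neighborhood, ignoring the image edges."""
    out = im.copy()
    for i in range(1, im.shape[0] - 1):
        for j in range(1, im.shape[1] - 1):
            # Replace the center pixel with the median of its 3×3 neighborhood
            out[i, j] = np.median(im[i - 1:i + 2, j - 1:j + 2])
    return out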

[11]:
# Make the structuring element
selem = skimage.morphology.square(3)

# Perform the median filter
im_filt = skimage.filters.median(im, selem)

# Show filtered image
p = bebi103.image.imshow(
    im_filt,
    interpixel_distance=ip_distance,
    length_units='µm'
)
bokeh.io.show(p)

What a difference! Further, the effects of the uneven illumination are now even more pronounced. We can deal with this by background subtraction. To do this, we compute a strongly blurred version of the image, which serves as an estimate of the background, and subtract it from the whole image. The blurring is a moving average over the image; we will take a Gaussian-weighted average, performing a Gaussian blur.

[12]:
# Convert the image to floating point, normalized by its maximum value
im_float = im_filt / im_filt.max()

# Strongly blur the image with a Gaussian filter with stdev of 20 pixels
im_blur = skimage.filters.gaussian(im_float, 20)

# Subtract the blurred image from the original
im_new = im_float - im_blur

# Display images in false color
p = bebi103.image.imshow(
    im_new,
    interpixel_distance=ip_distance,
    length_units='µm')
bokeh.io.show(p)

Other filtering methods

There are lots and lots of ways to filter images, but we will not go into those here. The obscenely priced book by Gonzalez and Woods is a good reference for these methods. I do not encourage buying it, though, because its price is offensive. There are plenty of good resources online.

What is an image more precisely?

We have talked about the end product of an image acquisition, a digital image. We have covered some basic operations of image display and filtering. But we still have not precisely defined what we mean by “image.” We just said that “an image is data,” but more precisely, a digital image is data. Let’s try for a more precise definition.

There are three things we commonly call “image.” We will only call one of them “image” and the other two will get different names.

  1. Digital image: What we have just described, the digital image.

  2. Image: What actually comes out of the optics before the CCD detector. This is what we are trying to store as a digital image.

  3. Object function: The scene. That is, the way light is reflected or emitted from an object. This is what we use the optics of our microscope, telescope, camera, etc., to collect.

We have already discussed how the digital image is acquired from the image. How are the image and the object function related?

\begin{align} \text{Image} = \text{PSF} * \text{object function} + \text{noise}. \end{align}

PSF stands for point spread function, which is the way information on the object function is spread as a result of recording data. This is deterministic and characteristic of the imaging system. Noise is nondeterministic and can, at best, be described with a statistical distribution. Finally, the \(*\) operator is convolution. If \(g(x,y)\) is the noiseless image, \(f(x',y')\) is the object function, and \(h(x, y;x', y')\) is the point spread function,

\begin{align} g(x,y) = (f*h)(x, y) = \int_{-\infty}^\infty \mathrm{d}x'\,\mathrm{d}y'\, f(x',y')\,h(x,y;x',y'). \end{align}

We can get a better feel for this with an illustration.

Image from https://en.wikipedia.org/wiki/Point_spread_function#/media/File:Convolution_Illustrated_eng.png, freely licensed.
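
We can also illustrate the imaging equation with a small simulation: take a toy object function consisting of a single bright point, convolve it with a Gaussian point spread function, and add noise. This is only a sketch; the PSF width and noise scale are made up for illustration.

# Toy object function: one bright point on a dark background
obj = np.zeros((64, 64))
obj[32, 32] = 1.0

# Convolve with a Gaussian PSF (σ = 3 pixels) and add Gaussian noise
rg = np.random.default_rng()
im_sim = skimage.filters.gaussian(obj, 3) + rg.normal(0, 0.001, size=obj.shape)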

If we can independently measure the point spread function exactly, we know \(h(x,y;x',y')\) and can perform a deconvolution, where we recover something close to the object function from an image. PSFs are usually measured for a microscope with a point source from, e.g., a quantum dot. In many cases, deconvolution is done automatically in the acquisition software of the instrument.
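
As an example of what a deconvolution might look like in code, scikit-image provides Richardson-Lucy deconvolution in its restoration module. This is a hedged sketch that reuses the simulated image and Gaussian PSF from above; a real analysis would use a measured PSF.

import skimage.restoration

# Build the PSF as an image of the same Gaussian used to blur the toy object
psf = np.zeros((25, 25))
psf[12, 12] = 1.0
psf = skimage.filters.gaussian(psf, 3)
psf /= psf.sum()

# Richardson-Lucy assumes nonnegative intensities, so clip off negative noise
im_nonneg = np.clip(im_sim, 0, None)

# Attempt to recover the object function from the simulated image
obj_estimate = skimage.restoration.richardson_lucy(im_nonneg, psf, 30)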

Image processing in BE/Bi 103

We will learn some basic image processing techniques in this class. In particular, we will learn segmentation techniques, in which objects of interest are separated from the background. We will use more "classic" methods, like filtering, (adaptive) thresholding, and edge detection, though we will omit other useful approaches such as wavelet-based methods, watershed algorithms, and optical flow. Importantly, we will also omit machine learning-based methods, which have now become the state of the art. Nonetheless, having some basic image processing skills can be a very large first step in extracting useful information from images, as you will see as you work through homework problems dealing with image processing.

Computing environment

[ ]:
%load_ext watermark
%watermark -v -p numpy,skimage,bokeh,holoviews,bebi103,jupyterlab