Homework 1.3: Validating image files (50 pts)

In live microscope imaging, we often take time courses. In one such experiment, I injected half-micron fluorescent beads into a stage 9 Drosophila oocyte (see Box 2 of this paper by Becalska and Gavis for an illustration). I performed this experiment on a Leica SP2 microscope at the ALMS at UCLA. The data from this experiment, as they were saved from the computer connected to the microscope, are in the ZIP file https://s3.amazonaws.com/bebi103.caltech.edu/data/leica_tiffs.zip. (Be sure to put the ZIP file in the ../data/ directory and unzip it there so that the folder is ../data/leica_tiffs/.)

In this experiment, I first took a DIC image of the egg chamber, which is saved as stage9_Image003_ch00.tif. The remainder of the image files in the directory are from a time course of 125 images.

a) The naming convention the Leica software used for storing the images is

name_Series###_t##_z###_ch##.tif

Let’s break down the respective fields in the file name, or, as podcasters annoyingly like to say, preceded by a pause and followed by filler music, “Let’s unpack this.”

  • name is a user-supplied name for the images.

  • Series### gives the number of the series of images. In this case, it’s Series006.

  • The t## field gives the number of the image in the time series. For example, t00 is the first image, and t05 is the sixth. Importantly, though, the number of digits after the t can grow; t105 is the 106th image.

  • The z### field gives the level of the z-stack. In this case, I was only imaging in one plane, so it’s z000 for all images.

  • The ch## field gives the number of the fluorescent channel. I used a single channel here, so it’s ch00 for all images.

Your first task in parsing this directory of images is to create a list of file names where the file names are in order of the time course. You should also identify any dropped (missing) frames.
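One possible starting point is sketched below. It is not the intended solution, just an illustration of why the t## field should be compared as an integer: a plain alphabetical sort would place t100 before t11. The function names time_index, sort_by_time, and missing_frames are made up for this sketch, as are the example file names.

```python
import re

# Matches the Leica time-course naming convention described above.
# The t field is captured with \d+ because its number of digits can grow.
PATTERN = re.compile(r"_Series(\d+)_t(\d+)_z(\d+)_ch(\d+)\.tif$")


def time_index(fname):
    """Return the integer in the t field, or None if fname does not match."""
    match = PATTERN.search(fname)
    return int(match.group(2)) if match else None


def sort_by_time(fnames):
    """Keep only time-course files and sort them numerically by t field."""
    time_course = [f for f in fnames if time_index(f) is not None]
    return sorted(time_course, key=time_index)


def missing_frames(fnames):
    """Return time indices absent between 0 and the largest index present.

    Note: frames dropped after the last file present cannot be detected
    this way; compare against the expected frame count for that.
    """
    present = {time_index(f) for f in fnames} - {None}
    return [t for t in range(max(present) + 1) if t not in present]


# Made-up file names for illustration only
example = [
    "stage9_Series006_t10_z000_ch00.tif",
    "stage9_Series006_t02_z000_ch00.tif",
    "stage9_Image003_ch00.tif",
    "stage9_Series006_t100_z000_ch00.tif",
    "stage9_Series006_t00_z000_ch00.tif",
]
ordered = sort_by_time(example)
```

In practice, the list of file names could come from something like glob.glob('../data/leica_tiffs/*.tif'), assuming the directory layout described above.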

b) The file stage9.txt contains metadata generated by the instrument about the images. Importantly, it has the time stamps of the images. In my analysis of these images, I was working under the assumption that the time between images was the same for each pair of successive images in the experiment. First, look at the metadata file so that you understand its structure. Then programmatically parse out the time stamps and compute the time between successive frames to check whether that assumption holds for all image pairs.

Hint: The metadata file is not Unicode (UTF-8) encoded. You will need to use the encoding keyword argument of open() to read it. The file is most likely CP-1252 encoded, so you can use encoding='cp1252'.
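As a rough sketch of the two mechanical pieces involved, the code below reads a text file with an explicit encoding and checks whether a list of time stamps is evenly spaced. It deliberately does not parse stage9.txt, since that depends on the structure you find when you inspect the file. The function names, the tolerance, and the demo file contents are all assumptions made for illustration, and the time stamps are assumed to have already been converted to seconds.

```python
import math
import os
import statistics
import tempfile


def read_text(path, encoding="cp1252"):
    """Read an entire text file using an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def frame_intervals(time_stamps):
    """Return the differences between successive time stamps."""
    return [t1 - t0 for t0, t1 in zip(time_stamps[:-1], time_stamps[1:])]


def evenly_spaced(time_stamps, rel_tol=1e-3):
    """Check that every interval is close to the median interval.

    rel_tol is an arbitrary choice for this sketch, not a recommendation.
    """
    dt = frame_intervals(time_stamps)
    median_dt = statistics.median(dt)
    return all(math.isclose(d, median_dt, rel_tol=rel_tol) for d in dt)


# Made-up demo file: byte 0xB5 is the micro sign in CP-1252, but it is not
# valid on its own in UTF-8, so decoding this file as UTF-8 would fail.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "wb") as f:
        f.write(b"pixel size: 0.5 \xb5m\n")
    demo_text = read_text(demo_path)

# Made-up time stamps in seconds: the last interval is twice as long
uniform = evenly_spaced([0.0, 1.0, 2.0, 3.0])
irregular = evenly_spaced([0.0, 1.0, 2.0, 4.0])
```

Looking at the full list returned by frame_intervals, rather than only the yes/no answer, makes it easier to see where any irregular interval falls in the time course.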

This is an example of data validation. We need to make sure the images are in the proper order and that the time stamps are appropriate. We will talk more about data validation, and maybe even have another homework problem on it, later in the term.

By not doing these things when I first did these experiments, I wasted over a month of time trying to figure out what was wrong with the microscope until I discovered I was not properly parsing the time points of the experiment. You get to learn from my mistake in this problem.