HW 3.1: Beetle hypnotists (30 pts)

Data set download


The Parker lab at Caltech studies rove beetles that can infiltrate ant colonies. In one of their experiments, they place a rove beetle and an ant in a circular arena and track the movements of the ant. They do this by using a deep learning algorithm to identify the ant's head, thorax, abdomen, and right and left antennae. While deep learning applied to biological images is a beautiful and useful topic, we will not cover it in this course (be on the lookout for future courses that do!). We will instead work with a data set that is the output of the deep learning algorithm.

For the experiment you are considering in this problem, an ant and a beetle were placed in a circular arena and recorded with video at a frame rate of 28 frames per second. The positions of the body parts of the ant were tracked throughout the video recording. You can download the data set here. Pro tip: Pandas’s read_csv() function will automatically load a zip file, so you do not need to unzip it. Be sure to use the comment='#' kwarg, though, since there are header comments at the top of the data file.
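As a minimal sketch of the pro tip above, the snippet below writes a tiny zipped CSV with header comments and reads it back with read_csv(). The file name toy_ant_data.csv.zip and the data values are made up for illustration; only the column names and the comment='#' kwarg come from the problem statement. For the real data set, you would pass the path of the file you downloaded.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Toy stand-in for the real file: header comments followed by CSV data.
# The file name and the values are hypothetical, not the real data.
csv_text = (
    "# Header comment line 1\n"
    "# Header comment line 2\n"
    "frame,beetle_treatment,ID,bodypart,x_coord,y_coord,likelihood\n"
    "0,dalotia,0,head,73.1,193.8,1.0\n"
    "1,dalotia,0,head,73.9,194.2,0.99\n"
)

# Write the toy CSV into a zip archive holding a single file
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "toy_ant_data.csv.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("toy_ant_data.csv", csv_text)

# Compression is inferred from the .zip extension, and comment='#'
# makes Pandas skip the header comment lines.
df = pd.read_csv(zip_path, comment="#")

print(df.head())
```

Note that Pandas can only read a zip archive directly if it contains exactly one data file.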

To save you from having to unzip and read the comments for the data file, here they are:

# This data set was kindly donated by Julian Wagner from Joe Parker's lab at
# Caltech. In the experiment, an ant and a beetle were placed in a circular
# arena and recorded with video at a frame rate of 28 frames per second.
# The positions of the body parts of the ant are tracked throughout the video
# recording.
#
# The experiment aims to distinguish the ant behavior in the presence of
# a beetle from the genus Sceptobius, which secretes a chemical that modifies
# the behavior of the ant, versus in the presence of a beetle from the species
# Dalotia, which does not.
#
# The data set has the following columns.
#  frame : frame number from the video acquisition
#  beetle_treatment : Either dalotia or sceptobius
#  ID : The unique integer identifier of the ant in the experiment
#  bodypart : The body part being tracked in the experiment. Possible values
#             are head, thorax, abdomen, antenna_left, antenna_right.
#  x_coord : x-coordinate of the body part in units of pixels
#  y_coord : y-coordinate of the body part in units of pixels
#  likelihood : A rating, ranging from zero to one, given by the deep learning
#               algorithm that approximately quantifies confidence that the
#               body part was correctly identified.
#
# The interpixel distance for this experiment was 0.8 millimeters.

Your task in this problem is to extract records of interest out of the tidy data frame containing the data from the experiment, perform calculations on the data, and make informative plots.
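If extracting records from a tidy data frame is new to you, here is a minimal sketch of Boolean indexing on a toy data frame. The column names follow the header comments above, but the values and the variable names (toy, is_thorax, and so on) are invented for illustration. This shows one common approach, not a prescribed one.

```python
import pandas as pd

# Toy data frame with the same column names as the real data set.
# The values are made up for illustration.
toy = pd.DataFrame(
    {
        "frame": [0, 0, 0, 0],
        "beetle_treatment": ["dalotia", "dalotia", "sceptobius", "sceptobius"],
        "ID": [0, 0, 6, 6],
        "bodypart": ["head", "thorax", "head", "thorax"],
        "x_coord": [73.1, 70.2, 120.5, 118.9],
        "y_coord": [193.8, 190.1, 88.0, 90.3],
    }
)

# Build Boolean masks, then combine them with & to select rows
is_thorax = toy["bodypart"] == "thorax"
is_dalotia = toy["beetle_treatment"] == "dalotia"
thorax_dalotia = toy.loc[is_thorax & is_dalotia, :]

# Count how many distinct ants there are for each beetle treatment
ants_per_treatment = toy.groupby("beetle_treatment")["ID"].nunique()

print(thorax_dalotia)
print(ants_per_treatment)
```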

a) The columns x_coord and y_coord give the coordinates of the ant’s body parts in units of pixels. Create a column 'x (mm)' and a column 'y (mm)' in the data frame that have the coordinates in units of millimeters. Also create a column 'time (sec)' that gives the time since recording started in seconds.
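The arithmetic behind these conversions can be sketched on a toy data frame. The interpixel distance (0.8 mm) and frame rate (28 frames per second) are taken from the header comments; the data values and the variable names mm_per_pixel and frames_per_sec are made up for illustration. The sketch takes the column name 'time (sec)' at face value, so time is computed in seconds, and it assumes frame numbers start at zero when recording starts.

```python
import pandas as pd

# Conversion factors stated in the data file's header comments
mm_per_pixel = 0.8
frames_per_sec = 28

# Toy data with made-up values; the real data frame has more columns
toy = pd.DataFrame(
    {
        "frame": [0, 14, 28],
        "x_coord": [10.0, 20.0, 30.0],
        "y_coord": [5.0, 0.0, 2.5],
    }
)

# pixels * (mm / pixel) = mm
toy["x (mm)"] = toy["x_coord"] * mm_per_pixel
toy["y (mm)"] = toy["y_coord"] * mm_per_pixel

# frames / (frames / sec) = sec (assumes frame 0 is the start of recording)
toy["time (sec)"] = toy["frame"] / frames_per_sec

print(toy)
```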

b) Make a plot displaying the position over time of the thorax of an ant or ants placed in an arena with a Dalotia beetle and the position over time of the thorax of an ant or ants placed with a Sceptobius beetle. I am intentionally not giving more specification for your plot. You need to make decisions about how to effectively extract and display the data. Think carefully about your visualizations. This is in many ways how you let your data speak.

c) From this quick graphical exploratory analysis, what would you say about the relative activities of ants with Dalotia versus Sceptobius rove beetles?