Meetings
We have weekly lectures Wednesday mornings from 10-10:50 AM in 100 Broad. On Mondays, you may attend one of the two lab sessions, either 1-4pm or 7-10pm, both in 328 SFL. Bring your laptop and charger. You should always attend the same lab session (either 1pm or 7pm), unless you have a conflict and let the course instructors know.
Attendance at lecture and participation in lab sessions are mandatory; 20% of your grade depends on them.
Lab sessions
In each three-hour lab session, you are expected to actively work through the tutorials with the instructors. As such, you will be writing live code in a .py file. This file must be emailed to bebi103 at caltech dot edu at the end of each lab session to get credit for attendance in the lab session. The subject line of the email should be "lastname firstname tutorial # yy/mm/dd", where the # sign is replaced by the lab session number.
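The subject line format above can be sketched in a few lines of Python. This is only an illustration: the function name `lab_email_subject` and the example name and date are hypothetical, and the sketch assumes the date is the day of the lab session, written with two-digit year, month, and day.

```python
from datetime import date


def lab_email_subject(last_name, first_name, session_number, session_date):
    """Build a subject line of the form "lastname firstname tutorial # yy/mm/dd".

    session_number stands in for the "#" in the template.
    """
    # %y/%m/%d gives a two-digit year, month, and day separated by slashes.
    date_str = session_date.strftime("%y/%m/%d")
    return f"{last_name} {first_name} tutorial {session_number} {date_str}"


# Hypothetical student, third lab session, held on 12 October 2015.
print(lab_email_subject("doe", "jane", 3, date(2015, 10, 12)))
# prints "doe jane tutorial 3 15/10/12"
```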
Homework
Homework will be assigned roughly weekly. The homework assignments will typically have one or two short questions about the theory behind the analysis featured in the problem set, but will mainly consist of actually working up real (and in some rare cases contrived) data.
Data analysis is almost always a collaborative effort in both research and industry. Therefore, you will be assigned to groups of three (possibly with a couple of groups of four, depending on course enrollment). You will submit your homework as a group. The following homework policies apply.
- Each homework will have a defined due date and time, usually at the beginning of class. Late homeworks are accepted with a 10% deduction for every day they are late. For example, if the homework is due at 1pm on Monday, and you turn it in late but before 1pm on Tuesday, your maximum score will be 90%. You may not work on late homework during class.
- Each homework problem must be submitted as a single Jupyter notebook and also as the notebook converted to HTML. Images from plots should be shown as high resolution PNGs. The file names must be group#_hw#_prob#.ipynb and group#_hw#_prob#.html, where the first # symbol is replaced by your assigned group number. The subsequent # symbols are replaced with the obvious values. The notebooks comprising the homework should be zipped into a single file named group#_hw#.zip. Your homework must be emailed to bebi103 at caltech dot edu ahead of the due time. The subject line of the email should be the same as the file name, minus the suffix.
- All code you wrote to do your assignment must be included in the notebook. Code from imported packages that you did not write (e.g., modules distributed by the class instructors) need not be displayed in the notebook. We will run the code in your notebook; all code must run to get credit.
- Since we are running your code to check it, you must have the following path structure for the data used in the homework. Within the directory of your Jupyter notebook, you should have a subdirectory called data. This directory contains any data files downloaded for the homework with the file names unaltered. If the data were distributed in a ZIP file, the file must have been unzipped in the data subdirectory with the file names unaltered.
- All of your results must be clearly explained and all graphics clearly presented and embedded in the Jupyter notebook.
- Any mathematics in your homework must render clearly and properly with MathJax. This essentially means that your equations must be written in correct LaTeX.
- You should also include attribution in your homework submission: who in the group did what. While different people in the group may do different parts of the homework, I encourage you to work together on all parts of the homework. At the very least, you personally must understand all of the steps taken in the homework solutions and be able to repeat them by yourself.
- Where appropriate, you need to give detailed discussion of analysis choices you have made. As an example, you may choose to use wavelets to denoise temporal data instead of a Gaussian filter. You need to justify that choice.
- There is seldom a single right way to analyze a set of data. You are encouraged to try different approaches to analysis. If you perform an analysis and find problems with it, clearly write what the problems are and how they came about. Even if your analysis does not completely work, but you demonstrate that you thought carefully about it and understand its difficulties, you will get nearly full credit.
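The file naming, zipping, and data directory rules in the list above can be sketched in Python. This is a rough helper under stated assumptions, not an official submission tool: all function names are hypothetical, the sketch assumes the # placeholders are filled with plain integers (e.g. group3, not group03), and it assumes both the .ipynb and the .html files go into the ZIP file.

```python
import zipfile
from pathlib import Path


def problem_file_names(group, hw, prob):
    """Return the notebook and HTML file names for one homework problem."""
    stem = f"group{group}_hw{hw}_prob{prob}"
    return f"{stem}.ipynb", f"{stem}.html"


def zip_name(group, hw):
    """Return the name of the ZIP file for a whole homework."""
    return f"group{group}_hw{hw}.zip"


def email_subject(group, hw):
    """The subject line is the ZIP file name minus the suffix."""
    return Path(zip_name(group, hw)).stem


def data_path(file_name):
    """Relative path to a data file in the data subdirectory next to the notebook."""
    return Path("data") / file_name


def build_submission(directory, group, hw, probs):
    """Zip the notebook and HTML files for the listed problem numbers.

    Raises FileNotFoundError if any expected file is missing.
    Assumption: HTML files are included in the archive alongside the notebooks.
    """
    directory = Path(directory)
    names = [n for p in probs for n in problem_file_names(group, hw, p)]
    missing = [n for n in names if not (directory / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")

    archive = directory / zip_name(group, hw)
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for n in names:
            # arcname keeps the archive flat, with the file names unaltered.
            zf.write(directory / n, arcname=n)
    return archive
```

For example, `build_submission(".", 3, 2, [1, 2])` would look for group3_hw2_prob1.ipynb, group3_hw2_prob1.html, group3_hw2_prob2.ipynb, and group3_hw2_prob2.html in the current directory and write group3_hw2.zip. Inside a notebook, reading data through a relative path such as `data_path("some_file.csv")` (a made-up file name) keeps the code runnable on any machine with the same directory layout.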
Grading
80% of your grade is determined from homework. Everyone in your group will get the same grade on the homework.
20% of your grade is determined from participation in the lab sessions. You are expected to give the tutorials your full attention and to work together with the course instructors and fellow students as we go through them.
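The late-homework deduction described under Homework and the 80/20 split above amount to simple arithmetic, sketched below. The function names are hypothetical, and the sketch makes several assumptions the policy does not spell out: every started 24-hour period counts as a full day late, the deduction is linear (10 percentage points per day, not compounding), the maximum score never drops below zero, and the homework and participation components are each already expressed as an overall percentage.

```python
import math


def max_late_score_percent(hours_late):
    """Maximum achievable homework score (in percent) given hours past the deadline.

    Assumption: each started 24-hour period costs 10 percentage points.
    """
    if hours_late <= 0:
        return 100
    days_late = math.ceil(hours_late / 24)
    return max(0, 100 - 10 * days_late)


def course_grade_percent(homework_percent, participation_percent):
    """Weight homework at 80% and lab participation at 20%."""
    return (80 * homework_percent + 20 * participation_percent) / 100


# Due at 1pm Monday, turned in 12 hours late (before 1pm Tuesday).
print(max_late_score_percent(12))  # prints 90
```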
Collaboration policy and Honor Code
Most importantly, much of the data we will use in this course is unpublished, generously given to us by researchers both from Caltech and from other institutions. They have given us their data in good faith that it will be used only in this class. It is therefore imperative that you do not disseminate the data sets anywhere outside of this class.
Since the homework is done in assigned groups, you obviously should collaborate heavily with the other members of your group. You are free to discuss the homework with other groups, including via Piazza, but the work you submit must be the work of your own group.
You may not consult solutions of homework problems from previous editions of this course.
You are free to consult references, literature, websites, blogs, etc., outside of the materials presented in class (the obvious exception being homework solutions from previous editions of this course). In fact, you are encouraged to do so. If you do, you must properly cite the sources in your homework. Be warned: doing homework by Google fishing will not work! The problems are too open ended and the techniques are too varied.
Excused absences/extensions
Under certain circumstances, missed lab or lecture sessions will be excused and extensions given on the homework. The reasons for the excuses or extensions must be compelling, such as health or family issues. They must be requested from the course instructor.
Course communications
You are free to contact the course staff at any time, but we encourage you to use the class Piazza page for questions about course topics and homework. Most of our mass communication with you will be through Piazza, so be sure to set your Piazza account to give you email alerts if necessary.