ADVANCED TOPICS IN BIOINFORMATICS

Course: BBS 741, Spring 2015

Instructors: Zhiping Weng, Thom Vreven

Syllabus: This course covers key topics in modern bioinformatics and computational biology. It is aimed not only at students specializing in bioinformatics but also at experimental students who would like to use bioinformatics tools in their daily research. The class starts with a primer on probability and statistics, then covers a broad range of machine learning techniques essential to modern bioinformatics, including linear regression, logistic regression, neural networks, random forests, support vector machines, Markov and hidden Markov models, and Bayesian networks. Topics at the intersection of biology and machine learning are also covered, including a guest lecture by Prof. Elinor Karlsson on genome-wide association studies (GWAS). The course includes seventeen lectures and homework assignments. Reading and online teaching materials are assigned prior to each lecture. The homework is programming-based and designed both to reinforce concepts discussed in lecture and to introduce students to working with real biological data. Some experience with programming and statistics is desirable. Homework assignments are programmed primarily in Python.

Lecture Sequence:

  • Basic concepts of probability
  • Discrete random variables
  • Continuous random variables
  • Hypothesis testing
  • Introduction to machine learning
  • Clustering and self-organizing maps
  • Principal component analysis
  • Linear regression and regularization
  • Genome-wide association studies
  • Logistic regression and regularization
  • Study design
  • Neural networks
  • Decision trees, bagging, and random forests
  • Support vector machines
  • Expectation maximization
  • Markov models
  • Hidden Markov models
  • Bayesian modeling
  • Bayesian networks
  • Introduction to probabilistic graphical models

Recommended Textbooks:

  • Pattern Recognition and Machine Learning by Christopher Bishop
  • An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
  • Learning From Data by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin