Course Information

Course: STAT:7400 Computer Intensive Statistics
Semester: Spring 2019
Lectures: MWF 11:30AM - 12:20PM
Room: Schaeffer 74
Instructor: Luke Tierney, Schaeffer 209, luke-tierney@uiowa.edu
Office Hours: 10:30 - 11:20 or by appointment

Outline

The goal of this course is to develop skills, knowledge, and tools useful in applying modern computationally intensive statistical methods to research in any field. Topics will be selected from random variate generation, design and analysis of simulation experiments, optimization algorithms for model fitting, bootstrap, Markov chain Monte Carlo, smoothing, machine learning and data mining, parallel computing, data technologies, and graphical methods. Most topics will be presented in the context of the R statistical computing language.

Prerequisites

The prerequisites for this course are STAT:5200 or BIOS:5610 and proficiency in Fortran, C, C++, or Java. These prerequisites imply a basic familiarity with mathematical statistics and with R.

Reading and Homework

Homework assignments consisting of a mix of computational and theoretical problems will be given roughly every week. Some problems will cover material not addressed in class and may require additional reading. Assignments will be posted on the class web site. Suggested reading will also be posted on the class web site when appropriate, but you should also seek out and explore relevant references on your own. Assignments will need to be submitted electronically. Many students find that these assignments take a long time to complete, so plan your time accordingly.

Class Project

Students registered for this class are expected to complete a class project. You can work on this project on your own or in a group of up to three students. Your project should represent about 20 hours of work on a topic of your choice that involves computation. You should start to think about the topic as soon as possible. You might investigate properties of a methodology you find interesting, you might compare several methods on a variety of problems, or you might analyze an interesting data set using methodology related to ideas introduced in the class. There are many possible choices for the topic of your project, and identifying a suitable topic is an important part of your task. The project should represent new work, not something you have done for another course or as part of your thesis.

A proposal for your project is due on Monday, March 25. The proposal should be at most two pages long. A final report on your project is due on Friday, May 3. The report should be three to five pages in length, excluding any appendices you wish to attach, and must be submitted electronically. Your project may be shared with the class through the class web page.

Grading

The course grade will be based on assignments and the class project. You may discuss general issues and approaches with your fellow students, but your work must be your own. If you use any references, including solutions to similar problems prepared by other students, you must cite and credit your sources.

EMail and World Wide Web

Announcements on changes or clarifications of assignments or other matters may be sent by email to your university email account or posted on the class web page. You should check the class home page and your email regularly.

Recommended Texts

 Geof H. Givens, Jennifer A. Hoeting (2005). Computational Statistics, Wiley-Interscience.
 Norman Matloff (2011). The Art of R Programming: A Tour of Statistical Software Design, No Starch Press.
 John Monahan (2011). Numerical Methods of Statistics, 2nd Edition, Cambridge University Press.

Accommodation for Disabilities

I would like to hear from anyone who has a disability that may require seating modifications, testing accommodations, or accommodations of other class requirements, so that appropriate arrangements may be made. Please contact me during my office hours.

Other References

 John Chambers (2008). Software for Data Analysis: Programming with R, Springer-Verlag.
 Dani Gamerman and Hedibert Lopes (2006). Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2nd edition, CRC Press.
 James E. Gentle (2002). Elements of Computational Statistics, Springer Verlag.
 James E. Gentle (2003). Random Number Generation and Monte Carlo Methods, 2nd edition, Springer Verlag.
 T. Hastie, R. Tibshirani, J. H. Friedman (2009). The Elements of Statistical Learning, 2nd Edition, Springer Verlag.
 Wolfgang Hörmann, Josef Leydold, and Gerhard Derflinger (2003). Automatic Nonuniform Random Variate Generation, Springer Verlag.
 Kenneth Lange (1999). Numerical Analysis for Statisticians, Springer Verlag.
 Paul Murrell (2011). R Graphics, 2nd Edition, Chapman & Hall/CRC.
 Paul Murrell (2009). Introduction to Data Technologies, Chapman & Hall/CRC.
 Brian D. Ripley (1987). Stochastic Simulation, John Wiley & Sons.
 Brian D. Ripley (1996). Pattern Recognition and Neural Networks, Cambridge University Press.
 Christian P. Robert, George Casella (2010). Monte Carlo Statistical Methods, 2nd edition, Springer Verlag.
 William N. Venables, Brian D. Ripley (2000). S Programming, Springer Verlag.
 William N. Venables and Brian D. Ripley (2002). Modern Applied Statistics with S, 4th edition, Springer Verlag.


