Stat 850: Computing Tools for Statisticians

Introduction to statistical computing packages and document preparation software. Topics include graphical techniques, data management, Monte Carlo simulation, dynamic document preparation, and presentation software.

Goals

  1. Learn how to use R and/or Python for data analysis, data processing, data visualization, and statistical simulation.
  2. Become familiar with the process, techniques, and goals of exploratory data analysis.
  3. Create, assess, and debug code effectively.
    1. Use online resources to find software to perform a task, comparing approaches taken by competing programs.
    2. Read error messages, find related problems in online forums, and isolate the conditions necessary to generate the error.
    3. Generate minimal working examples or reproducible examples of errors in order to ask for help effectively.
  4. Communicate statistical results using reproducible, dynamic tools. Understand the importance of reproducibility in scientific computation.

Objectives

(what you should be able to do at the end of this course)

  1. Clean and format data appropriately for the intended analysis or visualization method. (Goals: 1)

  2. Explore a data set using numerical and visual summaries, developing questions which can be answered using statistics. (Goals: 1, 2)

  3. Evaluate methods or software to assess relevance to a problem. Compare similar options to determine which are more appropriate for a given application. (Goals: 1, 3)

  4. Test and debug software, using the following sequence: (Goals: 3, 4)

    1. Reproduce the error in a new environment,
    2. Create a minimal reproducible example,
    3. Research the error message and evaluate online resources for relevance,
    4. Ask for help, describing the error or problem appropriately.
  5. Document the data, methods, and results of an analysis using reproducible methods. (Goals: 1, 2, 4)

  6. Construct a reproducible statistical simulation for a given problem using methods such as MCMC, inverse probability sampling, and rejection sampling. (Goals: 1, 3, 4)
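
To give a sense of what objective 6 looks like in practice, here is a minimal Python sketch of two of the methods it names. It is an illustration under stated assumptions, not course material: the target distributions (Exponential with rate 2, and Beta(2, 2)), the function names, the sample size, and the seed value 850 are all chosen here for the example. Inverse probability sampling (often called inverse transform sampling) pushes uniform draws through the inverse CDF; rejection sampling proposes candidates and keeps each one with probability proportional to the target density. The fixed seed is what makes the simulation reproducible.

```python
import math
import random


def sample_exponential(rate, n, rng):
    """Inverse transform sampling: X = -ln(1 - U) / rate for U ~ Uniform(0, 1)."""
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]


def sample_beta22(n, rng):
    """Rejection sampling for Beta(2, 2), with density f(x) = 6x(1 - x) on [0, 1].

    Proposal: Uniform(0, 1). Envelope constant: 1.5, the maximum of f.
    """
    max_density = 1.5
    draws = []
    while len(draws) < n:
        x = rng.random()  # candidate drawn from the proposal
        u = rng.random()  # uniform draw for the accept/reject step
        if u * max_density <= 6.0 * x * (1.0 - x):
            draws.append(x)
    return draws


# Fixed seed (an arbitrary choice) so that reruns give identical draws.
rng = random.Random(850)
exp_draws = sample_exponential(rate=2.0, n=20_000, rng=rng)
beta_draws = sample_beta22(n=20_000, rng=rng)

exp_mean = sum(exp_draws) / len(exp_draws)
beta_mean = sum(beta_draws) / len(beta_draws)
print(f"Exponential(2) sample mean: {exp_mean:.3f} (theoretical mean 0.5)")
print(f"Beta(2, 2) sample mean: {beta_mean:.3f} (theoretical mean 0.5)")
```

Passing the generator object `rng` into each function, rather than relying on global random state, keeps every source of randomness explicit, which makes the simulation easier to rerun and to debug.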