# Whamit!

The Weekly Newsletter of MIT Linguistics

## This Week: IAP Class on Data Analysis

Mike Frank and Ed Vul
Statistics and Visualization for Data Analysis and Inference

Mon Jan 26 through Fri Jan 30, 1:00 – 3:00PM
Room 46-3189 Mon through Thu (1/26 – 1/29); Room 46-3310 on Fri (1/30)
http://stellar.mit.edu/S/course/9/ia09/9.savfdaai/index.html

Description:

A whirlwind tour of the statistics used in behavioral science research, covering topics including data visualization, building your own null-hypothesis distribution through permutation, useful parametric distributions, the generalized linear model, and model-based analyses more generally. Familiarity with Matlab, Octave, or R will be useful; prior experience with statistics will be helpful but is not essential. This course is intended to be a ground-up sketch of a coherent alternative to the “null-hypothesis significance testing” method for behavioral research (but don’t worry if you don’t know what this means).

Course Outline:

Visualization. Creating a visualization to understand experimental results. Simple univariate displays. Conventional multivariate displays. The repertoire of visual variables. Introduction of examples to be used throughout the course: simple behavioral experiments, complex behavioral experiments, and eye-tracking.

Permutation. Understanding what would have happened “by chance” through non-parametric tests, confidence bounds, and measures of effect size. Discussion of null-hypothesis significance testing and its limitations.
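
For readers curious what "building your own null-hypothesis distribution through permutation" can look like, here is a minimal sketch in Python. It is not course material: the function name `permutation_test` and the reaction-time numbers are invented for illustration, and the course itself works in Matlab, Octave, or R.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled data simulates "no real group difference".
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up reaction times in milliseconds, not real data.
condition_a = [512, 480, 530, 495, 508, 521]
condition_b = [455, 470, 448, 462, 490, 451]
observed_diff, p = permutation_test(condition_a, condition_b)
print(f"difference in means: {observed_diff:.1f} ms, p = {p:.4f}")
```

The p-value here is simply the fraction of shuffled datasets whose difference in means is at least as large as the one observed.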

Distribution. Understanding the spread of data. Inferring parametric forms (Binomial, Gaussian, Poisson, etc.) as a convenient way of describing the structure of data. Effect size and Bayesian derivation of tests for parametric distributions, including the binomial test, t-test, Cohen’s d, etc.
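
Two of the quantities named above are short enough to sketch in Python. This is an illustration under stated assumptions, not the course's own code: `cohens_d` uses the common pooled-standard-deviation definition, and `binomial_test` is an exact two-sided test that sums the probability of every outcome no more likely than the observed one. Both function names and all example numbers are hypothetical.

```python
from math import comb, sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def binomial_test(successes, trials, p_null=0.5):
    """Exact two-sided binomial test against a null success rate."""

    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Small tolerance so equally likely outcomes are not lost to rounding.
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )


# Made-up numbers for illustration.
d = cohens_d([2, 4, 6], [1, 3, 5])
p = binomial_test(successes=9, trials=10)
print(f"Cohen's d = {d:.2f}, binomial p = {p:.4f}")
```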

Models of Data 1: The Linear Model. What is a “model of data”? Basic assumptions of the linear model. The standard and generalized linear model and relationship to ANOVA. Bayesian derivation of the LM. Link functions and logistic regression. Effect size in a linear model. Introduction to multilevel models.
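
As a taste of the simplest case, the sketch below fits a one-predictor linear model by ordinary least squares in plain Python. It is an assumption-laden illustration rather than anything from the course: `fit_line` is a hypothetical helper, the data are invented, and the generalized and multilevel models in the outline need more machinery than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```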

Models of Data 2: Bayesian Models. Constructing and testing more complex models of data. Bayesian models as a tool for creating models with complex task assumptions. Brief introduction to basic techniques for Bayesian inference.
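
One basic technique for Bayesian inference is grid approximation: evaluate prior times likelihood on a grid of parameter values and normalize. The sketch below does this for a success rate under a uniform prior. It is only an illustration and not necessarily a technique the course covers; `grid_posterior` and the counts are hypothetical.

```python
def grid_posterior(successes, trials, grid_size=1001):
    """Grid approximation to the posterior over a binomial success rate.

    Assumes a uniform prior, so the posterior is proportional to the
    binomial likelihood at each grid point.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalized = [
        theta**successes * (1 - theta) ** (trials - successes)
        for theta in grid
    ]
    total = sum(unnormalized)
    posterior = [weight / total for weight in unnormalized]
    return grid, posterior


# Made-up counts: 9 successes in 10 trials.
grid, posterior = grid_posterior(successes=9, trials=10)
posterior_mean = sum(theta * prob for theta, prob in zip(grid, posterior))
print(f"posterior mean success rate: {posterior_mean:.3f}")
```

With a uniform prior the exact posterior is a Beta distribution, so the grid result can be checked against the closed-form mean (successes + 1) / (trials + 2).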