ASA Traveling Course:       

May 12, 2006
A Presentation by Frank Harrell, Jr., Ph.D.


    All standard regression models have assumptions that must be verified for the model to have power to test hypotheses and to predict accurately. Of the principal assumptions (linearity, additivity, distributional), this course will emphasize methods for assessing and satisfying the first two. Practical but powerful tools are presented for validating model assumptions and presenting model results.
    This course provides methods for estimating the shape of the relationship between predictors and response using the widely applicable method of augmenting the design matrix using restricted cubic splines.  Even when assumptions are satisfied, overfitting can ruin a model’s predictive ability for future observations.   Methods for data reduction will be introduced to deal with the common case where the number of potential predictors is large in comparison with the number of observations.
    Methods of model validation (bootstrap and cross-validation) will be covered, as will auxiliary topics such as modeling interaction surfaces, efficiently utilizing partial covariable data by using multiple imputation, variable selection, overly influential observations, collinearity, and shrinkage. The methods covered will apply to almost any regression model, including ordinary least squares, logistic regression models, and survival models.
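
    As a rough illustration of the design-matrix augmentation described above, the following Python sketch expands one continuous predictor into restricted cubic spline columns using the truncated power form. It is a minimal sketch, not the course's own software: the function names rcs_basis and augment_design_matrix, the knot locations, and the scaling by the squared knot range are assumptions made for illustration.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline columns for a single value x.

    Hypothetical helper: returns one linear column plus k - 2
    nonlinear columns for k knots (truncated power form).
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least 3 knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]           # distance between the last two knots
    scale = (t[-1] - t[0]) ** 2    # assumed scaling, keeps columns near the scale of x
    cols = [float(x)]              # the linear term
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / span)
        cols.append(term / scale)
    return cols


def augment_design_matrix(xs, knots):
    """Replace a single predictor column by its spline columns."""
    return [rcs_basis(x, knots) for x in xs]


if __name__ == "__main__":
    # Assumed knot locations, chosen only for the example.
    for row in augment_design_matrix([0.5, 3.0, 8.0], [1.0, 2.0, 4.0, 7.0]):
        print(row)
```

    With k knots the predictor contributes k - 1 columns (one linear, k - 2 nonlinear), and the cubic and quadratic terms cancel beyond the last knot, so the fitted curve is linear in both tails.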


1. Hypothesis Testing vs. Estimation vs. Prediction
2. How Many Degrees of Freedom does a Data Mining Procedure Actually Have?
3. Regression Model Notation
4. Model Formulations
5. Interpreting Model Parameters
        (a) Nominal Predictors
        (b) Interactions
6. Relaxing Linearity Assumption for Continuous Predictors


A copy of the speaker's slides will be made available to the course participants. The textbook is Regression Modeling Strategies, Springer-Verlag, New York, 2001 (list price $99.00; sale price $77.70 for the Jan. 2006 hardcover printing). It is not necessary to purchase the textbook to take the course.

    Dr Harrell, Professor of Biostatistics and department chair at the Vanderbilt University School of Medicine, received his PhD in Biostatistics from the University of North Carolina in 1979. He is managing editor of the journal Health Services and Outcomes Research Methodology, an Associate Editor of Statistics in Medicine, a member of the Editorial Boards of American Heart Journal and Journal of Clinical Epidemiology, and a consultant to the FDA.
    As reflected in his 171 peer-reviewed publications, Dr Harrell has devoted his career to the study of patient outcomes in general and specifically to the development of accurate prognostic and diagnostic models and models for many other patient responses. His primary methodologic research relates to development of reliable statistical models, quantifying predictive accuracy, modeling strategies utilizing data reduction methods, estimating covariable transformations, model validation methods, penalized estimation, and missing data imputation.
    He has researched methods to estimate how continuous predictors relate to outcomes without assuming linearity, showing the advantages of piecewise cubic polynomials or spline functions.
    His new area of emphasis is pharmaceutical safety, related to developing better ways to present safety information to data monitoring committees and developing new methods for pharmaceutical researchers to explore clinical chemistry, hematology, adverse events, and ECG data in Phase II and III randomized clinical trials.





Karry Roberts giving Frank Harrell, Jr. a Certificate of Appreciation from the Detroit Chapter


Lance Heilbrun giving Frank Harrell, Jr. a copy of "The Detroit Almanac" from the Chapter