“REGRESSION MODELING STRATEGIES”
May 12, 2006
A Presentation by Frank Harrell, Jr., Ph.D.
All standard regression models have assumptions that
must be verified for the model to have power to test hypotheses and for
it to be able to predict accurately. Of the principal
assumptions (linearity, additivity, distributional), this course will
emphasize methods for assessing and satisfying the first two.
Practical but powerful tools are presented for validating model
assumptions and presenting model results.
This course provides methods for estimating the
shape of the relationship between predictors and response using the
widely applicable method of augmenting the design matrix with
restricted cubic splines. Even when assumptions are satisfied,
overfitting can ruin a model’s predictive ability for future
observations. Methods for data reduction will be introduced
to deal with the common case where the number of potential predictors
is large in comparison with the number of observations.
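As a rough illustration of what "augmenting the design matrix" with restricted cubic splines can look like, the sketch below builds a spline basis with NumPy and fits it by ordinary least squares. It is not taken from the course materials: the function name rcs_basis, the simulated data, and the placement of five knots at fixed quantiles of the predictor are all assumptions made for this example.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis (hypothetical helper for illustration).

    With k knots this returns k - 1 columns: x itself plus k - 2 nonlinear
    terms that are constrained to be linear beyond the outer knots.
    """
    x = np.asarray(x, dtype=float)
    t = np.sort(np.asarray(knots, dtype=float))
    k = len(t)
    if k < 3:
        raise ValueError("need at least 3 knots")

    def pos3(u):
        # Truncated cubic: u^3 where u > 0, otherwise 0.
        return np.maximum(u, 0.0) ** 3

    gap = t[k - 1] - t[k - 2]          # distance between the last two knots
    scale = (t[k - 1] - t[0]) ** 2     # keeps columns on a scale similar to x
    cols = [x]
    for j in range(k - 2):
        term = (pos3(x - t[j])
                - pos3(x - t[k - 2]) * (t[k - 1] - t[j]) / gap
                + pos3(x - t[k - 1]) * (t[k - 2] - t[j]) / gap)
        cols.append(term / scale)
    return np.column_stack(cols)

# Simulated data (an assumption for this sketch): a nonlinear signal plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.3, 200)

# Knots at fixed quantiles of x; the exact quantiles are a choice, not a rule.
knots = np.quantile(x, [0.05, 0.275, 0.5, 0.725, 0.95])

# Augmented design matrix: intercept, linear term, and three spline terms.
X = np.column_stack([np.ones_like(x), rcs_basis(x, knots)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
```

Because the spline terms enter as ordinary columns, the same augmented matrix could in principle be passed to a logistic or survival model fitter; a test that the nonlinear coefficients are all zero is then a test of linearity for that predictor.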
Methods of model validation (bootstrap and
cross-validation) will be covered, as will auxiliary topics such as
modeling interaction surfaces, efficiently utilizing partial covariable
data by using multiple imputation, variable selection, overly
influential observations, collinearity, and shrinkage. The
methods covered will apply to almost any regression model, including
ordinary least squares, logistic regression models, and survival
1. Hypothesis Testing vs. Estimation vs. Prediction
2. How Many Degrees of Freedom does a Data Mining Procedure Actually
3. Regression Model Notation
4. Model Formulations
5. Interpreting Model Parameters
(a) Nominal Predictors
6. Relaxing Linearity Assumption for Continuous Predictors
A copy of the speaker's slides will be made available to the course
participants. The textbook is Regression Modeling Strategies,
Springer-Verlag, New York, 2001 (list price $99.00 at
www.amazon.com; sale price $77.70 for the Jan. 2006 hardcover
printing). It is not necessary to purchase the textbook to
take the course.
INFORMATION ABOUT THE SPEAKER
Dr Harrell, Professor of Biostatistics and
department chair at the Vanderbilt University School of Medicine,
received his PhD in Biostatistics from the University of North Carolina
in 1979. He is managing editor of the journal Health
Services and Outcomes Research Methodology, is an Associate Editor of
Statistics in Medicine, is on the Editorial Board of American Heart
Journal and Journal of Clinical Epidemiology, and is a consultant to
As reflected in his 171 peer-reviewed publications,
Dr Harrell has devoted his career to the study of patient outcomes in
general and specifically to the development of accurate prognostic and
diagnostic models and models for many other patient
responses. His primary methodologic research relates to
development of reliable statistical models, quantifying predictive
accuracy, modeling strategies utilizing data reduction methods,
estimating covariable transformations, model validation methods,
penalized estimation, and missing data imputation.
He has researched methods to estimate how continuous
predictors relate to outcomes without assuming linearity, showing the
advantages of piecewise cubic polynomials or spline functions.
His new area of emphasis is pharmaceutical safety,
related to developing better ways to present safety information to data
monitoring committees and developing new methods for pharmaceutical
researchers to explore clinical chemistry, hematology, adverse events,
and ECG data in Phase II and III randomized clinical trials.
Karry Roberts giving Frank Harrell, Jr. a Certificate of Appreciation from the Detroit Chapter
giving Frank Harrell, Jr. a copy of "The Detroit Almanac" from the Chapter