Subject: summary of responses (Re: terminology for variables in regression)
From: Tim Hesterberg
Date: Mon, 02 Dec 2002 13:18:52 -0800
To: isostat@oberlin.edu

(oops; delete previous mail, this version adds one more reply)

Thank you all for your responses to my query on terminology for
variables in regression.  Below is a summary of the responses, which were
overwhelmingly in favor of
	response		instead of dependent
	predictor/explanatory	instead of independent
Two other terms were mentioned as alternatives to response:
	outcome (2 people), criterion (2 people)

Tim Hesterberg


From: David Moore <dsmoore@stat.purdue.edu>

As you note, all my books use ``explanatory variable'' and ``response
variable.''  The comment in IPS (similar in the others) says

	You will often see explanatory variables called {\bf
	independent variables} and response variables called {\bf
	dependent variables}.  The idea behind this language is that
	response variables depend on explanatory variables.  Because
	the words ``independent'' and ``dependent'' have other meanings
	in statistics that are unrelated to the explanatory-response
	distinction, we prefer to avoid those words.

My books are the best-sellers above the two-year-college text level.
Some of the better new books are written by people who have taught
from my texts and follow the same terminology:  Jessica Utts (Seeing
Through Statistics, Mind On Statistics) and Wild and Seber in their new
and good text (Chance Encounters: A First Course in Data Analysis and
Inference).  The Samuels/Witmer ``Statistics for the Life Sciences''
(my favorite biostat book) also uses explanatory/response.

And for what it's worth, the index entry for ``Dependent variable''
in Chambers/Hastie Statistical Models in S says ``See response.''

Both sets of terms are widely used and recognized.  Although change 
comes slowly, I think the clearer terms are spreading in texts,
especially texts that are ``modern'' in flavor and so ought to
appeal to Splus folk.


From: "Allan Rossman" <arossman@calpoly.edu>

I prefer the terminology that you suggest.  That's what Workshop Statistics
uses.


From: Cyndy Long <LONG_C@palmer.edu>

I often use "outcome" variable (typically teaching health professionals),
sometimes "response" variable. I prefer "explanatory" variable to
"predictor" variable. And, I certainly advocate for the elimination of the
terms "dependent" and "independent" in this context.


From: Johanna Hardin <Jo.Hardin@pomona.edu>

I totally agree.  I use explanatory and response instead of dependent /
independent.  The book "The Statistical Sleuth" uses the newer terms as do
Moore's books and Jessica Utts' books.  The terms are much cleaner and the
students understand them better.


From: "Steve C. Wang" <scwang@swarthmore.edu>

> I'm fighting a battle here at Insightful (S-PLUS) to avoid the terminology
> 	dependent variable
> 	independent variables
> for variables in a regression, because "dependent" and "independent"
> have other meanings in statistics.  Note that "independent" variables
> need not be independent (and almost never are).  To me this terminology
> is needlessly confusing, particularly for students.

I dislike this terminology, for exactly the reasons you cite.


> I'm pushing for
> 	response variable
> 	explanatory variables  or  predictor variables
>
> What terminology do you use with your students?

I usually use "response variable" and "predictor". Or just Y and X.


> What terminology do the books you use prefer (what books)?

I use Moore and McCabe, so response/explanatory.


From: Brian Jersky <jersky@SONOMA.EDU>

Most biostats books (e.g. Pagano and Gauvreau) I've seen use predictor/response,
as do I.


From: "Katherine Halvorsen" <Khalvors@email.smith.edu>

I'm using Moore and McCabe and response/predictor terminology.


From: "Douglas M. Andrews" <dandrews@wittenberg.edu>

> What terminology do you use with your students?

Response/explanatory.

> What terminology do the books you use prefer (what books)?

I use Moore's "Basic Practice of Statistics" and Rossman's "Workshop 
Statistics" -- two of the stat ed standards -- and they both use 
response/explanatory.

P.S.  Another reason to eschew the dep/indep lingo: Referring to Y as the 
"dependent" variable suggests that Y depends on X, which for many people 
implies a causal association, which of course need not be the case, even if 
there's a strong association.


From: Bharath <rbharath@colby.edu>

I use response or predictand for Y and predictor(s) or control
variables for X's.

The book I used last semester, Kachigan: Statistical Analysis uses the
terminology of predictor variables and very explicitly discusses the
confusion caused by talking of "independent variables" and uses the
latter term only to refer to variables which do not covary.  For Y,
Kachigan uses the term "criterion variable".


From: "Christopher J. Lacke" <lacke@rowan.edu>

The following use "explanatory" and "response":

Ramsey and Schafer
Samuels and Witmer
Schork and Remington
Weiss
Peck, Olsen, and Devore

I also took a cursory glance at some regression books, where I saw both
terminologies used.


From: "Lachenbruch, Peter" <lachenbruch@cber.FDA.gov>

I prefer response and predictor.  For many years, I would get confused about
independent and dependent variables (maybe that's the wrong confession to
make...)


From: Karla Ballman <Ballman.Karla@mayo.edu>

Although I am not teaching undergraduates any more, I am 
still teaching stats courses through the Mayo Graduate school.
I agree with you about avoiding the terms independent and dependent
variables. Both in my teaching and in my other work with clinicians,
we never use these terms (these just are not meaningful to the
MDs).

We use the terms response and outcome for Y and
explanatory and predictor for X. To be honest, I prefer the terms
outcome (Y) and explanatory (X). Actually, for the Y variable,
I really am indifferent between the terms response and outcome. I tend
to favor outcome since in our setting, that is generally what
the Y variable is measuring.

For the X variable I tend to avoid the use of predictor. The reason
is that many of our studies are observational and as such, we
can only hope to establish association. However, the MDs always
try to push the interpretations of the results more towards
causation. For some reason, the use of predictor to describe the X
variables makes them want to say that knowing X you can predict Y and
make the immediate leap to causation. They tend not to do this
as much when I refer to the Xs as explanatory.


From: Albyn Jones <jones@reed.edu>

I use "response" and "explanatory", and explicitly discourage
the use of "independent" and "dependent" for exactly the reason
you give.  

A quickie pseudo-random sample of texts within reach yields:

Weisberg "Applied Linear Regression" 2nd ed
   uses "response" and "predictor"

Hastie and Tibshirani (GAMs) 
   use "response" and "predictor"

Ramsey & Schafer ("the statistical sleuth") 
   use "response" and "explanatory"

McCullagh & Nelder (1st ed) often use "covariate" for `X'.

Venables and Ripley (MASS) tend to use "response" and "explanatory"
but not exclusively.  I found a passage where the text reads
"the response (dependent variable) is..."


From: jeff witmer <jeff.witmer@oberlin.edu>

The terminology I prefer is exactly what you are pushing for:

	response variable
	explanatory variables  or  predictor variables

This is what I use with my students and what appears in the book I use (but 
since I wrote the book, that should only count as one vote, not two!)


From: "Annette Gourgey" <statsense@rcn.com>

I come from educational measurement and we always used predictor and
criterion, to emphasize that there isn't necessarily causation.  I learned
from Pedhazur's Multiple Regression in Behavioral Research.  I use these
terms in my business stats classes even though our text (Statistics for
Managers by Levine et al.) uses independent and dependent, and I explain why I
use them.