Multiple Regression/Correlation Lab
Multiple regression is a statistical procedure that elaborates on the correlation coefficient (r), which corresponds to the degree to which two continuous variables are related. Multiple regression, more specifically, pertains to the situation in which you are trying to predict scores on some continuous outcome variable from multiple continuous predictor variables. The analysis provides an assessment of (a) how well your set of predictor variables accounts for variability in the outcome variable, (b) whether the overall amount of variability accounted for in the outcome variable is significantly greater than would be expected by chance, (c) how much unique contribution each predictor variable makes toward explaining variability in the outcome variable, and (d) whether the amount of unique variability accounted for by each predictor variable is significantly greater than would be expected by chance.
General steps involved in conducting a multiple regression analysis include first examining the (zero-order; i.e., regular) intercorrelations among all variables and then conducting the multiple regression analysis. Consider the following example:
Peter and Paul are roommates. Peter is a health nut – in addition to eating well, he lifts at the gym several times a week, runs 30 miles each week, and goes on long hikes in the mountains several times a month. Peter believes that his exercise regimen helps keep him happy and that he would be miserable if he did not exercise.
Paul, on the other hand, is something of a sloth. He never exercises and claims that exercise is a futile and ridiculous endeavor. Further, Paul has a nasty drug problem. Paul believes that illegal drugs make him happy, and that he would be unhappy if he stopped taking them. One semester, Peter and Paul find themselves taking Research Methods together (with the same really cool professor!). They decide to settle their differences once and for all empirically. They design a study to address how related exercise and drug use are to overall happiness in life.
To conduct this research, they find 20 regular college students. Each participant reports (a) how many hours/week he or she exercises, (b) how many hours/week he or she is “high” on illegal drugs, and (c) how happy he or she is (on a 1 – 10 scale). You’ll find those data in the file “regress.sav.”
After collecting these data, Peter and Paul first compute zero-order correlations to examine the general intercorrelations among these three variables.
To compute such correlations, do the following:
1. Click on Analyze on the toolbar.
2. Click on “correlate”
3. Click on “bivariate”
4. Variables would be “exercise,” “drugs,” and “happy.”
5. Click paste
6. Go to the .sps file and highlight the relevant commands.
7. Click on “run.”
8. Run selection.
Your syntax file will look about like so:
* zero-order correlations for regression lab.
CORRELATIONS
/VARIABLES=exercise drugs happy
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE .
You will obtain output that looks like the following:
Variable | Statistic | EXERCISE | DRUGS | HAPPY |
---|---|---|---|---|
EXERCISE | Pearson Correlation | 1.000 | -.922(**) | .855(**) |
EXERCISE | Sig. (2-tailed) | . | .000 | .000 |
EXERCISE | N | 20 | 20 | 20 |
DRUGS | Pearson Correlation | -.922(**) | 1.000 | -.795(**) |
DRUGS | Sig. (2-tailed) | .000 | . | .000 |
DRUGS | N | 20 | 20 | 20 |
HAPPY | Pearson Correlation | .855(**) | -.795(**) | 1.000 |
HAPPY | Sig. (2-tailed) | .000 | .000 | . |
HAPPY | N | 20 | 20 | 20 |
** Correlation is significant at the 0.01 level (2-tailed). |
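The significance flags in this output can be checked by hand. The sketch below is only an illustration, not part of the SPSS procedure: it converts each r into a t statistic with N - 2 degrees of freedom and compares it with a critical value. The function name `t_for_r` and the dictionary labels are made up for this example, and the critical value 2.878 is taken from a standard t table for df = 18 at the two-tailed .01 level.

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

N = 20              # participants in the lab example
T_CRIT_01 = 2.878   # two-tailed critical t for df = 18, alpha = .01 (t table)

# Zero-order correlations copied from the SPSS output above.
correlations = {
    "exercise-drugs": -0.922,
    "exercise-happy": 0.855,
    "drugs-happy": -0.795,
}

t_values = {pair: t_for_r(r, N) for pair, r in correlations.items()}

for pair, t in t_values.items():
    flag = abs(t) > T_CRIT_01
    print(f"{pair}: t({N - 2}) = {t:.2f}, significant at .01: {flag}")
```

All three t values exceed the critical value in absolute size, which agrees with the double asterisks SPSS prints.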
Next, to examine how exercise and drug use relate to happiness when both are considered at the same time, Peter and Paul go ahead and conduct the multiple regression. Here’s how:
1. Click on Analyze on the toolbar
2. Click on Regression
3. Click on Linear
4. For Dependent variable choose Happy
5. For Independent variables choose Exercise and Drugs
6. Click on “Statistics”
7. Toggle on “part and partial correlations”
8. Click Continue
9. Click Paste
10. Go to the .sps file and highlight the relevant commands.
11. Click on “run.”
12. Run selection.
Your syntax file will look about like so:
* Regression for regression lab.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA ZPP
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT happy
/METHOD=ENTER exercise drugs .
Here’s the output created by that syntax file:
Regression
Model | Variables Entered | Variables Removed | Method |
---|---|---|---|
1 | DRUGS, EXERCISE(a) | . | Enter |
a All requested variables entered. |
b Dependent Variable: HAPPY |
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
---|---|---|---|---|
1 | .855(a) | .731 | .700 | 1.3498 |
a Predictors: (Constant), DRUGS, EXERCISE |
Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
---|---|---|---|---|---|---|
1 | Regression | 84.228 | 2 | 42.114 | 23.115 | .000(a) |
1 | Residual | 30.972 | 17 | 1.822 | | |
1 | Total | 115.200 | 19 | | | |
a Predictors: (Constant), DRUGS, EXERCISE |
b Dependent Variable: HAPPY |
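Every summary statistic in the Model Summary and ANOVA tables follows from the two sums of squares, the sample size, and the number of predictors. The following sketch reproduces that arithmetic so you can see where each number comes from; the variable names are illustrative, and the inputs are simply copied from the output above.

```python
import math

# Values copied from the ANOVA table above.
ss_regression = 84.228
ss_residual = 30.972
n = 20   # participants
k = 2    # predictors (exercise, drugs)

ss_total = ss_regression + ss_residual
df_regression = k
df_residual = n - k - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

# The same F can be written in terms of R Square alone.
f_from_r_square = (r_square / k) / ((1 - r_square) / df_residual)

print(f"R Square           = {r_square:.3f}")
print(f"Adjusted R Square  = {adjusted_r_square:.3f}")
print(f"F({df_regression}, {df_residual})           = {f_ratio:.3f}")
print(f"Std. Error of Est. = {std_error_estimate:.4f}")
```

The two ways of computing F are algebraically identical, which is why R Square and the overall F test always lead to the same conclusion about the set of predictors.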
Model | Predictor | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. | Zero-order | Partial | Part |
---|---|---|---|---|---|---|---|---|---|
1 | (Constant) | | | | | .377 | | | |
1 | EXERCISE | .596 | .238 | .813 | 2.500 | .023 | .855 | .518 | .314 |
1 | DRUGS | -3.455E-02 | .246 | -.046 | -.140 | .890 | -.795 | -.034 | -.018 |
a Dependent Variable: HAPPY |
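With only two predictors, the standardized coefficients and the partial and part correlations can be recovered from the three zero-order correlations alone. The sketch below applies the standard two-predictor formulas to the rounded correlations reported earlier. The function name `two_predictor_stats` is invented for this illustration, and because the inputs are rounded to three decimals, the results match the SPSS output only to about two decimals.

```python
import math

# Zero-order correlations from the CORRELATIONS output (rounded to 3 decimals).
r_happy_exercise = 0.855
r_happy_drugs = -0.795
r_exercise_drugs = -0.922

def two_predictor_stats(r_y1, r_y2, r_12):
    """Return (beta, partial, part) for predictor 1, controlling for predictor 2."""
    numerator = r_y1 - r_y2 * r_12   # part of r_y1 not carried by predictor 2
    unshared = 1 - r_12 ** 2         # variance in predictor 1 not shared with predictor 2
    beta = numerator / unshared
    part = numerator / math.sqrt(unshared)
    partial = numerator / math.sqrt((1 - r_y2 ** 2) * unshared)
    return beta, partial, part

exercise = two_predictor_stats(r_happy_exercise, r_happy_drugs, r_exercise_drugs)
drugs = two_predictor_stats(r_happy_drugs, r_happy_exercise, r_exercise_drugs)

for name, (beta, partial, part) in (("EXERCISE", exercise), ("DRUGS", drugs)):
    print(f"{name}: Beta = {beta:.3f}, Partial = {partial:.3f}, Part = {part:.3f}")
```

Notice how the numerator for drugs is nearly zero: almost all of its correlation with happiness is already carried by its overlap with exercise.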
Well, that’s it: Peter and Paul have conducted the analysis. Here’s how they would write it up:
Results
To address whether exercise and drug use are related to dispositional happiness, zero-order correlations among these three variables were computed. Exercise was positively related to happiness (r(18) = .86, p < .05) and negatively related to drug use (r(18) = -.92, p < .05). Further, drug use was negatively related to happiness (r(18) = -.80, p < .05). These correlations are summarized in Table 1.
________________________________________________
Table 1
Zero-Order Correlations among Exercise, Drug Use, and Happiness
               Exercise    Drug Use    Happiness
Exercise          —
Drug Use        -.92*         —
Happiness        .86*       -.80*
_______________________________________________
* p < .05
To examine the overall amount of variability in happiness explained by drug use and exercise, and the unique amount of variability explained by each predictor, a multiple regression was conducted next. The set of exercise and drug use accounted for a significant amount of variability in happiness (R² = .73, F(2, 17) = 23.12, p < .05). Thus, approximately 73% of the variability in happiness can be accounted for by information regarding participants’ exercise and drug use habits. Next, squared semipartial correlations (sr²) were computed to address the unique amount of variability in happiness accounted for, separately, by exercise and drug use. This information is summarized in Table 2. As can be seen in the table, exercise uniquely accounts for a significant amount of variability in happiness (sr² = .10, p < .05), whereas drug use does not (sr² = .00, ns). These results suggest that drug use has such a strong zero-order correlation with happiness because of its overlap with exercise (drug use and exercise are very strongly negatively correlated). After controlling for the overlapping variance between drug use and exercise, exercise remains significantly predictive of happiness while drug use does not. Drug use is primarily negatively correlated with exercise, and that overlap in effect makes drug use negatively correlated with happiness as well.
_________________________________________________
Table 2
Multiple Regression Predicting Happiness from Drug Use and
Exercise
Criterion Variable: Happiness
                          b        β       sr²
Predictor Variables
   Exercise              .60      .81      .10
   Drug Use             -.03     -.05      .00
R² = .73*
____________________________________________________
* p < .05
(NOTE TO STUDENT: the squared semipartial correlation (sr²) is something that you have to compute by hand. To get it, find the relevant number under “Part” in the regression printout. Square that number. That is sr², which corresponds to the proportion of variability in the dependent variable that is uniquely attributable to a particular predictor variable; multiply by 100 to express it as a percentage).
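The hand computation described in this note is just a squaring step. As a quick illustration (the variable names are made up, and the Part values are copied from the regression printout), it can be sketched as follows:

```python
# "Part" correlations copied from the regression printout.
part = {"exercise": 0.314, "drugs": -0.018}

# Squaring each part correlation gives the unique share of variance.
sr2 = {name: value ** 2 for name, value in part.items()}

for name, value in sr2.items():
    print(f"{name}: unique share of variance = {value:.2f} ({value * 100:.1f}%)")

# Cross-check for the two-predictor case: R Square equals the squared
# zero-order correlation of one predictor plus the unique share of the other.
r_happy_drugs = -0.795
r_square_check = r_happy_drugs ** 2 + sr2["exercise"]
print(f"R Square check = {r_square_check:.3f}")
```

The cross-check lands on about .73, the same R Square reported in the Model Summary, apart from rounding in the printed values.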
Discussion
The current research sheds light on the debate between Peter and Paul. More specifically, these data speak to the differential patterns of relationships between exercise and drug use with happiness. Clearly, exercise is positively related to happiness. This fact was manifest in both the zero-order correlations and the regression. Drug use is negatively related to both exercise and happiness. Drug use did not account for a significant amount of unique variability in happiness, largely because of its overlap with exercise (these variables were very strongly intercorrelated). In the end, these data imply you should do laps, do reps, and don’t do drugs! That guidance counselor from South Park, Mr. Mackey, was right when he said “Drugs are bad!”
_____________________________________________________
Assignment:
1. Get into groups of 3 or 4
2. Come up with a study that will allow for a multiple regression analysis. Be sure to have one clear outcome (i.e., criterion, dependent) variable and at least two predictor variables. It would be nice if your study had some clear, meaningful rationale underlying it. Further, be sure that your variables are capable of being measured on a college campus easily and ethically within one hour.
3. Collect data from at least 16 people on campus.
4. Enter the data into SPSS.
5. Run both the zero-order correlation and multiple regression analyses (as done in this lab).
6. Each student needs to INDEPENDENTLY write up a report summarizing (a) what the study was about, (b) how data were collected, (c) what the results were, and (d) what the results imply.
7. Enjoy!!!