Review Correlation and Regression

Name

Institution

 

 

 

 

 

 

 

 

 

 

 

 

 

  1. Exploratory data analysis
  2. Exploratory data analysis on variables

[Scatterplot: Student Agreeableness vs. Lecturer Agreeableness]

[Scatterplot: Student Extroversion vs. Lecturer Extroversion]

[Scatterplot: Student Agreeableness vs. Lecturer Extroversion]

[Scatterplot: Student Extroversion vs. Lecturer Agreeableness]

  1. Give a one- to two-paragraph write-up of the data once you have done this.

The scatterplots of student and lecturer agreeableness and extroversion all show widely dispersed data and weak relationships between the variables considered in the study. Here, Lecturer Agreeableness and Lecturer Extroversion refer to how much of each trait the student wants in a lecturer. The relationship between Student Agreeableness and Lecturer Agreeableness is weak (r² = 0.03), and outliers are visible in the scatterplot. The scatterplot of Student Extroversion and Lecturer Extroversion shows a similarly wide spread, with r² = 0.023 indicating a weak relationship. Student Agreeableness and Lecturer Extroversion also show a broad distribution with visible outliers, and the relationship is very weak (r² = 0.002). The scatterplot of Student Extroversion and Lecturer Agreeableness shows essentially no linear relationship between the variables (r² = 1.8e-05).
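
The r² values quoted above can be approximated by squaring the Pearson coefficients printed in the correlation table later in this report. The following is a minimal sketch; the dictionary labels and the helper name `r_squared` are illustrative, and because the inputs are rounded to three decimals the results can differ slightly from the values read off the scatterplots.

```python
# Pearson r values as printed in the SPSS correlation table (rounded to 3 decimals).
pearson_r = {
    "Student Agreeableness vs. Lecturer Agreeableness": 0.164,
    "Student Extroversion vs. Lecturer Extroversion": 0.153,
    "Student Agreeableness vs. Lecturer Extroversion": 0.050,
    "Student Extroversion vs. Lecturer Agreeableness": 0.004,
}


def r_squared(r: float) -> float:
    """Coefficient of determination for a bivariate linear relationship."""
    return r * r


for pair, r in pearson_r.items():
    # Share of variance in one variable that is linearly associated with the other.
    print(f"{pair}: r = {r:.3f}, r^2 = {r_squared(r):.6f}")
```

Every value is well below 0.05, which is consistent with the weak relationships described in the write-up.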

  1. Create an APA style table that presents descriptive statistics for the sample.

 

Descriptive Statistics
Variable                                    N      Minimum   Maximum   Mean       Std. Deviation
Student Extroversion                        418      5.00     46.00    30.1029    6.31897
Student Agreeableness                       413     25.00     73.00    46.5157    7.45295
Student wants Extroversion in lecturers     283     -6.00     28.00    12.9576    6.94494
Student wants Agreeableness in lecturers    417    -21.00     29.00     8.8825    9.57577
Valid N (listwise)                          271

 

  1. Make a decision about the missing data. How are you going to handle it, and why?

Cases with missing values will be excluded from the analysis. This keeps the data used for each decision consistent. Missing values can distort the findings, especially when there are many of them, so excluding them is the most straightforward way of protecting the reliability of the findings.
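
Excluding missing data can be done in two common ways: listwise deletion (drop a case if any variable is missing) or pairwise deletion (drop a case only for the pair of variables being analysed). The descriptive table reports a listwise N of 271, while the N values in the correlation table vary from pair to pair, which suggests pairwise exclusion there. The sketch below illustrates the difference; the records and the column names are hypothetical and are not the study data.

```python
# Hypothetical records; None marks a missing score. These are NOT the study data.
records = [
    {"stu_ext": 30, "stu_agr": 45, "want_ext": 12, "want_agr": 9},
    {"stu_ext": 28, "stu_agr": None, "want_ext": 15, "want_agr": 7},
    {"stu_ext": 35, "stu_agr": 50, "want_ext": None, "want_agr": 11},
    {"stu_ext": None, "stu_agr": 41, "want_ext": 10, "want_agr": None},
    {"stu_ext": 22, "stu_agr": 47, "want_ext": 8, "want_agr": 5},
]


def listwise(rows, columns):
    """Keep only rows that are complete on every listed column."""
    return [row for row in rows if all(row[c] is not None for c in columns)]


def pairwise(rows, col_a, col_b):
    """Keep rows that are complete on the two columns being analysed."""
    return [row for row in rows if row[col_a] is not None and row[col_b] is not None]


all_columns = ["stu_ext", "stu_agr", "want_ext", "want_agr"]
complete_cases = listwise(records, all_columns)
ext_pair = pairwise(records, "stu_ext", "want_ext")

print(f"Listwise N: {len(complete_cases)}")
print(f"Pairwise N (stu_ext, want_ext): {len(ext_pair)}")
```

Pairwise deletion keeps more cases per analysis, at the cost of each statistic being based on a slightly different subsample.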

  1. Correlation

 

Correlations
Variable                                                            1        2        3        4
1. Student Extroversion                       Pearson Correlation   1        .080     .153*    .004
                                              Sig. (2-tailed)                .106     .010     .932
                                              N                     418      406      281      411
2. Student Agreeableness                      Pearson Correlation   .080     1        .050     .164**
                                              Sig. (2-tailed)       .106              .412     .001
                                              N                     406      413      276      405
3. Student wants Extroversion in lecturers    Pearson Correlation   .153*    .050     1        .118*
                                              Sig. (2-tailed)       .010     .412              .049
                                              N                     281      276      283      280
4. Student wants Agreeableness in lecturers   Pearson Correlation   .004     .164**   .118*    1
                                              Sig. (2-tailed)       .932     .001     .049
                                              N                     411      405      280      417
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

 

The test conducted is two-tailed because no direction was predicted in advance: the null hypothesis is that the correlation between each pair of variables is zero, and the alternative is that it differs from zero in either direction.

Results interpretation

A Pearson correlation analysis was conducted to determine whether there were statistically significant correlations between the variables. There was a weak, positive, statistically significant correlation between student agreeableness and lecturer agreeableness (r = 0.164, p = 0.001). There was also a weak, positive, statistically significant correlation between student extroversion and lecturer extroversion (r = 0.153, p = 0.01). The correlations between student extroversion and lecturer agreeableness (r = 0.004, p = 0.932) and between student agreeableness and lecturer extroversion (r = 0.050, p = 0.412) were not significant.
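
The significance of a Pearson correlation is usually assessed with a t statistic on n - 2 degrees of freedom, t = r * sqrt(n - 2) / sqrt(1 - r²). The sketch below applies that formula to the two significant correlations, using the pairwise N values from the table. The function name `t_from_r` is illustrative, and the inputs are the rounded values printed by SPSS, so the results are approximate.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, evaluated on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Student extroversion vs. extroversion wanted in lecturers (r = .153, N = 281).
t_ext = t_from_r(0.153, 281)
# Student agreeableness vs. agreeableness wanted in lecturers (r = .164, N = 405).
t_agr = t_from_r(0.164, 405)

print(f"Extroversion:  t(279) = {t_ext:.3f}")
print(f"Agreeableness: t(403) = {t_agr:.3f}")
```

With samples this large, even a small r can produce a t value large enough to be significant, which is why both correlations are significant yet weak.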

 

 

 

 

  1. Regression
Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .153a   .023       .020                6.82989
a. Predictors: (Constant), Student Extroversion

ANOVAa
Model            Sum of Squares   df    Mean Square   F       Sig.
1   Regression   311.947          1     311.947       6.687   .010b
    Residual     13014.630        279   46.647
    Total        13326.577        280
a. Dependent Variable: Student wants Extroversion in lecturers
b. Predictors: (Constant), Student Extroversion

Coefficientsa
Model                      B (unstandardized)   Std. Error   Beta (standardized)   t       Sig.
1   (Constant)             8.220                1.866                              4.405   .000
    Student Extroversion   .160                 .062         .153                  2.586   .010
a. Dependent Variable: Student wants Extroversion in lecturers

 

Two-tailed or one-tailed

The test conducted was two-tailed because the analysis asked whether a student’s extroversion score predicts how much the student wants a lecturer to be extroverted, without specifying the direction of the effect in advance. A two-tailed test can therefore detect either a positive or a negative relationship.

The assumptions

Regression analysis rests on certain assumptions that must be met. The basic assumption is that the dependent variable is measured on a continuous level. The lecturer extroversion score is a continuous variable measured on a ratio scale, which fulfils this assumption. The independent variables must be continuous or categorical. The independent variable, the student’s extroversion score, is a continuous variable.

Results

A regression analysis was conducted to examine whether a student’s extroversion score predicts how much the student wants a lecturer to be extroverted. The findings showed that the student’s extroversion score was a statistically significant predictor of lecturer extroversion at the 95% confidence level (B = 0.160, p = 0.01, p < 0.05). However, the model explained only 2.3% of the variance in lecturer extroversion (R² = 0.023), so the predictive value of the student’s extroversion score is small.

The results are consistent with the correlation analysis, which found a weak positive correlation between a student’s extroversion score and the lecturer extroversion score (r = 0.153).
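
The R² and F values in the regression output can be recomputed from the sums of squares in the ANOVA table: R² is the regression sum of squares divided by the total, and F is the ratio of the two mean squares. The following is a minimal check using the printed values; the variable names are illustrative.

```python
# Values copied from the ANOVA table of the simple regression.
ss_regression = 311.947
ss_residual = 13014.630
df_regression = 1
df_residual = 279

ss_total = ss_regression + ss_residual

# Proportion of variance in the outcome explained by the model.
r_squared = ss_regression / ss_total

# F is the ratio of the regression mean square to the residual mean square.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R^2 = {r_squared:.3f} ({r_squared * 100:.1f}% of variance)")
print(f"F({df_regression}, {df_residual}) = {f_stat:.3f}")
```

The recomputed R² is about 0.023, that is, roughly 2.3% of the variance, and its square root matches the correlation of 0.153 reported earlier.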

  1. Multiple regression analysis
Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .168a   .028       .018                6.82934
a. Predictors: (Constant), Gender, Student Extroversion, Age

ANOVAa
Model            Sum of Squares   df    Mean Square   F       Sig.
1   Regression   373.952          3     124.651       2.673   .048b
    Residual     12872.615        276   46.640
    Total        13246.568        279
a. Dependent Variable: Student wants Extroversion in lecturers
b. Predictors: (Constant), Gender, Student Extroversion, Age

Coefficientsa
Model                      B (unstandardized)   Std. Error   Beta (standardized)   t       Sig.
1   (Constant)             7.560                2.844                              2.658   .008
    Student Extroversion   .161                 .062         .155                  2.607   .010
    Age                    .019                 .109         .010                  .169    .866
    Gender                 1.036                .927         .066                  1.118   .265
a. Dependent Variable: Student wants Extroversion in lecturers

A two-tailed or one-tailed test

The regression analysis used a two-tailed test. The analysis sought to determine whether age, gender, and student extroversion score predict lecturer extroversion.

Assumptions

The additional assumption in multiple regression analysis is that there are two or more independent variables; three are included in this analysis. It also assumes that the independent variables are not highly correlated with each other (no multicollinearity).
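
One common way to check that predictors are not highly correlated is the variance inflation factor (VIF). When a model has exactly two predictors, the VIF reduces to 1 / (1 - r²), where r is the correlation between them. The sketch below is illustrative only: no predictor intercorrelations are reported for this model, so it borrows r = 0.499 from the Part B table (Projected and Avg_age), and the function name is hypothetical.

```python
def vif_two_predictors(r_between_predictors: float) -> float:
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r_between_predictors ** 2)


# Illustrative input: correlation between Projected and Avg_age from Part B.
r_projected_age = 0.499
vif = vif_two_predictors(r_projected_age)

print(f"VIF = {vif:.3f}")
```

An often-cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity; a value near 1.3 would not raise that concern.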

Results

A multiple regression analysis was conducted to determine whether age, gender, and the student’s extroversion score predict lecturer extroversion. The overall model was statistically significant, F(3, 276) = 2.673, p = 0.048, but it explained only 2.8% of the variance (R² = 0.028). Only the student’s extroversion score was a significant predictor of lecturer extroversion at the 95% confidence level (p = 0.01, p < 0.05); age (p = 0.866) and gender (p = 0.265) were not.

In the correlation analysis above, the student’s extroversion score was significantly correlated with lecturer extroversion. The multiple regression shows that the student’s extroversion score remains a significant predictor of lecturer extroversion after age and gender are taken into account.
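
The adjusted R² in the model summary penalises R² for the number of predictors: adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1). The sketch below recomputes it from the ANOVA table, taking n = 280 from the total degrees of freedom (279 + 1) and k = 3 predictors; the function name is illustrative.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Sums of squares copied from the multiple regression ANOVA table.
ss_regression = 373.952
ss_total = 13246.568

r2 = ss_regression / ss_total
n_cases = 280       # total df (279) + 1
n_predictors = 3    # Gender, Student Extroversion, Age

adj_r2 = adjusted_r_squared(r2, n_cases, n_predictors)

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

The adjusted value (about 0.018) is lower than the adjusted R² of the single-predictor model (0.020), which suggests that adding age and gender did not improve the model.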

Part B. Applying Analytical Strategies to an Area of Research Interest. 

    1. Briefly restate your research area of interest.

The number of confirmed coronavirus cases has continued to increase significantly in the recent past. Several projections have been modelled that present the expected number of coronavirus cases at different times. The variables assessed here are therefore the projected number of positive cases and the actual number of positive cases.

  1. Pearson correlation
Correlations
Variable                          Actual   Projected
Actual      Pearson Correlation   1        .986**
            Sig. (2-tailed)                .000
            N                     20       20
Projected   Pearson Correlation   .986**   1
            Sig. (2-tailed)       .000
            N                     20       20
**. Correlation is significant at the 0.01 level (2-tailed).

 

A Pearson correlation analysis was conducted to determine whether there was a relationship between the actual and projected coronavirus cases. There was a strong positive relationship between actual and projected coronavirus cases (r = 0.986, p < 0.001). The coefficient of determination (r²) is 0.972, so the projected cases explain 97.2% of the variance in the actual cases.

 

 

 

 

  1. Spearman correlation

 

Correlations
Spearman’s rho                        Actual    Projected
Actual      Correlation Coefficient   1.000     1.000**
            Sig. (2-tailed)           .         .
            N                         20        20
Projected   Correlation Coefficient   1.000**   1.000
            Sig. (2-tailed)           .         .
            N                         20        20
**. Correlation is significant at the 0.01 level (2-tailed).

 

A Spearman rank-order correlation analysis was conducted to determine whether there was a monotonic relationship between the actual and projected coronavirus cases. There was a perfect positive rank-order relationship between actual and projected coronavirus cases (rs(18) = 1.00, p < 0.01): the ranking of the projected cases matches the ranking of the actual cases exactly. The squared coefficient is therefore 1, meaning that the ranks of the projected cases account for 100% of the variance in the ranks of the actual cases.
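
Spearman's rho is the Pearson correlation computed on ranks, which is why it can equal 1 while Pearson's r stays below 1: any strictly increasing relationship gives rho = 1, even if it is not a straight line. The sketch below demonstrates this with hypothetical numbers (not the study data); the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranked data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical, strictly increasing but non-linear data.
projected = [10, 20, 30, 40, 50, 60]
actual = [12, 18, 30, 55, 100, 190]

r_value = pearson(projected, actual)
rho_value = spearman(projected, actual)
print(f"Pearson r = {r_value:.3f}, Spearman rho = {rho_value:.3f}")
```

The same pattern appears in the results above, where Pearson's r is 0.986 but Spearman's rho is 1.000.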

  1. Partial Correlation vs Semi-Partial Correlation.

Partial correlation

The variables assessed in this case are actual cases, projected cases, and the average age of those who tested positive for coronavirus. The three variables are continuous variables that are measured on a ratio scale.

 

Correlations
Control Variables   Variable    Statistic                 Actual   Projected   Avg_age
-none-a             Actual      Correlation               1.000    .986        .467
                                Significance (2-tailed)   .        .000        .038
                                df                        0        18          18
                    Projected   Correlation               .986     1.000       .499
                                Significance (2-tailed)   .000     .           .025
                                df                        18       0           18
                    Avg_age     Correlation               .467     .499        1.000
                                Significance (2-tailed)   .038     .025        .
                                df                        18       18          0
Avg_age             Actual      Correlation               1.000    .982
                                Significance (2-tailed)   .        .000
                                df                        0        17
                    Projected   Correlation               .982     1.000
                                Significance (2-tailed)   .000     .
                                df                        17       0
a. Cells contain zero-order (Pearson) correlations.

 

A partial correlation was run to determine the relationship between actual and projected cases while controlling for average age. There was a strong positive partial correlation between actual and projected cases, r(17) = 0.982, p < 0.001. This is almost identical to the zero-order correlation (r = 0.986), indicating that average age has very little influence on the relationship.
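
A first-order partial correlation can be computed directly from the three zero-order correlations: r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). The sketch below applies that formula to the values in the table; the function and variable names are illustrative, and because the inputs are rounded to three decimals the result can differ from the SPSS value in the third decimal place.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z removed from both variables."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Zero-order correlations from the "-none-" block of the table.
r_actual_projected = 0.986
r_actual_age = 0.467
r_projected_age = 0.499

partial = partial_corr(r_actual_projected, r_actual_age, r_projected_age)
print(f"Partial r (Actual, Projected | Avg_age) = {partial:.4f}")
```

The result is approximately 0.98, in line with the reported partial correlation of 0.982.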

Semi partial correlation

 

Coefficientsa
Model            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.   Zero-order   Partial   Part
1   (Constant)   40.616               48.319                             .841     .412
    Projected    .761                 .035         1.002                 21.512   .000   .986         .982      .869
    Avg_age      -.562                .785         -.033                 -.717    .483   .467         -.171     -.029
a. Dependent Variable: Actual

 

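A semi-partial (part) correlation analysis was conducted to determine the relationship between actual and projected cases after removing the effect of average age from the projected cases only. There was a significant semi-partial correlation between actual and projected cases (sr = 0.869, t(17) = 21.512, p < 0.001).

Comparing the two analyses, the results are related: both show a strong positive relationship between actual and projected cases once age is accounted for. The partial correlation (0.982) is larger than the semi-partial correlation (0.869) because it removes the influence of age from both variables, whereas the semi-partial correlation removes it only from the predictor and expresses the unique contribution of projected cases relative to the total variance in actual cases. The partial correlation therefore presents a clearer picture of the relationship between the two variables with age held constant.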
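
The semi-partial correlation can also be recovered from the zero-order correlations: sr = (r_yx - r_yz * r_xz) / sqrt(1 - r_xz²), where z is removed from the predictor x only. The sketch below uses the values from the zero-order table, with Actual as y, Projected as x, and Avg_age as z; the names are illustrative and the inputs are rounded.

```python
import math


def semipartial_corr(r_yx: float, r_yz: float, r_xz: float) -> float:
    """Correlation of y with the part of x that is independent of z."""
    return (r_yx - r_yz * r_xz) / math.sqrt(1.0 - r_xz ** 2)


r_actual_projected = 0.986   # y with x
r_actual_age = 0.467         # y with z
r_projected_age = 0.499      # x with z

sr = semipartial_corr(r_actual_projected, r_actual_age, r_projected_age)

# Squared semi-partial: share of the total variance in Actual that is
# uniquely attributable to Projected.
unique_variance = sr ** 2

print(f"Semi-partial r = {sr:.3f}, unique variance = {unique_variance:.3f}")
```

The numerator is the same as for the partial correlation; only the denominator differs, which is why the semi-partial value is never larger in absolute size than the partial value.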

  1. Simple regression analysis

The variables considered in the simple regression analysis are the actual cases and the projected cases, both measured on a ratio scale. The outcome variable is actual cases and the predictor variable is projected cases, because the analysis assesses whether the actual cases can be predicted from the projected cases.

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .986a   .971       .970                9.517
a. Predictors: (Constant), Projected

ANOVAa
Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   55454.702        1    55454.702     612.290   .000b
    Residual     1630.248         18   90.569
    Total        57084.950        19
a. Dependent Variable: Actual
b. Predictors: (Constant), Projected

Coefficientsa
Model            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
1   (Constant)   6.063                3.192                              1.899    .074
    Projected    .749                 .030         .986                  24.744   .000
a. Dependent Variable: Actual

 

The model summary shows that the coefficient of determination (R²) is 0.971, so projected cases explain about 97% of the variance in actual cases. There is a strong relationship between actual and projected cases. Projected cases are also a significant predictor of actual coronavirus cases (B = 0.749, t = 24.744, p < 0.001).
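
The coefficients table implies the fitted line Actual = 6.063 + 0.749 × Projected. The sketch below turns that equation into a small prediction helper. The function name and the example input of 100 projected cases are illustrative, and predictions are only meaningful within the range of projected values in the dataset, which is not reported here.

```python
# Coefficients copied from the simple regression output.
INTERCEPT = 6.063   # (Constant)
SLOPE = 0.749       # unstandardized B for Projected


def predict_actual(projected_cases: float) -> float:
    """Predicted actual cases for a given number of projected cases."""
    return INTERCEPT + SLOPE * projected_cases


example_projection = 100  # hypothetical input, not taken from the data
print(f"Predicted actual cases: {predict_actual(example_projection):.1f}")
```

Because the slope is below 1, the model predicts fewer actual cases than projected cases once projections exceed roughly 24.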

 

  1. Multiple regression

The variables considered in the multiple regression are actual cases, projected cases, and the average age of the patients. All three are continuous variables measured on a ratio scale. A multiple linear regression using the Enter method will be used, because there are two predictor variables and both are entered into a linear model at the same time.

 

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .986a   .972       .969                9.648
a. Predictors: (Constant), Avg_age, Projected

ANOVAa
Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   55502.517        2    27751.258     298.130   .000b
    Residual     1582.433         17   93.084
    Total        57084.950        19
a. Dependent Variable: Actual
b. Predictors: (Constant), Avg_age, Projected

Coefficientsa
Model            B (unstandardized)   Std. Error   Beta (standardized)   t        Sig.
1   (Constant)   40.616               48.319                             .841     .412
    Projected    .761                 .035         1.002                 21.512   .000
    Avg_age      -.562                .785         -.033                 -.717    .483
a. Dependent Variable: Actual

 

The model summary shows that the coefficient of determination (R²) is 0.972, so projected cases and average age together explain about 97% of the variance in actual cases. There is a strong relationship between actual and projected cases. Only projected cases are a significant predictor of actual coronavirus cases (p < 0.001); average age is not (p = 0.483).
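
One way to see how little average age adds is to compare the two ANOVA tables: the increase in the regression sum of squares, tested against the residual mean square of the larger model, gives an F-change statistic. For a single added predictor this equals the square of that predictor's t value. The following is a minimal sketch using the printed sums of squares; the variable names are illustrative.

```python
# Sums of squares copied from the simple and multiple regression ANOVA tables.
ss_reg_simple = 55454.702   # Projected only
ss_reg_full = 55502.517     # Projected + Avg_age
ss_res_full = 1582.433
df_res_full = 17
ss_total = 57084.950
added_predictors = 1

ss_gain = ss_reg_full - ss_reg_simple

# Extra share of variance explained by adding Avg_age.
r2_change = ss_gain / ss_total

# F-change: gain per added predictor relative to the residual mean square.
f_change = (ss_gain / added_predictors) / (ss_res_full / df_res_full)

t_avg_age = -0.717  # t for Avg_age from the coefficients table
print(f"R^2 change = {r2_change:.4f}")
print(f"F change = {f_change:.3f}, t^2 = {t_avg_age ** 2:.3f}")
```

The R² change is under 0.001 and the F-change is about 0.51, which agrees with the non-significant t value for average age.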

  1. Logistic regression

The three variables included in the analysis are projected positive cases, average age, and the setting. Projected cases and average age are continuous predictor variables measured on a ratio scale, while the setting is a categorical outcome variable measured on a nominal scale. Binary logistic regression with the Enter method will be used, because the outcome variable has only two groups.

Variables in the Equation
                      B       S.E.     Wald   df   Sig.   Exp(B)
Step 1a   Projected   -.004   .008     .282   1    .595   .996
          Avg_age     -.076   .168     .206   1    .650   .927
          Constant    4.922   10.344   .226   1    .634   137.209
a. Variable(s) entered on step 1: Projected, Avg_age.

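The analysis shows that each additional projected case multiplies the odds of an urban rather than a rural setting by 0.996 (Exp(B) = 0.996), and each one-unit increase in average age multiplies those odds by 0.927 (Exp(B) = 0.927). Both odds ratios are close to 1, and neither predictor is statistically significant (p = 0.595 and p = 0.650), so neither projected cases nor average age reliably predicts the setting.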
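
The Exp(B) column is the exponential of each B coefficient and is read as an odds ratio, not as a probability. The sketch below recomputes the odds ratios and shows how the fitted equation converts predictor values into a predicted probability. It assumes the outcome is coded so that 1 corresponds to the urban setting, which the output does not state, and the example inputs (100 projected cases, average age 40) are hypothetical.

```python
import math

# B coefficients copied from the "Variables in the Equation" table.
CONSTANT = 4.922
coefficients = {"Projected": -0.004, "Avg_age": -0.076}

# Odds ratio per one-unit increase in each predictor.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}


def predicted_probability(projected: float, avg_age: float) -> float:
    """Predicted probability of the category coded 1 (assumed: urban)."""
    logit = (
        CONSTANT
        + coefficients["Projected"] * projected
        + coefficients["Avg_age"] * avg_age
    )
    return 1.0 / (1.0 + math.exp(-logit))


for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.3f}")

# Hypothetical inputs, not taken from the dataset.
print(f"Example probability: {predicted_probability(100, 40):.3f}")
```

An odds ratio below 1 means the odds decrease as the predictor increases, but with p values of 0.595 and 0.650 these decreases cannot be distinguished from no effect.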

 

 

 

 

 

 

 

 
