Correlation And Regression Analysis Presentation Transcript
1.Correlation & Regression Analysis
2.Introduction
Regression & Correlation Analysis show us how to determine both the nature & the strength of the relationship between two variables.
Regression & Correlation Analysis are based on the statistical association between two (or more) variables, and help to predict one variable when the other is known.
The known variable is called the independent variable. The variable we are trying to predict is the dependent variable.
3. The first step in identifying whether any relationship exists between a set of values is to examine a graphical representation of the data for any pattern.
Any distribution in which each individual or unit of the set is made up of two values is called a bivariate distribution.
Any bivariate data can be represented through various charts & diagrams.
One of the methods of representing such data is a Scatter Diagram, which shows the distribution of the two variables over the plane.
4.Scatter Diagram
A scatter diagram is a graphical method of displaying the relationship between two variables by plotting pairs of bivariate observations (x, y) on the X-Y plane.
X is called the independent variable; Y is called the dependent variable.
Scatter diagrams are important for initial exploration of the relationship between two quantitative variables.
E.g., suppose a university wants to know whether any relationship exists between students' scores on an entrance examination and their CGPA upon graduation. We can draw a scatter diagram of the data.
5.Correlation
Correlation analysis is used to measure the strength of the association (linear relationship) between two variables.
Correlation measures to what extent two (or more) variables are related.
In a Scatter diagram,
If the points lie close together along a line, a fairly strong correlation can be expected between the two variables.
If they are widely scattered, a poor correlation can be expected between them.
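The degree of scatter described above is usually summarised by the Pearson correlation coefficient r. The following is a minimal sketch of that calculation in plain Python; the function name `pearson_r` and the entrance-score and CGPA values are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical entrance-exam scores and CGPAs (illustrative values only).
entrance_scores = [55, 62, 70, 74, 81, 90]
cgpa = [5.9, 6.4, 6.8, 7.5, 7.9, 8.8]
r = pearson_r(entrance_scores, cgpa)
print(f"r = {r:.3f}")
```

Points that fall almost on a rising line, as in this made-up sample, give a value of r close to +1.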
6.Direction of Correlation
7.Note:
If there is an upward trend rising from the lower left-hand corner to the upper right-hand corner, the correlation obtained from the graph is said to be positive. If there is a downward trend from the upper left-hand corner to the lower right-hand corner, the correlation is said to be negative.
Positive correlation indicates that the two variables move in the same direction.
Negative correlation indicates that they move in opposite directions.
If the points are scattered and reveal no upward or downward trend, we say the variables are uncorrelated.
8. The degree of relationship existing between three or more variables is called multiple correlation.
The fundamental principles involved in multiple correlation problems are analogous to those of simple correlation.
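Just as there exists a least-squares regression line approximating a set of N data points (X, Y) in a two-dimensional scatter diagram, so there exists a least-squares regression plane fitting a set of N data points (X1, X2, X3) in a three-dimensional scatter diagram.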
9.Coefficient of multiple correlation
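For three variables, the coefficient of multiple correlation of X1 on X2 and X3 is commonly written in terms of the pairwise correlations as R = sqrt((r12^2 + r13^2 - 2*r12*r13*r23) / (1 - r23^2)). The sketch below evaluates that textbook formula; the function name and the input correlations are hypothetical, since the slide's own formula and data are not reproduced in this transcript.

```python
import math

def multiple_correlation(r12, r13, r23):
    """Coefficient of multiple correlation of X1 on X2 and X3.

    r12, r13, r23 are the pairwise (simple) correlation coefficients.
    """
    if abs(r23) >= 1:
        raise ValueError("undefined when X2 and X3 are perfectly correlated")
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)

# Hypothetical pairwise correlations, for illustration only.
R = multiple_correlation(r12=0.8, r13=0.6, r23=0.5)
print(f"R = {R:.3f}")
```

When X3 is unrelated to both other variables (r13 = r23 = 0), the expression reduces to |r12|, the simple correlation.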
10.Multiple correlation problem
The table below shows the weights X1 to the nearest kilogram, the heights X2 to the nearest inch, and the ages X3 to the nearest year of a group of boys.
Find the regression equation of X1 on X2 and X3.
Estimate the weight of a boy who is 9 years old and 54 inches tall.
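A problem of this kind can be approached by fitting the least-squares plane X1 = b0 + b2*X2 + b3*X3 through the normal equations. The sketch below does this in plain Python. The table itself is not reproduced in this transcript, so the weights, heights and ages used here are hypothetical stand-ins, and the helper names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def fit_plane(x1, x2, x3):
    """Least-squares coefficients (b0, b2, b3) of x1 = b0 + b2*x2 + b3*x3."""
    columns = [[1.0] * len(x1), x2, x3]
    normal = [[dot(ci, cj) for cj in columns] for ci in columns]
    rhs = [dot(ci, x1) for ci in columns]
    return solve_linear_system(normal, rhs)

# Hypothetical data (not the table from the slide).
weights_kg = [30, 34, 36, 38, 42, 44]
heights_in = [50, 52, 54, 56, 58, 60]
ages_yr = [8, 9, 9, 10, 11, 12]

b0, b2, b3 = fit_plane(weights_kg, heights_in, ages_yr)
estimate = b0 + b2 * 54 + b3 * 9
print(f"X1 = {b0:.3f} + {b2:.3f}*X2 + {b3:.3f}*X3")
print(f"Estimated weight at 54 in, 9 years: {estimate:.1f} kg")
```

The estimate is obtained by substituting the boy's height and age into the fitted equation, which is what the last step of the problem asks for.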
11.Rank Correlation
12.Example: The data given below are from student records of a University showing the CGPA & GRE Scores of students.
13.Important Points Regarding ‘r’
r takes values between -1 and +1.
r =0 represents no linear relationship between the two variables
r >0 implies a direct linear relationship.
r <0 implies an inverse linear relationship.
The larger the absolute value of r, the stronger the linear relationship between the dependent variable and the independent variable.
As a rough rule of thumb, the absolute value of r should exceed about 0.6 for the relationship to be considered substantial.
If the variables are independent then the correlation is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables.
Correlation does not necessarily demonstrate a causal relationship. A significant correlation only shows that two factors vary in a related way (positively or negatively).
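The point above that zero correlation does not imply independence can be checked with a small sketch. Here y is completely determined by x through y = x^2, yet r comes out as zero because the dependence is not linear. The data and the helper name `pearson_r` are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# y depends on x exactly, but through a symmetric curve rather than a line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

The positive and negative halves of the parabola cancel in the sum of cross-products, so a scatter diagram remains necessary to spot this kind of relationship.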
14.Introduction to Regression Analysis
Regression analysis is the mathematical process of using observations to find the line of best fit through the data in order to make estimates and predictions about the behavior of the variables.
Regression analysis is used to:
Predict the value of a dependent variable based on the value of at least one independent variable
Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to explain.
Independent variable: the known variable used to explain the dependent variable.
15.Linear Regression
When we fit a line to a correlated set of data, we produce an equation that describes the relationship, so that we can predict the dependent variable from the independent variable. The method for doing this is called linear regression.
Suppose we have a sample of size ‘n’ and it has two sets of measures, denoted by x and y. We can predict the values of ‘y’ given the values of ‘x’ by using the Regression Equation.
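The regression equation y = a + b*x is usually obtained by least squares, with slope b = Sxy / Sxx and intercept a = mean(y) - b*mean(x). The following is a minimal sketch of that calculation; the function name `fit_line` and the mid-term and final scores are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b of the line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical mid-term (x) and final (y) scores, for illustration only.
mid_term = [48, 55, 61, 67, 73, 80]
final = [52, 58, 66, 70, 79, 85]
a, b = fit_line(mid_term, final)
print(f"final = {a:.2f} + {b:.3f} * mid_term")
print(f"Predicted final score for a mid-term score of 70: {a + b * 70:.1f}")
```

Predictions are most trustworthy for x values inside the range of the sample used to fit the line.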
16.Example: Scores made by students in a mathematics class in the mid-term and final examinations are given here. Develop a regression equation which may be used to predict final examination scores from the mid-term score.
17.Some Applications
Trend analysis for assessing future sales, fluctuations in turnover and forecasting in general.
Economic theory & business studies: analysing the relationship between variables such as price & quantity demanded.
Work measurement & computer modelling (to set times for jobs, tasks and whole projects).
Validating data collected by sampling in situations with two or more independent variables.
To test hypotheses about cause-and-effect relationships. In this, the experimenter determines the values of the X-variable and sees whether variation in X causes variation in Y.
18.Example: Use of Regression Analysis in Demand Forecasting
Suppose we are given the data for the demand for a product over the last 10 years.
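One simple way to forecast from such data is to regress demand on the year index t = 1, 2, ..., 10 and extend the fitted trend line to year 11. The sketch below does this under the assumption of a linear trend; the demand figures and function names are hypothetical, since the slide's data are not reproduced in this transcript.

```python
def fit_trend(demand):
    """Least-squares trend line demand = a + b*t, with t = 1, 2, ..., n."""
    n = len(demand)
    if n < 2:
        raise ValueError("need at least two periods")
    periods = range(1, n + 1)
    mean_t = sum(periods) / n
    mean_d = sum(demand) / n
    stt = sum((t - mean_t) ** 2 for t in periods)
    std = sum((t - mean_t) * (d - mean_d) for t, d in zip(periods, demand))
    slope = std / stt
    return mean_d - slope * mean_t, slope

def forecast(demand, period):
    """Extend the fitted trend line to the given period index."""
    a, b = fit_trend(demand)
    return a + b * period

# Hypothetical annual demand (units) for the last 10 years.
demand = [120, 126, 131, 140, 144, 152, 155, 163, 170, 174]
print(f"Forecast for year 11: {forecast(demand, 11):.1f} units")
```

A straight-line trend is only one possible model; a scatter diagram of demand against time helps to judge whether it is a reasonable choice.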
19.Statistical Packages
Correlation & Regression Analysis can easily be carried out using software packages such as:
MS Excel
MATLAB
MiniTab
SAS
SPSS
MS Excel provides a ‘Data Analysis’ tool for performing Correlation & Regression analysis.
20.Conclusion: Correlation and Regression
Correlation analysis is the process of finding how well (or badly) the line fits the observations: if all the observations lie exactly on the line of best fit, the correlation coefficient is +1 or -1 (unity in magnitude). The correlation coefficient, r, measures the strength of bivariate association.
Regression analysis is the mathematical process of using observations to find the line of best fit through the data in order to make estimates and predictions about the behaviour of the variables. This line of best fit may be linear (straight) or curvilinear, following some mathematical formula. The regression line is a prediction equation that estimates the value of y for any given x.
21. Reference
Levin, I. R., Rubin, D. S., Statistics for Management, Prentice Hall Of India, New Delhi, 2006.
Panneerselvam, R., Research Methodology, Prentice Hall Of India, New Delhi, 2004.