Chapter 7:  TOPICS IN REGRESSION ANALYSIS

 

1.  Proxy variables:  Least squares estimators are unbiased only when all relevant independent variables are included in the model.  When an influence is difficult to measure directly, a proxy variable may be used in its place.  Examples include:

            a.  Use of measures of consumer confidence to measure expectations.

            b.  Use of unemployment rate to measure expected income.

Proxy variables are more useful in forecasting with time series data, since they may track changes in the theoretical variable being measured.  However, they may be misleading when used in an explanatory model for simulation.
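The effect of leaving out a relevant variable, and of substituting a proxy for it, can be illustrated with a small simulation.  This is only a sketch under assumed values: the variable names, the coefficients (1, 2 and 3), the noise levels and the random seed are all invented for illustration and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed, chosen only for repeatability
n = 500
x1 = rng.normal(size=n)                       # observed independent variable
x2 = 0.8 * x1 + rng.normal(size=n)            # hard-to-measure influence, correlated with x1
proxy = x2 + rng.normal(scale=0.5, size=n)    # proxy: x2 observed with measurement error
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(scale=0.5, size=n)

def ols(dep, *cols):
    """Least squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(dep)), *cols])
    beta, *_ = np.linalg.lstsq(X, dep, rcond=None)
    return beta

b_omit = ols(y, x1)[1]           # x2 left out entirely
b_proxy = ols(y, x1, proxy)[1]   # proxy used in place of x2
b_full = ols(y, x1, x2)[1]       # x2 itself included (not possible in practice)

print(f"true slope on x1: 2.0  omitted: {b_omit:.2f}  "
      f"proxy: {b_proxy:.2f}  full: {b_full:.2f}")
```

In this setup the coefficient on x1 is pulled well away from 2 when x2 is omitted, and the proxy removes most but not all of that distortion, which is consistent with the caution above about using proxies in explanatory models.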

 

2.  Dummy variables:  used to represent the influence of qualitative variables such as gender, age, market segments, special events in time series (war years), and seasonal variation.  They can also be used to test for structural stability in the intercept and slopes.  A dummy variable is indicated when a consistent set of outliers exists that may be due to some irregular influence.
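As a minimal sketch of a seasonal dummy, the example below builds a hypothetical quarterly series with a linear trend and a fixed fourth-quarter shift, then recovers that shift by least squares.  The series, its coefficients (10, 2 and 5) and the variable names are assumptions made for illustration; the data are noise-free so the fit is exact.

```python
import numpy as np

# Hypothetical quarterly series: linear trend plus a fourth-quarter bump.
n = 12
t = np.arange(n, dtype=float)                  # time index 0..11
q4 = (np.arange(n) % 4 == 3).astype(float)     # dummy: 1 in each fourth quarter, else 0
y = 10.0 + 2.0 * t + 5.0 * q4                  # noise-free, so the fit is exact

# Design matrix: intercept, trend, fourth-quarter dummy.
X = np.column_stack([np.ones(n), t, q4])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope, q4_shift = coef

print(f"intercept={intercept:.2f}  slope={slope:.2f}  Q4 shift={q4_shift:.2f}")
```

The coefficient on the dummy measures the shift in the intercept for fourth quarters.  Multiplying the dummy by the trend variable and adding that product as a further column would allow the slope to shift as well, which is the basis of the structural stability tests mentioned above.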

 

3.  Selecting the best forecasting model:  The best model includes all of the important explanatory variables and excludes variables that are not important.  Hence, the t-value of every included variable should indicate that a significant causal effect exists for that independent variable.  Two techniques for deciding on the best model when several possible independent variables exist are:

            a.  Estimate all the possible regression equations and, among those in which all of the t-values are significant, select the one with the highest adjusted R-squared (equivalently, the lowest standard error of the regression).

            b.  Use stepwise regression and allow the computer to select in succession the independent variables for which the additional explained variance is significant.
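Technique (a) can be sketched as follows; this is an all-possible-regressions search, not stepwise regression.  The candidate variables x1, x2 and x3, the simulated data, the seed and the rule-of-thumb cutoff |t| > 2 are all assumptions made for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, chosen only for repeatability
n = 40
# Hypothetical candidate regressors; x3 is irrelevant by construction.
data = {name: rng.normal(size=n) for name in ("x1", "x2", "x3")}
y = 3.0 + 2.0 * data["x1"] - 1.5 * data["x2"] + rng.normal(scale=0.5, size=n)

def fit(names):
    """OLS fit; returns adjusted R-squared and the t-values of the slopes."""
    X = np.column_stack([np.ones(n)] + [data[v] for v in names])
    k = X.shape[1]                               # coefficients, incl. intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    adj_r2 = 1.0 - (sse / (n - k)) / (sst / (n - 1))
    cov = sse / (n - k) * np.linalg.inv(X.T @ X)  # coefficient covariance matrix
    t_values = beta / np.sqrt(np.diag(cov))
    return adj_r2, t_values[1:]                  # drop the intercept's t-value

candidates = []
for size in range(1, len(data) + 1):
    for names in itertools.combinations(data, size):
        adj_r2, t_vals = fit(names)
        # Assumed cutoff: |t| > 2, roughly the 5% significance level.
        if np.all(np.abs(t_vals) > 2.0):
            candidates.append((adj_r2, names))

best_adj_r2, best_vars = max(candidates)
print("selected variables:", best_vars, " adjusted R-squared:", round(best_adj_r2, 3))
```

With k candidate variables this search estimates 2^k - 1 equations, so it is practical only for a short list of candidates; stepwise regression trades that exhaustive search for a sequence of one-variable-at-a-time decisions.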

 

4.  Lagged variables:  There are two types of lagged relationships that may be introduced into the equation.

            a.  Simple lagged relationships use only one lagged time period of the independent variable, with the length of the lag chosen as the one that gives the highest correlation (best fit) with the dependent variable.

            b.  Distributed lags use more than one time period for the independent variable because the response is delayed and spread out over a number of periods.  The weights applied to each lagged value of the independent variable may be determined by the regression coefficients.  (See example 7-11 on page 325.)  An example is the forecast of changes in quarterly consumption spending based upon past quarterly changes in disposable personal income as well as the current rate of interest.  (See example 7-12.)  The weighted average of past changes in income represents an estimate of expected income.

            In the material to follow we will see that autoregressive methods use combinations of present and past values of a series in order to forecast its future value.  Hence, this adaptive expectations model is really a combination of autoregressive techniques with regression (causal) techniques.
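Both kinds of lagged relationship can be sketched on made-up data.  In the example below the series x, the helper lagged(), the maximum lag of 4, and the weights 0.5, 0.3 and 0.2 are all hypothetical; the dependent variables are built without noise so the results are exact.  It is not a reproduction of examples 7-11 or 7-12.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed, chosen only for repeatability
x = rng.normal(size=60)          # hypothetical series, e.g. changes in income
max_lag = 4

def lagged(series, lag):
    """Values of series lagged by `lag` periods, aligned to periods max_lag..end."""
    return series[max_lag - lag : len(series) - lag]

# (a) Simple lag: y responds only to x three periods earlier.
y_simple = 2.0 * lagged(x, 3)
corrs = [np.corrcoef(lagged(x, k), y_simple)[0, 1] for k in range(max_lag + 1)]
best_lag = int(np.argmax(np.abs(corrs)))       # lag with the highest correlation
print("best single lag:", best_lag)

# (b) Distributed lag: the response is spread over the current and two past periods.
y_dist = 1.0 + 0.5 * lagged(x, 0) + 0.3 * lagged(x, 1) + 0.2 * lagged(x, 2)
X = np.column_stack([np.ones(len(y_dist))] + [lagged(x, k) for k in range(3)])
weights, *_ = np.linalg.lstsq(X, y_dist, rcond=None)
print("intercept and lag weights:", np.round(weights, 3))
```

In part (b) the regression coefficients on the lagged columns are the lag weights, so the fitted weighted sum of current and past x values plays the role of the expected-income term described above.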