Model Diagnostics
Political Analysis II

Why should we care about unusual observations?
- Because they can drive our results and lead to misleading findings (especially in small samples)
- To improve our theory and statistical model

Three types of unusual observations:
- Regression outliers
- High leverage observations
- Influential observations

A useful tool: Residuals

Regression outliers
Regression outliers have extreme values for Y given their values on X.
- For example: oil-rich non-democracies
- Why? Coding error or a genuine peculiarity of the case
- Effect: limited, but they can increase our standard errors
- Detect: large studentized residuals (absolute value greater than 2)
- Fix: check coding, revise theory
Fox (2008)

Example of regression outliers (1)
Recall that Lijphart excluded India and Israel from his analysis because they had extreme values on the dependent variable, political stability and absence of violence (i.e. they are univariate outliers).
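A minimal sketch of the studentized-residual check described above, assuming synthetic data and NumPy only (this is not the Lijphart data from the slides). The function name `studentized_residuals` is hypothetical; it computes externally studentized residuals from an OLS fit, and the script flags observations whose absolute value exceeds 2.

```python
import numpy as np


def studentized_residuals(X, y):
    """Externally studentized residuals for an OLS fit of y on X.

    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Hat values: diagonal of H = X (X'X)^-1 X'
    hat = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    # Error variance re-estimated with observation i left out
    s2_loo = (resid @ resid - resid**2 / (1 - hat)) / (n - p - 1)
    return resid / np.sqrt(s2_loo * (1 - hat))


# Synthetic illustration (assumed data): a linear pattern plus one case
# with an unusual Y given its X.
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
y = 1.0 + 0.5 * x + rng.normal(scale=0.5, size=x.size)
y[10] += 8.0  # regression outlier in the middle of the X range

X = np.column_stack([np.ones_like(x), x])
t = studentized_residuals(X, y)
flagged = np.flatnonzero(np.abs(t) > 2)  # rule of thumb from the slide
print(flagged)
```

A flagged case is a prompt to check the coding or revisit the theory, not an automatic reason to drop the observation.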
Example of regression outliers (2)
Only Israel, however, is a regression outlier!

High leverage observations
High leverage observations have extreme values on one or more independent variables.
- Detect: hat values (a measure of each observation's pull on the fitted/Y-hat values)
- Effect: they can change the estimates of the regression coefficients (if they don't follow the pattern of the data)
Fox (2008)
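The hat values mentioned above can be sketched as follows, again with NumPy and made-up data. The helper name `hat_values` is hypothetical, and the cutoff of twice the average hat value (2p/n) is a common rule of thumb assumed here; it does not come from the slides.

```python
import numpy as np


def hat_values(X):
    """Diagonal of the hat matrix H = X (X'X)^-1 X'.

    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    return np.einsum("ij,ji->i", X, np.linalg.pinv(X))


# Synthetic illustration (assumed data): fifteen ordinary X values and
# one case far away from the rest on X.
x = np.append(np.arange(1.0, 16.0), 60.0)
X = np.column_stack([np.ones_like(x), x])

hat = hat_values(X)
n, p = X.shape
cutoff = 2 * p / n  # assumed rule of thumb: twice the average hat value
flagged = np.flatnonzero(hat > cutoff)
print(flagged, hat[flagged].round(3))
```

Hat values depend only on X, so a case can have high leverage and still sit right on the regression line.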
Example of high leverage observations
Lijphart described India as an extreme outlier, but it is actually a high leverage observation.
We can see this clearly when we look at India's very high hat-values.
Influential observations
Influential observations have extreme values for X and Y.
Influence = Outlierness × Leverage
- Effect: removing them from the model significantly changes the direction, strength, or significance of the results
- Detect: studentized residuals versus leverage, Cook's distance
- Fix: check coding, dummy out the observation(s), or re-run the model without the observation(s) and compare results
Fox (2008)

Example of influential observations
No influential observations in Lijphart's sample:
- India: high hat-values, but small residuals
- Israel: large residuals, but low hat-values
We find influential observations in the lower-right and upper-right corners (not shown here).
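A hedged sketch of Cook's distance and of the "re-run the model without the observation" check, using NumPy and synthetic data. The names `cooks_distance` and `slope` are hypothetical, and the 4/n cutoff is a common rule of thumb assumed here rather than taken from the slides. The formula multiplies a squared standardized residual (outlierness) by h/(1 - h) (leverage).

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance for each observation in an OLS fit of y on X.

    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    hat = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    s2 = resid @ resid / (n - p)
    # Outlierness (squared standardized residual / p) times leverage
    return (resid**2 / (p * s2 * (1 - hat))) * (hat / (1 - hat))


def slope(x, y):
    """Slope of a simple linear regression of y on x."""
    return np.polyfit(x, y, 1)[0]


# Synthetic illustration (assumed data): the last case is extreme on X
# and does not follow the pattern of the other fifteen.
x = np.append(np.arange(1.0, 16.0), 60.0)
y = 2.0 + 0.5 * x + 0.2 * np.sin(np.arange(x.size))
y[-1] = 2.0

X = np.column_stack([np.ones_like(x), x])
d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 4 / x.size)  # assumed rule-of-thumb cutoff

# Re-run the model without the suspicious case and compare the slopes.
slope_all = slope(x, y)
slope_without = slope(x[:-1], y[:-1])
print(flagged, round(slope_all, 3), round(slope_without, 3))
```

Comparing `slope_all` with `slope_without` is the comparison the slide recommends: if the estimate moves substantially, the observation is influential.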
The infamous butterfly ballot
Wand et al. (2001) show that more than 2,000 Democrats voted for Buchanan in Palm Beach County, a typically Democratic county, because of the butterfly ballot. This type of ballot was used only in this county, and only for election-day voting for president. As a result, George W. Bush, and not Al Gore, won Florida and the presidency.
Kellstedt and Whitten (2013)

What should we do with unusual observations?
- Don't ignore them
- Learn to understand why an observation is unusual
- If it is a coding error, recode it; if that's not possible, drop it
- Learn more about the observation and improve your model
- Transform the data (e.g. log transformation)
More on nonlinear relationships next week

Multicollinearity
Multicollinearity is high correlation between two or more independent variables.
- For example: height and weight; ideology and party ID
- Effect: larger standard errors (and therefore more Type II errors), unstable coefficients
- Detect: variance inflation factor (VIF); a VIF > 10 is problematic
- Fix: not easy, but increasing sample size, centering variables, or combining variables into an index can help
Kellstedt and Whitten (2013: 238)

More articles on influential observations
Fails and Krieckhaus (2010). Colonialism, Property Rights and the Modern World Income Distribution. British Journal of Political Science, 40(3), 487-503. Data: https://sites.google.com/a/oakland.edu/mfails/research/colonialism-property-rights-and-the-modern-world-income-distribution
Wand et al. (2001). The Butterfly Did It: The Aberrant Vote for Buchanan in Palm Beach County, Florida. American Political Science Review, 95(4), 793-810. Data: https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/10389
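The VIF check from the multicollinearity slide can be sketched as follows, assuming NumPy and synthetic predictors. The function name `vif` and the variables `x1`, `x2`, `x3` are hypothetical. Each predictor is regressed on the others, and VIF_j = 1 / (1 - R_j^2).

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X.

    X holds the predictors only (no constant column); a constant is
    added to each auxiliary regression.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[j] = 1 / (1 - r2)
    return out


# Synthetic illustration (assumed data): x2 is almost a copy of x1,
# while x3 is unrelated to both.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)

vifs = vif(np.column_stack([x1, x2, x3]))
problematic = np.flatnonzero(vifs > 10)  # threshold from the slide
print(vifs.round(1), problematic)
```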