2. The data set is from the first year that SAT scores were published on a state-by-state basis in the U.S. It was originally published in the Harvard Educational Review in 1984, and is also reported in Ramsey and Schafer, 1997. The variables included are:
sat = average total SAT score for the state
takers = percent of eligible students in the state who took the exam
income = the median family income of students in the state who took the exam
years = the average number of years that the test-takers had for studies in the core subjects
public = percentage of test-takers attending public secondary schools
expend = the state's expenditures on education, in hundreds of dollars per student
rank = the median percentile ranking of the test-takers in their high-school class
Perform a multiple regression to predict the average SAT score in the state from the other variables, and answer the following questions. Present copies of the relevant portions of the R output, and for each question indicate which portion of the output you used and how you used it.
e) Which two states have the greatest effect on the estimated regression line? How would you classify their effect?
f) What subset of independent variables gives the best model for this problem?
g) How would you respond to someone who claimed that the variables not included in the model in part f did not affect the average SAT score of the states?
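One possible way to set up the analysis in R is sketched below. It is not the expected solution, only a starting point under stated assumptions: the data are assumed to sit in a data frame with one row per state, a hypothetical `state` column holding the state names, and the seven variables named exactly as in the list above. The function name `analyze_sat` and the file name `sat.csv` are illustrative. Cook's distance is used here as one common measure of influence for part e, and backward elimination by AIC via `step()` is used as one of several reasonable selection criteria for part f; the course may intend a different criterion.

```r
# Sketch only. `sat_data` is assumed to be a data frame with one row per state,
# a hypothetical `state` column, and the variables listed in the problem.
analyze_sat <- function(sat_data) {
  # Label rows by state so that diagnostics are reported by state name.
  rownames(sat_data) <- sat_data$state

  # Full model: average SAT score regressed on all six predictors.
  full_fit <- lm(sat ~ takers + income + years + public + expend + rank,
                 data = sat_data)

  # Part e: Cook's distance measures how much the fitted values change when
  # one state is dropped; leverage (hat values) helps classify the effect.
  cooks <- cooks.distance(full_fit)
  leverage <- hatvalues(full_fit)
  most_influential <- names(sort(cooks, decreasing = TRUE))[1:2]

  # Part f: backward elimination by AIC (an assumed choice of criterion).
  reduced_fit <- step(full_fit, direction = "backward", trace = 0)

  list(full_fit = full_fit,
       cooks = cooks,
       leverage = leverage,
       most_influential = most_influential,
       reduced_fit = reduced_fit)
}

# Example use ("sat.csv" is a hypothetical file name):
# result <- analyze_sat(read.csv("sat.csv"))
# summary(result$full_fit)      # coefficient table, R-squared, F test
# result$most_influential       # two states with the largest Cook's distance
# summary(result$reduced_fit)   # model kept after backward elimination
```

For part g, note that a variable dropped by the selection procedure has not been shown to have no effect. When predictors are correlated with one another, a dropped variable may simply add little once the retained variables are already in the model.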