Chapter3 - linear regression
MPIBR-anneserl authored Jul 16, 2018
1 parent a494123 commit fa40dcd
Showing 2 changed files with 1,556 additions and 0 deletions.
37 changes: 37 additions & 0 deletions exercises_answers.txt
@@ -0,0 +1,37 @@
3.7.1
Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the
linear model.
=>
H_0: b_i = 0. For b_0 this means that sales are 0 in the absence of any advertising budget; for each of the other b_i it means that there is no linear (!) relationship between sales and the respective advertising budget, with the other two budgets held fixed.
Conclusion: the p-values for the intercept, TV, and radio are very small, so it is very unlikely that there are no sales in the absence of any budget, and also very unlikely that there is no linear relationship between TV advertising and sales or between radio advertising and sales. The p-value for newspaper is large, so one cannot reject the hypothesis that there is no linear relationship between newspaper advertising and sales once TV and radio are accounted for.

3.7.2.
Carefully explain the differences between the KNN classifier and KNN regression methods.
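=> regression: estimate a model function f^(X) for a quantitative response Y; the prediction at a point x0 is the average of the responses of the K nearest training observations.
classifier: assign x0 to a qualitative class; the predicted class is the most common class among the K nearest training observations, which implicitly defines the classification boundaries.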
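
The difference can be sketched in a few lines of Python. This is only an illustration under simple assumptions (Euclidean distance, made-up toy data, hypothetical helper names knn_regress and knn_classify), not a reference implementation: both methods find the same K neighbors and differ only in how the neighbors' responses are combined.

```python
import numpy as np
from collections import Counter

def knn_neighbors(X_train, x0, k):
    """Indices of the k training points closest to x0 (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x0, axis=1)
    return np.argsort(dists)[:k]

def knn_regress(X_train, y_train, x0, k):
    """KNN regression: average the quantitative responses of the neighbors."""
    idx = knn_neighbors(X_train, x0, k)
    return float(np.mean(y_train[idx]))

def knn_classify(X_train, labels, x0, k):
    """KNN classification: majority vote over the neighbors' class labels."""
    idx = knn_neighbors(X_train, x0, k)
    return Counter(labels[i] for i in idx).most_common(1)[0][0]

# Toy data, invented purely for illustration
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 2.0, 3.0, 10.0, 11.0])          # quantitative response
c = np.array(["low", "low", "low", "high", "high"])  # qualitative response

x0 = np.array([1.2])
pred_value = knn_regress(X, y, x0, k=3)   # mean of y over the 3 nearest points
pred_class = knn_classify(X, c, x0, k=3)  # most common label among the same points
```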

3.7.3.
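a) (assuming statistical significance) at fixed IQ and GPA the difference female minus male is beta_3 + beta_5*GPA = 35 - 10*GPA:
1) wrong (beta_3 is positive and female=1, so for GPA < 3.5 females earn more)
2) wrong as a general statement (it only holds for GPA < 3.5, see 3)
3) right (since the interaction term GPA*Gender is negative and female=1,
it means that with increasing GPA salary rises faster for males than for females; males earn more once GPA > 3.5)
4) wrong (see 3)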
b) 50 + 4*20 + 110*0.07 + 35*1 + 0.01*110*4.0 - 10*4*1 = 137.1 (the beta_3*Gender term, 35, must be included since female=1)
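c) false: the size of a coefficient depends on the scale of the variables (GPA*IQ takes large values), so a small coefficient does not by itself mean weak evidence. Evidence for the interaction is judged by its standard error and p-value; with high enough N even a small effect can be significant.
furthermore non-linear interaction terms cannot be ruled out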
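
The arithmetic for a) and b) can be checked with a short script. This is a sketch that assumes the coefficient values from the textbook's exercise statement (50, 20, 0.07, 35, 0.01, -10, salary in thousands of dollars); the function name predicted_salary is made up for illustration.

```python
# Coefficients assumed from the exercise statement (salary in thousands of dollars)
B0, B_GPA, B_IQ, B_GENDER, B_GPA_IQ, B_GPA_GENDER = 50, 20, 0.07, 35, 0.01, -10

def predicted_salary(gpa, iq, female):
    """Fitted model with GPA*IQ and GPA*Gender interactions (female is 1 or 0)."""
    return (B0
            + B_GPA * gpa
            + B_IQ * iq
            + B_GENDER * female
            + B_GPA_IQ * gpa * iq
            + B_GPA_GENDER * gpa * female)

# Part b): female, IQ 110, GPA 4.0
salary_b = predicted_salary(gpa=4.0, iq=110, female=1)

# Part a): female-minus-male gap at fixed IQ is 35 - 10*GPA,
# so it changes sign at GPA = 3.5
gap_low = predicted_salary(3.0, 110, 1) - predicted_salary(3.0, 110, 0)
gap_high = predicted_salary(4.0, 110, 1) - predicted_salary(4.0, 110, 0)
print(round(salary_b, 1), round(gap_low, 1), round(gap_high, 1))
```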

3.7.4
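a) training RSS for cubic is smaller (or at most equal) since additional parameters can only reduce the training RSS (at least if it is not zero already)
b) test RSS for cubic is expected to be larger (overfitting: the extra terms fit noise), although this is not guaranteed for a single test set
c) same as a)
d) not enough information: if the true relationship is close to linear, the linear fit may have the smaller test RSS; if it is strongly non-linear, the cubic fit will likely have the smaller test RSS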
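
A small simulation illustrates a) and b). It is only a sketch under assumed settings (a made-up linear truth y = 1 + 2x plus Gaussian noise, n = 100, fits via numpy.polyfit); the training comparison always holds because the linear model is nested in the cubic one, while the test comparison depends on the random draw.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    """Draw n points from an assumed, truly linear relationship plus noise."""
    x = rng.uniform(-2, 2, size=n)
    y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)
    return x, y

def rss(coefs, x, y):
    """Residual sum of squares of a fitted polynomial on the data (x, y)."""
    resid = y - np.polyval(coefs, x)
    return float(np.sum(resid ** 2))

x_train, y_train = simulate(100)
x_test, y_test = simulate(100)

# Least-squares fits of degree 1 (linear) and degree 3 (cubic)
fit_linear = np.polyfit(x_train, y_train, 1)
fit_cubic = np.polyfit(x_train, y_train, 3)

rss_linear_train = rss(fit_linear, x_train, y_train)
rss_cubic_train = rss(fit_cubic, x_train, y_train)
rss_linear_test = rss(fit_linear, x_test, y_test)
rss_cubic_test = rss(fit_cubic, x_test, y_test)

print("train:", rss_linear_train, rss_cubic_train)
print("test: ", rss_linear_test, rss_cubic_test)
```

Rerunning with different seeds shows that the cubic training RSS never exceeds the linear one, whereas the ordering of the two test RSS values can vary from draw to draw.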

3.7.9
c) Problem: intercept is negative


Questions:
- How is the correlation related to the slope of a simple linear regression? (Should it equal the slope when beta_0 is not included?)
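
One way to explore this question numerically: for simple linear regression with an intercept, the least-squares slope equals r * sd(y) / sd(x), so the slope and the correlation coincide when x and y are both standardized. The sketch below checks this on simulated data; the data-generating values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 3.0 + 0.5 * x + rng.normal(size=200)  # arbitrary illustrative model

# Least-squares fit with intercept: polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)

# Sample correlation and the slope implied by it
r = np.corrcoef(x, y)[0, 1]
slope_from_r = r * np.std(y, ddof=1) / np.std(x, ddof=1)

# After standardizing both variables the fitted slope is r itself
zx = (x - x.mean()) / np.std(x, ddof=1)
zy = (y - y.mean()) / np.std(y, ddof=1)
slope_standardized = np.polyfit(zx, zy, 1)[0]

print(slope, slope_from_r, r, slope_standardized)
```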

