Comparing changes

base repository: MPIBR-wijngaardenj/grade-stat-learning
base: master
head repository: MPIBR-anneserl/grade-stat-learning
compare: master
Able to merge. These branches can be automatically merged.
  • 1 commit
  • 2 files changed
  • 1 contributor

Commits on Jul 16, 2018

  1. fa40dcd
Showing with 1,556 additions and 0 deletions.
  1. +37 −0 exercises_answers.txt
  2. +1,519 −0 exercises_test.ipynb
37 changes: 37 additions & 0 deletions exercises_answers.txt
@@ -0,0 +1,37 @@
3.7.1
Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the
linear model.
=>
H_0: b_i = 0. For b_0 = 0 this means that sales are 0 in the absence of any advertising budget, while for the other b_i it means there is no linear (!) relationship between sales and the respective advertising budget, with the other two budgets held fixed.
Conclusion: it is very unlikely that there are no sales in the absence of any budget, and it is also very unlikely that there is no linear relationship between TV advertising and sales or between radio advertising and sales. On the other hand, one cannot reject the hypothesis that there is no linear relationship between newspaper advertising and sales once TV and radio are accounted for.
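
The test behind each p-value can be sketched as follows: divide the coefficient by its standard error and ask how extreme that ratio is under H_0. This is only a rough sketch. It uses a normal approximation instead of the t-distribution (reasonable only for large samples), and the numbers are made up for illustration, not taken from Table 3.4.

```python
from statistics import NormalDist


def two_sided_p_value(coef, std_err):
    """Approximate p-value for H_0: coefficient = 0.

    Assumption of this sketch: the sample is large enough that the
    t-statistic is close to standard normal under H_0.
    """
    t_stat = coef / std_err
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


# Hypothetical numbers for illustration only (NOT the values of Table 3.4).
p_strong = two_sided_p_value(coef=0.05, std_err=0.002)   # large |t|
p_weak = two_sided_p_value(coef=-0.001, std_err=0.006)   # small |t|

print("reject H_0 at 5%:", p_strong < 0.05)
print("reject H_0 at 5%:", p_weak < 0.05)
```

A small p-value means the data are hard to reconcile with "no linear relationship"; a large one only means H_0 cannot be rejected, not that it is true.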

3.7.2.
Carefully explain the differences between the KNN classifier and KNN regression methods.
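=> regression: estimate a function f^(X) which fits the quantitative response Y and allows predictions; KNN predicts f^(x0) as the average response of the K training points nearest to x0,
classifier: find classification boundaries which sort data into different qualitative classes; KNN assigns x0 to the most common class among its K nearest training points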
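
A minimal sketch of the difference, assuming one-dimensional inputs and hypothetical toy data: both methods look up the same K neighbours, and differ only in how they combine them (average vs. majority vote).

```python
from collections import Counter


def k_nearest(train_x, x0, k):
    """Indices of the k training points closest to x0 (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))
    return order[:k]


def knn_regress(train_x, train_y, x0, k):
    """KNN regression: average the neighbours' quantitative responses."""
    idx = k_nearest(train_x, x0, k)
    return sum(train_y[i] for i in idx) / k


def knn_classify(train_x, train_labels, x0, k):
    """KNN classification: majority vote among the neighbours' classes."""
    idx = k_nearest(train_x, x0, k)
    votes = Counter(train_labels[i] for i in idx)
    return votes.most_common(1)[0][0]


# Hypothetical toy data, made up for this sketch.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.1, 1.9, 3.2, 9.8, 11.1, 12.3]
labels = ["low", "low", "low", "high", "high", "high"]

print(knn_regress(xs, ys, x0=2.1, k=3))        # a number
print(knn_classify(xs, labels, x0=10.5, k=3))  # a class label
```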

3.7.3.
a) (assuming statistical significance):
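1) wrong (beta_3 is positive and female=1, so at low GPA females earn more)
2) wrong in general (the female minus male difference is beta_3 + beta_5*GPA = 35 - 10*GPA,
which turns negative for GPA > 3.5)
3) right (since the interaction term GPA*Gender is negative and female=1,
salary rises faster with GPA for males than for females, so males earn more once GPA > 3.5)
4) wrong (see 3)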
b) 50 + 20*4.0 + 0.07*110 + 35*1 + 0.01*4.0*110 - 10*4.0*1 = 137.1 (in thousands of dollars; the beta_3 = 35 term for female=1 must be included)
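c) false: the size of a coefficient depends on the units of the predictors and is not a measure of evidence; evidence is judged from the standard error (t-statistic, p-value). The effect might be small, but with high enough N it could still be significant.
furthermore non-linear interaction terms cannot be ruled out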
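
A small sketch to check a) and b) numerically. The coefficients are an assumption taken from the exercise statement as given in the book (intercept 50, GPA 20, IQ 0.07, Gender 35, GPA*IQ 0.01, GPA*Gender -10, Gender coded 1 for female); they are not listed in this file.

```python
def predicted_salary(gpa, iq, female):
    """Fitted model of exercise 3.7.3, salary in thousands of dollars.

    Coefficients are assumed from the exercise statement:
    50, 20 (GPA), 0.07 (IQ), 35 (Gender), 0.01 (GPA*IQ), -10 (GPA*Gender),
    with Gender = 1 for female and 0 for male.
    """
    return (50 + 20 * gpa + 0.07 * iq + 35 * female
            + 0.01 * gpa * iq - 10 * gpa * female)


# b) female with IQ 110 and GPA 4.0
print(round(predicted_salary(gpa=4.0, iq=110, female=1), 1))

# a) female minus male at fixed GPA and IQ equals 35 - 10 * GPA,
#    so the sign of the gap flips at GPA = 3.5
for gpa in (3.0, 3.5, 4.0):
    gap = predicted_salary(gpa, 110, 1) - predicted_salary(gpa, 110, 0)
    print(gpa, round(gap, 1))
```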

3.7.4
a) training RSS for cubic is smaller (or equal), since additional parameters can only reduce training RSS (at least if it is not zero already)
b) test RSS for cubic is expected to be larger (overfitting: the extra terms fit noise)
c) same as a)
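d) not enough information: test RSS for cubic is smaller if the true relationship is far from linear, but can be larger if it is close to linear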
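
A rough simulation sketch for a) and b), assuming hypothetical truly linear data (y = 1 + 2x + noise) and a hand-written least-squares fit via the normal equations. Training RSS of the cubic fit can never exceed that of the linear fit because the linear model is nested in the cubic one; the test RSS comparison is only expected to favour the linear fit and can go either way in a single draw.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    p = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(p)]
    return solve(xtx, xty)


def rss(coefs, xs, ys):
    """Residual sum of squares of a fitted polynomial on (xs, ys)."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x ** i for i, c in enumerate(coefs))
        total += (y - pred) ** 2
    return total


def simulate(n, rng):
    """Hypothetical data with a truly linear relationship plus noise."""
    xs = [rng.uniform(-2, 2) for _ in range(n)]
    ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = simulate(100, rng)
test_x, test_y = simulate(100, rng)

linear = fit_poly(train_x, train_y, 1)
cubic = fit_poly(train_x, train_y, 3)

print("train RSS:", rss(linear, train_x, train_y), rss(cubic, train_x, train_y))
print("test RSS: ", rss(linear, test_x, test_y), rss(cubic, test_x, test_y))
```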

3.7.9
c) Problem: intercept is negative


Questions:
- how is correlation related to the slope of a simple linear regression? (should it equal the slope when beta_0 is not included?)
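
A small numerical sketch for this question, on hypothetical simulated data. The least-squares slope equals r times (spread of y / spread of x), so slope and correlation coincide when both variables are standardized. Dropping beta_0 alone does not make them equal in general.

```python
import random
from math import sqrt


def centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and products."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def standardize(values):
    """Shift to mean 0 and scale to sample standard deviation 1."""
    m = sum(values) / len(values)
    s = sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / s for v in values]


# Hypothetical data, made up for this sketch.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3 + 0.5 * x + rng.gauss(0, 1) for x in xs]

sxx, syy, sxy = centered_sums(xs, ys)
slope = sxy / sxx                  # least-squares slope with intercept
r = sxy / sqrt(sxx * syy)          # Pearson correlation
rescaled_r = r * sqrt(syy / sxx)   # r times (spread of y / spread of x)

# Slope after standardizing both variables
zxx, zyy, zxy = centered_sums(standardize(xs), standardize(ys))
slope_standardized = zxy / zxx

# Slope of a fit through the origin (no beta_0) on the raw data
slope_no_intercept = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print("slope:", slope, " r * s_y / s_x:", rescaled_r)
print("standardized slope:", slope_standardized, " r:", r)
print("slope without intercept (raw data):", slope_no_intercept)
```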

