Instructions

• Assignment needs to be turned in as Rmarkdown, and as html, to moodle. That is, two files need to be submitted.
• You need to list your team members on the report. For each of the four assignments, one team member needs to be nominated as the leader, and is responsible for coordinating the efforts of other team members, and submitting the assignment.
• It is strongly recommended that you individually complete the assignment, and then compare your answers and explanations with your team mates. Each student will have the opportunity to report on other team member’s efforts on the assignment, and if a member does not substantially contribute to the team submission they may get a reduced mark, or even a zero mark.
• R code should be hidden in the final report, unless it is specifically requested.
• Original work is expected. Any material used from external sources needs to be acknowledged.
• To make it a little easier for you, a skeleton of R code is provided in the Rmd file. Where you see ??? means that something is missing and you will need to fill it in with the appropriate function, argument or operator. You will also need to rearrange the code as necessary to do the calculations needed.

Marks

• Total mark will be out or 25
• 5 points will be reserved for readability, and appropriate citing of external sources
• 5 points will be reserved for reproducibility, that the report can be re-generated from the submitted Rmarkdown.
• Accuracy and completeness of answers, and clarity of explanations will be the basis for the remaining 15 points.

Exercises

1. Here we explore the maximal margin classifier on a toy data set.
1. We are given $$n = 7$$ observations in $$p = 2$$ dimensions. For each observation, there is an associated class label. Sketch the observations.
2. Sketch the optimal separating hyperplane, and provide the equation for this hyperplane in the form of textbook equation 9.1.
3. Write the classification rule for the maximal margin classifier.
4. On your sketch, indicate the margin for the maximal margin hyperplane.
5. Indicate the support vectors for the maximal margin classifier.
6. Argue that a slight movement of observation 4 would not affect the maximal margin hyperplane.
7. Sketch a separating hyperplane that is not the optimal separating hyper- plane, and provide the equation for this hyperplane.
8. How would the separating hyperplane change if an 8th observation (8, 2, 2.5, 1) as added to the data? List the support vectors, and write the equation of the plane.
1. Bagging and boosting on the ISLR caravan data set. Questions 1-4 from lab 8.

2. Now scramble the response variable (Purchase) using permutation. The resulting data has no true relationship between the response and predictors. Re-do Q2 with this data set. Write a paragraph explaining what you learn about the true data from analysing this permuted data.