# Exercises for the course

Machine Learning 1

Winter semester 2020/21

Abteilung Maschinelles Lernen, Institut für Softwaretechnik und theoretische Informatik, Fakultät IV, Technische Universität Berlin; Prof. Dr. Klaus-Robert Müller; Email: klaus-robert.mueller@tu-berlin.de

Exercise Sheet 4

Exercise 1: Fisher Discriminant (10 + 10 + 10 P)

The objective function to find the Fisher Discriminant has the form

$$\max_{w} \; \frac{w^\top S_B w}{w^\top S_W w}$$

where $S_B = (m_2 - m_1)(m_2 - m_1)^\top$ is the between-class scatter matrix and $S_W$ is the within-class scatter matrix, assumed to be positive definite. Because there are infinitely many solutions (multiplying $w$ by a scalar doesn't change the objective), we can extend the objective with a constraint, e.g. one that enforces $w^\top S_W w = 1$.

(a) Reformulate the problem above as an optimization problem with a quadratic objective and a quadratic constraint.

(b) Show using the method of Lagrange multipliers that the solution of the reformulated problem is also a solution of the generalized eigenvalue problem:

$$S_B w = \lambda S_W w$$

(c) Show that the solution of this optimization problem is equivalent (up to a scaling factor) to

$$w^\star = S_W^{-1}(m_1 - m_2)$$
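The statements in (b) and (c) can be sanity-checked numerically before attempting the proof. The following is a minimal sketch, not a derivation: it assumes NumPy is available, draws hypothetical random class means and a random positive definite within-class scatter matrix (none of these values come from the exercise), and uses illustrative function names. It solves the generalized eigenvalue problem through a Cholesky factorization of the within-class scatter matrix and compares the leading eigenvector with the closed-form direction.

```python
import numpy as np


def fisher_direction_eig(S_B, S_W):
    """Leading solution of S_B w = lambda S_W w, scaled so that w^T S_W w = 1."""
    L = np.linalg.cholesky(S_W)            # S_W = L L^T (requires positive definite S_W)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_B @ L_inv.T              # symmetric matrix with the same eigenvalues
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # unit eigenvector of the largest eigenvalue
    w = L_inv.T @ v                        # map back: w = L^{-T} v, so w^T S_W w = v^T v
    return eigvals[-1], w


def fisher_direction_closed_form(m1, m2, S_W):
    """Closed-form direction S_W^{-1} (m1 - m2), computed without an explicit inverse."""
    return np.linalg.solve(S_W, m1 - m2)


# Hypothetical test data (assumption: random means and scatter, fixed seed).
rng = np.random.default_rng(0)
d = 4
m1, m2 = rng.normal(size=d), rng.normal(size=d)
A = rng.normal(size=(d, d))
S_W = A @ A.T + d * np.eye(d)              # symmetric positive definite by construction
S_B = np.outer(m2 - m1, m2 - m1)           # rank-one between-class scatter

lam, w_eig = fisher_direction_eig(S_B, S_W)
w_cf = fisher_direction_closed_form(m1, m2, S_W)

# Absolute cosine between the two directions; 1 means equal up to a scaling factor.
cosine = abs(w_eig @ w_cf) / (np.linalg.norm(w_eig) * np.linalg.norm(w_cf))
constraint = w_eig @ S_W @ w_eig           # value of the quadratic constraint
objective = (w_eig @ S_B @ w_eig) / constraint

print("cosine:", cosine)
print("constraint value:", constraint)
print("eigenvalue vs. objective:", lam, objective)
```

The absolute value in the cosine reflects the "up to a scaling factor" clause: the eigenvector may come back with either sign, and the closed form uses m1 − m2 while the scatter matrix is built from m2 − m1.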

Exercise 2: Bounding the Error (10 + 10 P)

The direction learned by the Fisher discriminant is equivalent to that of an optimal classifier when the class-conditioned data densities are Gaussian with the same covariance. In this particular setting, we can derive a bound on the classification error which gives us insight into the effect of the mean and covariance parameters on the error.

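Consider two data-generating distributions $P(x \mid \omega_1) = \mathcal{N}(\mu, \Sigma)$ and $P(x \mid \omega_2) = \mathcal{N}(-\mu, \Sigma)$ with $x \in \mathbb{R}^d$. Recall that the Bayes error rate is given by:

$$P(\text{error}) = \int_x P(\text{error} \mid x)\, p(x)\, dx$$

(a) Show that the conditional error can be upper-bounded as:

$$P(\text{error} \mid x) \le \sqrt{P(\omega_1 \mid x)\, P(\omega_2 \mid x)}$$

(b) Show that the Bayes error rate can then be upper-bounded by:

$$P(\text{error}) \le \sqrt{P(\omega_1)\, P(\omega_2)} \cdot \exp\left(-\frac{1}{2}\, \mu^\top \Sigma^{-1} \mu\right)$$

Exercise 3: Fisher Discriminant (10 + 10 P)

Consider the case of two classes $\omega_1$ and $\omega_2$ with associated data-generating probabilities

$$p(x \mid \omega_1) = \mathcal{N}\left(\begin{pmatrix} -1 \\ -1 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\right) \quad \text{and} \quad p(x \mid \omega_2) = \mathcal{N}\left(\begin{pmatrix} +1 \\ +1 \end{pmatrix}, \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\right)$$

(a) Find for this dataset the Fisher discriminant $w$ (i.e. the projection $y = w^\top x$ under which the ratio between inter-class and intra-class variability is maximized).

(b) Find a projection for which the ratio is minimized.

Exercise 4: Programming (30 P)

Download the programming files on ISIS and follow the instructions.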
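As a numerical sanity check for the error bound of Exercise 2(b), the sketch below compares the bound with the exact Bayes error in the special case of equal priors, where the error has the closed form Φ(−r) with r = sqrt(μ⊤Σ⁻¹μ) and Φ the standard normal CDF. The equal-priors restriction, the example values of μ and Σ, and all function names are assumptions made for illustration; they are not part of the exercise, and the check does not replace the proof.

```python
import math

import numpy as np


def mahalanobis_sq(mu, Sigma):
    """Return mu^T Sigma^{-1} mu without forming the inverse explicitly."""
    return float(mu @ np.linalg.solve(Sigma, mu))


def bayes_error_equal_priors(mu, Sigma):
    """Exact Bayes error for N(mu, Sigma) vs. N(-mu, Sigma), assuming equal priors."""
    r = math.sqrt(mahalanobis_sq(mu, Sigma))
    return 0.5 * math.erfc(r / math.sqrt(2.0))   # equals Phi(-r)


def error_upper_bound(mu, Sigma, p1=0.5):
    """Bound sqrt(P(w1) P(w2)) * exp(-0.5 * mu^T Sigma^{-1} mu)."""
    p2 = 1.0 - p1
    return math.sqrt(p1 * p2) * math.exp(-0.5 * mahalanobis_sq(mu, Sigma))


# Hypothetical parameters (assumption: chosen only for illustration).
mu = np.array([1.0, 0.5])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

results = []
for scale in (0.0, 0.5, 1.0, 2.0):
    err = bayes_error_equal_priors(scale * mu, Sigma)
    bound = error_upper_bound(scale * mu, Sigma)
    results.append((scale, err, bound))
    print(f"scale={scale:.1f}  error={err:.4f}  bound={bound:.4f}")
```

Scaling μ moves the class means apart along a fixed direction, so both the error and the bound shrink as the scale grows; at scale 0 the two classes coincide and both quantities equal 1/2.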