Exercises for the course
Machine Learning 1
Winter semester 2021/22
Institut für Softwaretechnik und Theoretische Informatik, Fakultät IV, Technische Universität Berlin, Prof. Dr. Klaus-Robert Müller
Exercise Sheet 14
Exercise 1: Class Prototypes (25 P)
Consider the linear model f(x) = w⊤x + b mapping some input x to an output f(x). We would like to interpret the function f by building a prototype x⋆ in the input domain that produces a large value f(x). Activation maximization produces such an interpretation by optimizing
$$\max_{x} \; f(x) + \Omega(x).$$
Find the prototype x⋆ obtained by activation maximization subject to Ω(x) = log p(x) with x ∼ N(μ, Σ), where μ and Σ are the mean and covariance.
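A numerical sanity check can help when verifying a hand-derived prototype. The sketch below runs plain gradient ascent on f(x) + log p(x) for a two-dimensional Gaussian and compares the result with the point where the gradient w − Σ⁻¹(x − μ) vanishes. The values of `mu`, `Sigma`, `w`, the step size and the iteration count are arbitrary assumptions chosen for illustration and are not part of the exercise.

```python
import numpy as np

# Hypothetical example values (not taken from the exercise sheet).
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
w = np.array([0.5, 2.0])

Sigma_inv = np.linalg.inv(Sigma)


def grad(x):
    """Gradient of w.x + b + log N(x; mu, Sigma) with respect to x.

    The bias b and the Gaussian normalization constant do not depend on x,
    so they drop out of the gradient.
    """
    return w - Sigma_inv @ (x - mu)


# Plain gradient ascent, started at the mean of the Gaussian.
x = mu.copy()
for _ in range(500):
    x = x + 0.5 * grad(x)

# Point at which the gradient vanishes: w - Sigma^{-1}(x - mu) = 0.
x_stationary = mu + Sigma @ w

print("gradient ascent :", x)
print("stationary point:", x_stationary)
```

Because the objective is concave in x (a linear term plus a negative-definite quadratic), the stationary point is the unique maximizer, so the two printed vectors should agree up to numerical precision.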
Exercise 2: Shapley Values (25 P)
Consider the function f(x) = min(x1, max(x2, x3)). Compute the Shapley values φ1, φ2, φ3 for the prediction f(x) with x = (1, 1, 1). (We assume a reference point x̃ = 0, i.e. we set features to zero when removing them from the coalition.)
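A hand computation of Shapley values can be cross-checked by brute force. The sketch below averages the marginal contribution of each feature over all orderings of the features, replacing absent features by the reference value zero as described above. The function name `shapley_values` and its signature are illustrative assumptions, not part of the exercise.

```python
from fractions import Fraction
from itertools import permutations


def f(x):
    """The function from the exercise: f(x) = min(x1, max(x2, x3))."""
    x1, x2, x3 = x
    return min(x1, max(x2, x3))


def shapley_values(func, x, baseline):
    """Exact Shapley values by enumerating all feature orderings.

    Features not yet in the coalition are set to their baseline value.
    The cost grows as n!, so this is only practical for a few features.
    """
    n = len(x)
    phi = [Fraction(0)] * n
    orders = list(permutations(range(n)))
    for order in orders:
        z = list(baseline)          # start from the empty coalition
        previous = func(z)
        for i in order:
            z[i] = x[i]             # add feature i to the coalition
            current = func(z)
            phi[i] += Fraction(current - previous)
            previous = current
    return [p / len(orders) for p in phi]


phi = shapley_values(f, (1, 1, 1), (0, 0, 0))
print(phi)
print("sum of Shapley values:", sum(phi))
```

By the efficiency property, the values should sum to f(x) − f(x̃), which gives a quick consistency check on any manual result.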
Exercise 3: (25 P)
Consider the simple radial basis function
$$f(x) = \|x - \mu\| - \theta$$
with θ > 0. For the purpose of extracting an explanation, we would like to build a first-order Taylor expansion of the function at some root point x̃. We choose this root point to lie on the segment connecting μ and x (we assume that f(x) > 0 so that there is always a root point on this segment).
Show that the first-order terms of the Taylor expansion are given by
$$\phi_i = \frac{(x_i - \mu_i)^2}{\|x - \mu\|^2} \cdot \left(\|x - \mu\| - \theta\right)$$
Exercise 4: Layer-Wise Relevance Propagation (25 P)
We would like to test the dependence of layer-wise relevance propagation (LRP) on the structure of the neural network. For this, we consider the function y = max(x1,x2), where x1,x2 ∈ R+ are the input activations. This function can be implemented as a ReLU network in multiple ways. Three examples are given below.
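[Figure: diagrams of the three ReLU networks implementing y = max(x1, x2), with inputs x1, x2, hidden neurons a3, a4, ... and output yout. The connection weights are not recoverable from the extracted text.]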
We consider the propagation rule:
$$R_j = \sum_k \frac{a_j w_{jk}^+}{\sum_j a_j w_{jk}^+} R_k$$
where j and k are indices for two consecutive layers and where (·)+ denotes the positive part. This propagation rule is applied to both layers.
Give for each network the computational steps that lead to the scores R1 and R2, and the obtained relevance values. More specifically, express R1 and R2 as a function of R3 and R4 (and R5), and express the latter relevances as a function of Rout = y.
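The propagation rule used in this exercise (relevance redistributed in proportion to a_j times the positive part of w_jk) can be sketched in a few lines of NumPy. The network below is a hypothetical implementation of max(x1, x2) as x2 + ReLU(x1 − x2); it is an assumption made for illustration and not necessarily one of the three networks in the figure. The helper name `lrp_zplus` and the small stabilizer `eps`, which only guards against division by zero, are also assumptions.

```python
import numpy as np


def lrp_zplus(a, W, R_upper, eps=1e-12):
    """Redistribute relevance from layer k to layer j.

    Implements R_j = sum_k a_j w_jk^+ / (sum_j a_j w_jk^+) * R_k.
    a:       activations of the lower layer, shape (J,)
    W:       weights from lower to upper layer, shape (J, K)
    R_upper: relevance of the upper layer, shape (K,)
    """
    W_pos = np.maximum(W, 0.0)       # positive part of the weights
    z = a @ W_pos                    # denominators, one per upper neuron
    s = R_upper / (z + eps)          # eps avoids 0/0 for inactive neurons
    return a * (W_pos @ s)


# Hypothetical network: a3 = ReLU(x1 - x2), a4 = ReLU(x2), y = a3 + a4.
W1 = np.array([[1.0, 0.0],
               [-1.0, 1.0]])         # rows: x1, x2; columns: a3, a4
W2 = np.array([[1.0],
               [1.0]])               # rows: a3, a4; column: yout

x = np.array([3.0, 1.0])             # non-negative input activations
a_hidden = np.maximum(x @ W1, 0.0)   # forward pass, hidden layer
y = a_hidden @ W2                    # forward pass, output (equals max(x))

R_hidden = lrp_zplus(a_hidden, W2, y)    # (R3, R4) from Rout = y
R_input = lrp_zplus(x, W1, R_hidden)     # (R1, R2) from (R3, R4)

print("y       :", y)
print("R3, R4  :", R_hidden)
print("R1, R2  :", R_input)
```

Swapping in other weight matrices that implement the same function makes it easy to see how the input relevances depend on the network structure, while their sum stays equal to y as long as every neuron that receives relevance has a non-zero denominator.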