# Machine Learning 1 TU Berlin, WiSe 2020/21
## Maximum Likelihood Parameter Estimation
In this first exercise, we would like to use the maximum-likelihood method to estimate the best parameter of a data density model p(x|θ) with respect to some dataset D = (x1,…,xN), and use that approach to build a classifier. Assuming the data is generated independently and identically distributed (iid.), the dataset likelihood is given by
$$p(\mathcal{D}|\theta) = \prod_{k=1}^{N} p(x_k|\theta)$$

and the maximum likelihood solution is then computed as

$$\hat{\theta} = \arg\max_\theta \; p(\mathcal{D}|\theta) = \arg\max_\theta \; \log p(\mathcal{D}|\theta)$$

where the log term can also be expressed as a sum, i.e.

$$\log p(\mathcal{D}|\theta) = \sum_{k=1}^{N} \log p(x_k|\theta).$$
As a first step, we load some useful libraries for numerical computations and plotting.
In [1]:
```python
import numpy
import matplotlib
%matplotlib inline
from matplotlib import pyplot as plt
na = numpy.newaxis
```
We now consider the univariate data density model
$$p(x|\theta) = \frac{1}{\pi} \cdot \frac{1}{1 + (x - \theta)^2}$$
also known as the Cauchy distribution with fixed parameter γ = 1, and with parameter θ unknown. Compared to the Gaussian distribution, the Cauchy distribution is heavy-tailed, and this can be useful to handle the presence of outliers in the data generation process. The probability density function is implemented below.
In [2]:
```python
def pdf(X, THETA):
    return (1.0 / numpy.pi) * (1.0 / (1 + (X - THETA)**2))
```
Note that the function can be called with scalars or with numpy arrays, and if arrays of different shapes are passed in, numpy broadcasting rules will apply. Our first step will be to implement a function that estimates the optimal parameter θˆ in the maximum likelihood sense for some dataset D.
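As a quick sanity check, the broadcasting behavior can be verified by evaluating the density on several data points and several candidate parameters at once (the array shapes below are chosen purely for illustration):

```python
import numpy

def pdf(X, THETA):
    return (1.0 / numpy.pi) * (1.0 / (1 + (X - THETA)**2))

X = numpy.array([0.0, 1.0, 2.0])       # three data points, shape (3,)
THETA = numpy.array([[0.0], [1.0]])    # two candidate parameters, shape (2, 1)

# Broadcasting yields one density value per (parameter, data point) pair.
P = pdf(X, THETA)
print(P.shape)        # (2, 3)
print(pdf(0.0, 0.0))  # peak density of the Cauchy distribution: 1/pi
```

The scalar call confirms that the mode of the distribution has density 1/π ≈ 0.318, as expected from the formula above.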
• Implement a function that takes a dataset D as input (given as a one-dimensional array of numbers) and a list of candidate parameters θ (also given as a one-dimensional array), and returns a one-dimensional array containing the log-likelihood w.r.t. the dataset D for each parameter θ.
In [3]:
```python
def ll(D, THETA):
    # --------------------------------------
    # TODO: replace by your code
    # --------------------------------------
    import solution; return solution.ll(D, THETA)
    # --------------------------------------
```
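One possible implementation of `ll` (a sketch only, not necessarily the reference `solution` module's implementation) sums the log-densities over the dataset for each candidate parameter, using the same broadcasting rules discussed above:

```python
import numpy

def pdf(X, THETA):
    return (1.0 / numpy.pi) * (1.0 / (1 + (X - THETA)**2))

def ll(D, THETA):
    # D has shape (N,); THETA[:, newaxis] has shape (T, 1).
    # Broadcasting makes pdf(...) of shape (T, N); summing the
    # log-densities over axis 1 gives one log-likelihood value
    # per candidate parameter.
    return numpy.log(pdf(D[numpy.newaxis, :], THETA[:, numpy.newaxis])).sum(axis=1)
```

Taking the log before summing (rather than multiplying raw densities) avoids numerical underflow for larger datasets.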
To test the method, we apply it to some dataset, and plot the log-likelihood for some plausible range of parameters θ.
In [4]:
```python
D = numpy.array([2.803, -1.563, -0.853, 2.212, -0.334, 2.503])
THETA = numpy.linspace(-10, 10, 1001)
plt.grid(True)
plt.plot(THETA, ll(D, THETA))
plt.xlabel(r'$\theta$')
plt.ylabel(r'$\log p(\mathcal{D}|\theta)$')
plt.show()
```
We observe that the log-likelihood has two peaks: one around θ = −0.5 and one around θ = 2. The second peak is the higher of the two, hence it is retained as the maximum likelihood solution.
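The maximum likelihood estimate itself can then be read off the grid of candidate parameters (a sketch, assuming an `ll` implementation as above and a sufficiently fine grid):

```python
import numpy

def pdf(X, THETA):
    return (1.0 / numpy.pi) * (1.0 / (1 + (X - THETA)**2))

def ll(D, THETA):
    return numpy.log(pdf(D[None, :], THETA[:, None])).sum(axis=1)

D = numpy.array([2.803, -1.563, -0.853, 2.212, -0.334, 2.503])
THETA = numpy.linspace(-10, 10, 1001)

# Pick the grid point with the highest log-likelihood.
theta_hat = THETA[numpy.argmax(ll(D, THETA))]
print(theta_hat)  # lies near the second (higher) peak around theta = 2
```

A grid search is crude but sufficient here; the Cauchy likelihood is multimodal, so a local optimizer started near the wrong peak could converge to the inferior mode.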
## Building a Classifier
We would now like to use the maximum likelihood technique to build a classifier. We consider a labeled dataset where the data associated with the two classes are given by:
In [5]:
```python
D1 = numpy.array([ 2.803, -1.563, -0.853, 2.212, -0.334, 2.503])
D2 = numpy.array([-4.510, -3.316, -3.050, -3.108, -2.315])
```
To be able to classify new data points, we consider the discriminant function

$$g(x) = \log p(x|\hat{\theta}_1) - \log p(x|\hat{\theta}_2) + \log P(\omega_1) - \log P(\omega_2)$$
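A sketch of the resulting classifier is given below, assuming equal class priors P(ω1) = P(ω2) = 0.5 (an assumption made here for illustration; the exercise may specify different priors) and grid-based maximum likelihood estimates for both classes:

```python
import numpy

def pdf(X, THETA):
    return (1.0 / numpy.pi) * (1.0 / (1 + (X - THETA)**2))

def ll(D, THETA):
    return numpy.log(pdf(D[None, :], THETA[:, None])).sum(axis=1)

D1 = numpy.array([ 2.803, -1.563, -0.853, 2.212, -0.334, 2.503])
D2 = numpy.array([-4.510, -3.316, -3.050, -3.108, -2.315])
THETA = numpy.linspace(-10, 10, 1001)

# Per-class maximum likelihood estimates over the parameter grid.
theta1 = THETA[numpy.argmax(ll(D1, THETA))]
theta2 = THETA[numpy.argmax(ll(D2, THETA))]

def g(x, prior1=0.5, prior2=0.5):
    # g(x) > 0 -> assign x to class omega_1; g(x) < 0 -> class omega_2.
    return (numpy.log(pdf(x, theta1)) - numpy.log(pdf(x, theta2))
            + numpy.log(prior1) - numpy.log(prior2))
```

With equal priors the last two terms cancel, and the decision reduces to comparing which class-conditional density assigns the higher log-probability to x.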