# CSC 311: Introduction to Machine Learning

CSC 311: Introduction to Machine Learning
Lecture 3 – Linear Classification
Based on slides by Amir-massoud Farahmand & Emad A.M. Andrews
Intro ML (UofT) CSC311-Lec3 1 / 39

Last class, we discussed linear regression, and used a modular approach to machine learning algorithm design:

- choose a model: y = f(x) = w⊤x + b
- choose a loss: L(y, t) = (1/2)(y − t)²
- formulate an optimization problem: minimize ∑_{i=1}^N L(y(i), t(i)) with respect to (w, b)
- solve the minimization problem using one of two strategies
  - direct solution (set derivatives to zero)
  - iterative solution (gradient descent)
- vectorize the algorithm, i.e. represent it in terms of linear algebra
- make a linear model more powerful using feature expansion
- improve generalization by adding a regularizer
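As a refresher, the recipe above can be sketched in a few lines of NumPy. This is a hypothetical toy example (not from the lecture): fit y = w⊤x + b to synthetic 1-D data using the direct solution, obtained by setting the derivatives of the summed squared-error loss to zero.

```python
import numpy as np

# Toy data: t = 3*x - 1 plus a little noise (synthetic, for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
t = 3.0 * X[:, 0] - 1.0 + 0.01 * rng.standard_normal(50)

# Vectorize: absorb the bias b into a weight on a dummy feature of 1s.
Xb = np.hstack([X, np.ones((len(X), 1))])

# Direct solution: setting the gradient of sum_i (1/2)(y(i) - t(i))^2
# to zero gives the normal equations  (Xb^T Xb) w = Xb^T t.
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ t)
print(w)  # close to [3.0, -1.0]
```

The same pattern (model, loss, optimization problem, solver) recurs for classification below, but with a discrete target.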

Classification Setup

Classification: predicting a discrete-valued target
- Binary classification: predicting a binary-valued target

Recall the notation:
- Training data: (x(1), t(1)), (x(2), t(2)), …, (x(N), t(N))
- x(i) are the inputs
- t(i) are the (discrete-valued) targets

Example tasks:
- predict whether a patient has a disease, given the presence or absence of various symptoms
- classify e-mails as spam or non-spam
- predict whether a financial transaction is fraudulent

The Binary Linear Classification Model

classification: predict a discrete-valued target
binary: predict a binary target t ∈ {0, 1}
- Training examples with t = 1 are called positive examples, and training examples with t = 0 are called negative examples.
- The choice t ∈ {0, 1} (or t ∈ {−1, +1}) is for computational convenience.

linear: the model is a linear function of x, followed by a threshold r:

    z = w⊤x + b
    y = 1 if z ≥ r,  y = 0 if z < r

We can simplify: absorb the threshold into the bias (replace b with b − r, so that r = 0), then absorb the bias into the weights by adding a dummy feature x0 = 1 whose weight w0 plays the role of b.

Example: the NOT function. This is our “training set”:

    x0 x1 | t
     1  0 | 1
     1  1 | 0

What conditions are needed on w0, w1 to classify all examples?
- When x1 = 0, need: z = w0x0 + w1x1 > 0 ⇐⇒ w0 > 0
- When x1 = 1, need: z = w0x0 + w1x1 < 0 ⇐⇒ w0 + w1 < 0

Example solution: w0 = 1, w1 = −2. Is this the only solution?

Example: the AND function.

    x0 x1 x2 | t
     1  0  0 | 0
     1  0  1 | 0
     1  1  0 | 0
     1  1  1 | 1

    z = w0x0 + w1x1 + w2x2

- need: w0 < 0
- need: w0 + w2 < 0
- need: w0 + w1 < 0
- need: w0 + w1 + w2 > 0

Example solution: w0 = −1.5, w1 = 1, w2 = 1

The Geometric Picture

Input Space (Data Space) for the NOT example

    x0 x1 | t
     1  0 | 1
     1  1 | 0

- Training examples are points.
- Weights (hypotheses) w can be represented by half-spaces
  H+ = {x : w⊤x ≥ 0},  H− = {x : w⊤x < 0}
  - The boundaries of these half-spaces pass through the origin (why?)
- The boundary is the decision boundary: {x : w⊤x = 0}
  - In 2-D it is a line, but think of it as a hyperplane.
- If the training examples can be perfectly separated by a linear decision rule, we say the data is linearly separable.

Weight Space

- Weights (hypotheses) w are points.
- Each training example x specifies a half-space w must lie in to be correctly classified: w⊤x > 0 if t = 1.
- For the NOT example:
  - x0 = 1, x1 = 0, t = 1 =⇒ (w0, w1) ∈ {w : w0 > 0}
  - x0 = 1, x1 = 1, t = 0 =⇒ (w0, w1) ∈ {w : w0 + w1 < 0}
- The region satisfying all the constraints is the feasible region; if this region is nonempty, the problem is feasible, otherwise it is infeasible.

The AND example requires three dimensions, including the dummy one. To visualize data space and weight space for a 3-D example, we can look at a 2-D slice; the visualizations are similar.
- The feasible set will always have a corner at the origin.

Visualizations of the AND example:

Data Space
- slice for x0 = 1
- example solution: w0 = −1.5, w1 = 1, w2 = 1
- decision boundary: w0x0 + w1x1 + w2x2 = 0 =⇒ −1.5 + x1 + x2 = 0

Weight Space
- slice for w0 = −1.5 for the constraints
  - w0 + w1 < 0
  - w0 + w1 + w2 > 0
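The weight-space picture can also be probed numerically. A small sketch (not from the slides): sample (w0, w1) on a grid and test the NOT example's two half-space constraints, w0 > 0 and w0 + w1 < 0.

```python
import numpy as np

# Grid over a square patch of weight space for the NOT example.
w0, w1 = np.meshgrid(np.linspace(-3, 3, 121), np.linspace(-3, 3, 121))

# Intersection of the two half-spaces carved out by the training cases.
feasible = (w0 > 0) & (w0 + w1 < 0)

print(feasible.any())    # True: the feasible region (a wedge) is nonempty
print(feasible.mean())   # fraction of the sampled square it covers

# The slide's example solution lies inside it.
w = np.array([1.0, -2.0])
print(w[0] > 0 and w[0] + w[1] < 0)  # True
```

Both boundary lines pass through the origin, which is why the feasible wedge has its corner there.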

The Geometric Picture
Some datasets are not linearly separable, e.g. XOR
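A short numerical illustration (my sketch, not from the slides) of why XOR has no feasible weights: with the dummy feature, the two positive inputs and the two negative inputs have the same vector sum, (1,0,1) + (1,1,0) = (1,0,0) + (1,1,1) = (2,1,1). So for any w, z01 + z10 = z00 + z11, while XOR would need the left side positive and the right side negative.

```python
import numpy as np

# For random weight vectors, check the identity that dooms XOR:
# z(1,0,1) + z(1,1,0) == z(1,0,0) + z(1,1,1), since the input sums match.
rng = np.random.default_rng(0)
for _ in range(5):
    w = rng.standard_normal(3)
    z = lambda x: w @ np.array(x, dtype=float)
    lhs = z([1, 0, 1]) + z([1, 1, 0])   # positive examples (t = 1)
    rhs = z([1, 0, 0]) + z([1, 1, 1])   # negative examples (t = 0)
    assert np.isclose(lhs, rhs)
print("z01 + z10 == z00 + z11 for every w; XOR's constraints are infeasible")
```

This is the classic argument that a single linear threshold unit cannot compute XOR; feature expansion (e.g. adding x1·x2) restores separability.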

Summary: binary linear classifiers

Binary Linear Classifiers. Targets t ∈ {0, 1}

    z = w⊤x + b