
Model Selection

Bowei Chen

School of Computer Science

University of Lincoln

CMP3036M/CMP9063M Data Science

Today’s Objectives

• Basic Setup of the Learning from Data

• Cross-Validation Methods

• Appendix A: Testing-Based/Stepwise Procedures

• Appendix B: Criterion-Based Procedures

Limitation of Linear Regression

Housing dataset

      Price   Fullbase
 1      420      1
 2      385      0
 3      495      0
 4      605      0
 5      610      0
 6      660      1
 7      660      1
 8      690      0
 9      838      1
10      885      0
 …        …      …

Column annotations on the slide: Response variable, Predictor

Logit Function and Odds Ratio

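The logit function of $p$, where $p$ is strictly between 0 and 1, can be expressed as

$$\operatorname{logit}(p) = \log\frac{p}{1-p} = \log p - \log(1-p).$$

The ratio $\frac{p}{1-p}$ is called the odds, so the logit is the log-odds. (Strictly speaking, an odds ratio is the ratio of two such odds.)

As $p \to 0$, $\operatorname{logit}(p) \to -\infty$.

As $p \to 1$, $\operatorname{logit}(p) \to \infty$.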
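
The definition above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `odds` and `logit` are illustrative choices, not something the slides prescribe.

```python
import math


def odds(p: float) -> float:
    """Return the odds p / (1 - p) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Return the log-odds, computed as log(p) - log(1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p) - math.log(1.0 - p)


# The logit is 0 at p = 0.5 and grows without bound in magnitude
# as p approaches either end of the interval.
for p in (0.001, 0.25, 0.5, 0.75, 0.999):
    print(f"p={p:<6} odds={odds(p):10.4f} logit={logit(p):8.4f}")
```

The endpoints 0 and 1 are rejected here because the logit is not finite there; it only tends to $-\infty$ or $\infty$ in the limit.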

Logistic Function

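The logit function is the inverse of the logistic function. If we let $\alpha = \operatorname{logit}(p)$, then

$$\operatorname{logistic}(\alpha) = \operatorname{logit}^{-1}(\alpha) = \frac{1}{e^{-\alpha} + 1} = \frac{e^{\alpha}}{1 + e^{\alpha}} = p.$$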

[Figure labels: $\frac{e^{\alpha}}{1 + e^{\alpha}}$ (logistic) and $\log\frac{p}{1 - p}$ (logit)]
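
The inverse relationship between the two functions can be illustrated with a short round trip. This is a sketch under the assumption that a standard-library implementation is enough; the branch on the sign of `alpha` is an added numerical-stability detail, not part of the slides.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logistic(alpha: float) -> float:
    """Return 1 / (1 + exp(-alpha)), which equals exp(alpha) / (1 + exp(alpha)).

    Each branch only exponentiates a non-positive number, so exp() cannot overflow.
    """
    if alpha >= 0:
        return 1.0 / (1.0 + math.exp(-alpha))
    z = math.exp(alpha)
    return z / (1.0 + z)


# Applying logistic to logit(p) should recover p (up to rounding).
for p in (0.1, 0.5, 0.9):
    alpha = logit(p)
    print(f"p={p}  logit(p)={alpha:+.4f}  logistic(logit(p))={logistic(alpha):.4f}")
```

Both algebraic forms give the same value; choosing between them by the sign of `alpha` simply avoids evaluating `exp` on a large positive argument.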

Simple Logistic Regression

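The logit of the underlying probability $p_i$ is a linear function of the predictor $x_i$:

$$\operatorname{logit}(p_i) = \beta_0 + \beta_1 x_i,$$

then

$$p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}} = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}.$$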
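
To make the formula concrete, the sketch below evaluates $p_i$ for a few predictor values. The coefficients `beta0 = -1.0` and `beta1 = 0.5` and the `x` values are hypothetical numbers chosen purely for illustration; they are not estimates from the Housing dataset, and `predict_probability` is an illustrative name.

```python
import math


def predict_probability(beta0: float, beta1: float, x: float) -> float:
    """Return p = 1 / (1 + exp(-(beta0 + beta1 * x))) for one predictor value."""
    eta = beta0 + beta1 * x  # linear predictor, equal to logit(p)
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.0, 0.5

for x in (-2.0, 0.0, 2.0, 4.0, 6.0):
    p = predict_probability(beta0, beta1, x)
    log_odds = math.log(p / (1.0 - p))  # should match beta0 + beta1 * x
    print(f"x={x:+.1f}  p={p:.4f}  logit(p)={log_odds:+.4f}")
```

With a positive `beta1`, the probability increases with `x` but always stays strictly between 0 and 1, while the log-odds change linearly in `x`.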