# Machine Learning | Statistics | Regression Analysis | DS | Python

# MECH 203

Week 7 Jupyter Notebook Written Report

LINEAR REGRESSION

Due date: 11:59PM on Tuesday, March 3rd, 2020

Grading & Weight: This assignment is out of 50 marks, as further specified in the mark breakdown for each question. The assignment is worth 6% of your overall final grade in the course.

Late Penalty: Late submissions will be penalized 10% per day for up to 5 days, after which a grade of zero will be given.

1. Overview

This assignment is about applying simple linear regression to interpret data sets on model rockets, thermal expansion of Al and hybrid cars in Questions 1, 2 and 3, respectively.

Before you start working on the assignment make sure you:

Review the online lecture videos, in-class lecture slides and the required reading.

This assignment aligns with the following CLOs:

CLO 4: Implement simple linear regression

CLO 4: Implement simple linear regression with error bars

CLO 4: Apply a statistical test to compare regression models

1.1 Time for completion

This assignment will take approximately 6 hours to complete.

2. Instructions

For each question the corresponding data is available both as a *.csv file (comma-separated values) and as a *.dat file (tab-delimited text).

The *.csv and *.dat files with the same name contain the same data.
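As a rough illustration of reading either file format, the sketch below loads two numeric columns with `numpy.loadtxt`. The helper name `load_xy`, the demo file names, and the assumption that each file has a single header row followed by two columns (X, then Y) are all hypothetical; the actual layout of the assignment files should be checked before reusing this. The numbers written to the demo files are made up.

```python
import os
import tempfile

import numpy as np


def load_xy(path, has_header=True):
    """Load two numeric columns (X, Y) from a .csv or tab-delimited .dat file.

    Assumes (hypothetically) one optional header row and exactly two columns.
    """
    # Choose the delimiter from the extension: comma for .csv, tab otherwise.
    delimiter = "," if path.lower().endswith(".csv") else "\t"
    data = np.loadtxt(
        path,
        delimiter=delimiter,
        skiprows=1 if has_header else 0,
        ndmin=2,  # keep a 2-D array even if the file has a single data row
    )
    return data[:, 0], data[:, 1]


# Demonstration with made-up numbers (NOT the assignment data).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "demo_data.csv")
dat_path = os.path.join(tmp_dir, "demo_data.dat")
rows = [(10.0, 1.5), (20.0, 3.1), (30.0, 4.4)]

with open(csv_path, "w") as f:
    f.write("X,Y\n")
    f.writelines(f"{x},{y}\n" for x, y in rows)

with open(dat_path, "w") as f:
    f.write("X\tY\n")
    f.writelines(f"{x}\t{y}\n" for x, y in rows)

x_csv, y_csv = load_xy(csv_path)
x_dat, y_dat = load_xy(dat_path)

# Both formats should yield the same arrays.
print(np.array_equal(x_csv, x_dat) and np.array_equal(y_csv, y_dat))
```

If the real files have no header row, `has_header=False` skips nothing and reads from the first line.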

When you have completed the assignment, upload the Jupyter Notebook file to onQ.

TASKS

Question 1

The “Q1_rocket_data” file contains data on the performance of model rockets constructed by MME students. Each line represents a rocket launch, where X and Y correspond to the pressure of the propellant gas (measured in psi) and the maximum height (apogee, measured in m) reached by the rocket, respectively.

By applying linear regression to this data we can create an empirical model which can predict the expected apogee of such a rocket from the pressure of the propellant. To perform the linear regression follow the steps below:

a. Calculate the values of $\bar{x}$, $\bar{y}$, $\overline{xy}$, $\overline{x^2}$, $\overline{y^2}$ (2/50)
b. Calculate the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best fitting regression line using the quantities above (2/50)
c. Calculate the sum of squares corresponding to the best fitting regression line (3/50)
d. Calculate the standard error of the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and comment on their value (3/50)
e. Make a plot which shows the data points and the best fitting regression line (2/50)
f. Calculate the $R^2$ and comment on its value (i.e. interpret its meaning) (2/50)
g. Perform the linear fit using Python (e.g. numpy.polyfit) and compare the coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best fitting regression line and the $R^2$ value obtained this way to the values obtained above (2/50).

(The data was provided by Prof. Surgenor.)

Question 2

The “Q2_Al-thermal-expansion_data” file contains data collected using neutron scattering on the crystal lattice parameter of an Al-based composite as a function of temperature (i.e. the data is on the thermal expansion of the material). Each line represents a measurement, where X and Y correspond to the crystal lattice parameter (measured in Angstroms, 1 Angstrom = $10^{-10}$ m) and the temperature (measured in °C), respectively.

By applying linear regression to this data we can create an empirical model which can predict the expected lattice parameter of this Al-composite if the temperature of the material is known. To perform the linear regression follow the steps below:

a. Calculate the values of $\bar{x}$, $\bar{y}$, $\overline{xy}$, $\overline{x^2}$, $\overline{y^2}$ (2/50)
b. Calculate the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best fitting regression line using the quantities above (2/50)
c. Calculate the sum of squares corresponding to the best fitting regression line (3/50)
d. Calculate the standard error of the regression coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ and comment on their value (3/50)
e. Make a plot which shows the data points and the best fitting regression line (2/50)
f. Calculate the $R^2$ and comment on its value (i.e. interpret its meaning) (2/50)
g. Perform the linear fit using Python (e.g. numpy.polyfit) and compare the coefficients $\hat{\beta}_0$, $\hat{\beta}_1$ for the best fitting regression line and the $R^2$ value obtained this way to the values obtained above (2/50).

(The data was collected by E. Tulk.)

Question 3

The “Q3_hybrid-cars_data” file contains data on hybrid cars from various manufacturers which came out in the years between 1997 and 2013. Each line represents a specific car. The columns denoted year, msrp, accelrate and mpg represent the model year, the manufacturer’s suggested retail price in 2013 dollars, the maximum acceleration rate in km/hour/second and the fuel economy in miles/gallon, respectively.

Using this data set we would like to investigate how the characteristics listed above correlate with each other. Use $R^2$ to quantify and investigate these correlations while answering the questions below:

a. How much does the year the car was manufactured affect its retail price? I.e. what is the $R^2$ for year vs msrp? Make a plot which shows the data points and the best fitting regression line (3/50)
b. How much does the retail price of the car affect its maximum acceleration rate? I.e. what is the $R^2$ for msrp vs accelrate? Make a plot which shows the data points and the best fitting regression line (3/50)
c. How much does the fuel economy of the car affect its maximum acceleration rate? I.e. what is the $R^2$ for mpg vs accelrate? Make a plot which shows the data points and the best fitting regression line (3/50)
d. How much does the year the car was manufactured affect its fuel economy? I.e. what is the $R^2$ for year vs mpg? Make a plot which shows the data points and the best fitting regression line (3/50)
e. Compare the $R^2$ values obtained above and comment on their relative value: In your subjective opinion which cases from above show a noteworthy effect (correlation) and which don’t? Explain why (6/50).

(Note: the value of $R^2$ is independent of the choice of the response and regressor variables for the data set pairs above, i.e. X vs Y and Y vs X have identical $R^2$ values. This does not hold for the regression coefficients and sum of squares.)

(Source of the data: D-J. Lim, S.R. Jahromi, T.R. Anderson, A-A. Tudorie (2014) “Comparing Technological Advancement of Hybrid Electric Vehicles (HEV) in Different Market Segments”, Technological Forecasting & Social Change, http://dx.doi.org/10.1016/j.techfore.2014.05.008)

3. Evaluation

The report will be evaluated based on the completeness of the answers/solutions provided for each question.
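
For orientation, the regression steps a to g listed in Questions 1 and 2 can be sketched as follows. This is a minimal sketch, not a model solution: it uses synthetic data rather than the assignment files, the function name `simple_linear_regression` is hypothetical, and it assumes that the "sum of squares" in step c is the residual sum of squares and that the standard errors use the usual $n - 2$ degrees of freedom. The course notes may define these quantities differently. The plot of step e is left out so the block runs with NumPy alone.

```python
import numpy as np


def simple_linear_regression(x, y):
    """Least-squares fit of y = beta0 + beta1 * x from sample means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # (a) Sample means of x, y, x*y, x^2 and y^2.
    x_bar, y_bar = x.mean(), y.mean()
    xy_bar = (x * y).mean()
    x2_bar = (x ** 2).mean()
    y2_bar = (y ** 2).mean()

    # (b) Regression coefficients expressed through the means above.
    beta1 = (xy_bar - x_bar * y_bar) / (x2_bar - x_bar ** 2)
    beta0 = y_bar - beta1 * x_bar

    # (c) Residual sum of squares of the fitted line (assumed meaning of
    #     "sum of squares" in the task).
    residuals = y - (beta0 + beta1 * x)
    ss_res = np.sum(residuals ** 2)

    # (d) Standard errors, assuming n - 2 degrees of freedom.
    s_xx = n * (x2_bar - x_bar ** 2)
    sigma2 = ss_res / (n - 2)
    se_beta1 = np.sqrt(sigma2 / s_xx)
    se_beta0 = np.sqrt(sigma2 * (1.0 / n + x_bar ** 2 / s_xx))

    # (f) Coefficient of determination.
    ss_tot = n * (y2_bar - y_bar ** 2)
    r2 = 1.0 - ss_res / ss_tot

    return {
        "beta0": beta0,
        "beta1": beta1,
        "ss_res": ss_res,
        "se_beta0": se_beta0,
        "se_beta1": se_beta1,
        "r2": r2,
    }


# Synthetic data for illustration only (NOT the assignment data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

fit = simple_linear_regression(x, y)

# (g) numpy.polyfit returns the highest power first: [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# For a straight-line fit with intercept, R^2 equals the squared Pearson
# correlation, so it is unchanged when x and y swap roles.
r2_corr = np.corrcoef(x, y)[0, 1] ** 2
r2_swapped = simple_linear_regression(y, x)["r2"]

print("manual :", fit["beta0"], fit["beta1"], fit["r2"])
print("polyfit:", intercept, slope, r2_corr)
print("R^2 with x and y swapped:", r2_swapped)
```

The last two lines of the calculation also illustrate the note in Question 3: swapping the regressor and the response changes the coefficients but not $R^2$. For step e, one option is to draw the data with matplotlib's `plt.scatter(x, y)` and overlay the fitted line with `plt.plot(x, fit["beta0"] + fit["beta1"] * x)`.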