Introduction to Machine Learning and
Office Hours: By Appointment, Regular Zoom Office Hours TBD
Lectures: Tuesday and Thursday, 10:00am-11:30am, Zoom Link
Course Website: eClass
Course Chat: MS Teams
Live lectures and Q&A sessions will be delivered on Tuesdays and Thursdays via Zoom. Zoom sessions will be recorded and made available afterwards. Tentatively, lectures will be delivered on Tuesdays while Thursdays will primarily be an interactive review and Q&A session.
Machine learning is the study of algorithms that learn how to perform a task from prior experience. Machine learning algorithms find widespread application in diverse problem areas, including machine perception, natural language processing, search engines, medical diagnosis, bioinformatics, brain-machine interfaces, financial analysis, gaming and robot navigation. This course will thus provide students with marketable skills and also with a foundation for further, more in-depth study of machine learning topics.
This course introduces the student to machine learning concepts and techniques applied to pattern recognition problems. The course takes a probabilistic perspective, but also incorporates a number of non-probabilistic techniques.
Upon completing this course the student will, through the assignments and tests, have demonstrated an ability to:
● Use probabilistic modeling and statistical analysis of data to develop powerful pattern recognition algorithms
● Identify machine learning models and algorithms appropriate for solving specific problems
● Explain the essential ideas behind core machine learning models and algorithms
● Identify the main limitations and failure modes of core machine learning models and algorithms
● Program moderately complex machine learning algorithms
● Manage data and evaluate and compare algorithms in a supervised learning setting
● Access and correctly employ a variety of machine learning toolboxes currently available
● Identify a diversity of pattern recognition applications in which machine learning techniques are currently in use.
EECS 2030 (Advanced Object Oriented Programming). One of MATH2030 (Elementary Probability) or MATH1131 (Introduction to Statistics). MATH1025 (Applied Linear Algebra), MATH1021 (Linear Algebra I) or a similar introductory course in linear algebra is strongly recommended. In general this course will require you to be familiar with and use basic concepts of linear algebra, calculus and probability. Some basics will be reviewed during the course and background material will be provided, but be prepared to do some extra reading and practice if needed.
You will also need to solve programming assignments using Python along with NumPy, SciPy, Matplotlib and scikit-learn. Background material about Python will be provided, but you will need to familiarize yourself with the language and libraries. If you’ve not used Python previously, you should begin learning it immediately.
There will be readings assigned for each week of lectures from the textbook:
Probabilistic Machine Learning: An Introduction, by Kevin P. Murphy. MIT Press (2022).
This book is not yet published, but we will be using a draft version available online here:
There are a number of other excellent machine learning books which I list below as possible references for you to consider:
● Pattern Recognition and Machine Learning, by Christopher M. Bishop. Springer (2006).
● Understanding Machine Learning, by Shai Shalev-Shwartz and Shai Ben-David. Cambridge University Press (2014).
● Information Theory, Inference, and Learning Algorithms, by David J. C. MacKay (2003). http://www.inference.org.uk/mackay/itila/book.html
This course will require you to apply numerous concepts from computer science, mathematics and statistics. These include programming in Python with Jupyter Notebooks, NumPy, etc., working with matrices and vectors, using gradients and other concepts from calculus, understanding uncertainty through probabilities, and more. Below is a curated set of resources that should help you refresh and bolster your familiarity with these concepts.
● The Missing Semester of Your CS Education
● Stanford CS231n Python Tutorial with Google Colab
● Scientific Computing in Python: Introduction to NumPy and Matplotlib
● Mathematics for Machine Learning, by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong.
Evaluation and Grading
Evaluation will be a mix of assignments and tests. There will be three assignments, a take-home midterm and a take-home final exam. Assignments will combine theoretical problems and practical programming problems. All assignments are expected to be done individually. For undergraduate students (enrolled in EECS4404) the components are weighted as follows:
● 40% Assignments
● 20% Take-Home Midterm
● 40% Take-Home Final
For graduate students (enrolled in EECS5327) there is an additional presentation component on topics to be determined by the student in consultation with the professor. In this case, the grades are weighted as:
● 40% Assignments
● 15% Take-Home Midterm
● 10% Presentations
● 35% Take-Home Final
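As a rough illustration of how these weights combine, the sketch below computes a weighted course grade in Python. The weights come from the lists above; everything else is an assumption made for illustration only: the function name `weighted_grade`, the dictionary keys, the use of a single overall percentage for the three assignments, and the example scores. It is not the official grade calculation, and rounding or letter-grade conversion is not covered.

```python
# Weights from the syllabus; the dictionary keys are illustrative names.
UNDERGRAD_WEIGHTS = {"assignments": 0.40, "midterm": 0.20, "final": 0.40}
GRAD_WEIGHTS = {
    "assignments": 0.40,
    "midterm": 0.15,
    "presentations": 0.10,
    "final": 0.35,
}


def weighted_grade(scores, weights):
    """Return the weighted grade (0-100) from per-component percentages.

    Assumption: each component, including the three assignments taken
    together, has already been reduced to one percentage.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical scores, for illustration only.
example_scores = {"assignments": 80.0, "midterm": 70.0, "final": 90.0}
print(round(weighted_grade(example_scores, UNDERGRAD_WEIGHTS), 2))
```

With these hypothetical scores the undergraduate calculation is 0.40 × 80 + 0.20 × 70 + 0.40 × 90 = 82. Passing `GRAD_WEIGHTS` together with an additional `"presentations"` score gives the graduate version.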
Marked assignments will be returned as soon as possible. Students should immediately review the grading and confirm that marks have been recorded properly. If there are issues identified with the marking, students should contact the marker first to clarify the issue. If the student believes that an assignment was improperly marked, a remark request must be submitted in writing to both the TA and instructor by email within two weeks of receiving the graded materials. Remark requests received after two weeks will not be considered.
Missed Term Work and Accommodations
There will be no extensions or partial credit for assignments which are submitted late. Students who are unable to submit an assignment for a legitimate reason (e.g., illness or emergency) must contact the instructor as soon as possible to explain the situation, provide evidence in support and discuss options.
All students at York University are bound by the York University Senate Policy on Academic Honesty. You should also be aware of the Department of Electrical Engineering and Computer Science’s own Academic Honesty Guidelines. The “tl;dr” is that all work that you submit as your own must be purely your own. Any use of another individual’s work as your own without appropriate permission or attribution, or the use of prohibited aids and resources in the completion of your academic work, is a violation and potentially carries with it serious penalties up to and including expulsion from the university. If you have not done so, I strongly encourage you to review the Academic Integrity module to get an understanding of what academic integrity means and why it is important. When completing graded assessments (e.g., tests and assignments), students will be expected to affirm that they are aware of the policies on academic honesty and the potential punishments for their violation, and to state that all work submitted is their own. Students who are caught violating these policies in this course will be investigated and suitable punishments will be applied.
The following is a tentative schedule of assignments, lectures and topics. This is subject to revision as the course progresses.
Date | Topic | Assessment
Thu, Sep 9 | Intro and Motivation |
Tue, Sep 14 | Background Review: Probability, Statistics, Linear Algebra and Calculus |
Thu, Sep 16 | |
Tue, Sep 21 | Discrete Prediction and Classification |
Thu, Sep 23 | | A1 Part 1 Due
Tue, Sep 28 | Continuous Prediction and Regression: Linear Regression |
Thu, Sep 30 | | A1 Part 2 Due
Tue, Oct 5 | Continuous Prediction and Regression: Logistic Regression |
Thu, Oct 7 | |
Tue, Oct 12 | |
Thu, Oct 14 | |
Tue, Oct 19 | Continuous Prediction and Regression: Regularization |
Thu, Oct 21 | | A2 Part 1 Due
Tue, Oct 26 | Neural Networks & SGD | Take-home Midterm (tentative)
Thu, Oct 28 | |
Tue, Nov 2 | |
Thu, Nov 4 | | A2 Part 2 Due
Tue, Nov 9 | Unsupervised Learning: Dimensionality Reduction |
Thu, Nov 11 | |
Tue, Nov 16 | Unsupervised Learning: Clustering |
Thu, Nov 18 | | A3 Part 1 Due
Tue, Nov 23 | Nonparametric Methods: Decision Trees and Boosting |
Thu, Nov 25 | |
Tue, Nov 30 | Advanced Topics (TBD) |
Thu, Dec 2 | |
Tue, Dec 7 | Advanced Topics (TBD) | A3 Part 2 Due