This page is for the new (2020) course on Machine Learning.
Class Notices:
The class outline is here as a PDF file, and here in HTML.
A brief list of the topics we will treat is as follows.
Method of Evaluation:
Last year, we decided that the most useful way of evaluating our abilities with machine learning would be to get hold of a suitable data set and train an algorithm for some appropriate task. It was also decided that, although individual students are very welcome to carry out a project alone, it is also quite in order for students to work in pairs, and submit a single project for the two participants. I will propose a similar method of evaluation this year.
Resources:
We will follow the textbook Deep Learning, by Goodfellow, Bengio, and Courville. I do not expect to have time to cover any more than the first two main sections of the book. The book is available online at this site. It is also available as a hardback physical book.
We will also use Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, by Aurélien Géron, O'Reilly 2019, in parallel with the other textbook. This is a revised and updated version of a book that, over the past two years, has turned out to be a really valuable resource.
Here are other resources. See the course outline for more information about them.
More resources
This link is to a set of short additional notes for the course. It begins with a discussion of the singular value decomposition, generalised matrix inverses, and principal components analysis.
The resources and links given here are things I found or was told about since setting up the resources above, and are not in the course outline.
Data and Python files
Although it is not hard to download the MNIST data directly from within Python, it may be more convenient to get the data from the MNIST_data directory. The names of the files are mostly self-explanatory, but those names beginning t10k contain the test data.
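The MNIST files are stored in the simple IDX binary format. As a sketch of what the loading code does, the function below parses that format using only the standard library and NumPy; it assumes the files are the usual (possibly gzip-compressed) IDX files found in the MNIST_data directory.

```python
import gzip
import struct

import numpy as np

def load_idx(path):
    """Load an MNIST file in IDX format (optionally gzip-compressed).

    The IDX header is a 4-byte magic number (the third byte encodes the
    data type, the fourth the number of dimensions), followed by one
    big-endian 32-bit size per dimension, and then the raw data.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        data = f.read()
    _, _, _, ndim = struct.unpack(">BBBB", data[:4])
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    return np.frombuffer(data, dtype=np.uint8, offset=4 + 4 * ndim).reshape(dims)
```

For example, `load_idx("MNIST_data/t10k-images-idx3-ubyte.gz")` should return an array of shape (10000, 28, 28) for the test images.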
This Python file contains code for loading the files in the MNIST_data directory, and then performing some of the tasks we spoke about during the second class, taken from Chapter 3 of Géron's "Hands-On" book. In order to run it successfully, you will need to install a number of Python libraries. If you are still using Python 2 (not recommended), the tool for installing libraries is pip; if you have moved on to Python 3, it is called pip3.
Then, from a terminal, issue the command
pip3 install --upgrade jupyter matplotlib numpy pandas scipy scikit-learn
If you like, you can add tensorflow to the list of modules to install. We will want it later.
It is advisable, although it is not necessary, to do all of the above in a virtual environment, in order not to mess up anything else you may be doing with Python on your computer. First install another package:
pip3 install --user --upgrade virtualenv
Then, create a directory (or folder, as some call it) where you wish to work, and, from within this directory, issue the command
virtualenv -p python3 venv
You can alter venv to any other name you like. Then, from the same place where you created the virtual environment, do
source venv/bin/activate
You are now in your virtual environment, and can proceed to install the various packages mentioned above. Once you are finished, you can just type deactivate, and you have left the virtual environment.
Log of material covered
The first class was later than usual: the course is scheduled on Mondays, and, on account of Labour Day, September 14 was the first Monday of term. We began by looking at the Preface of Géron's book, and then went on to cover much of his Chapter 1. We stopped just before the section entitled "Main Challenges of Machine Learning".
On September 21, we began by completing the study of the first chapter of Géron's book. This chapter provides a good overview of what machine learning is, and what the challenges are that face designers of algorithms intended to achieve machine learning.
We then skipped to the Deep Learning text, and embarked on Chapter 5, on "Machine Learning Basics". Much of this chapter duplicates what is in the first chapter of Géron, and so we passed over it lightly. We spent some time on subsection 5.3.1, on cross validation, in particular on the algorithm, presented in pseudo-code, for k-fold cross validation.
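As a concrete version of that pseudo-code, here is a minimal NumPy sketch of k-fold cross validation; the `train_fn` and `loss_fn` arguments are placeholders for whatever learning algorithm and loss function are being evaluated.

```python
import numpy as np

def k_fold_cv(X, y, k, train_fn, loss_fn):
    """Estimate generalisation error by k-fold cross validation.

    Split the data into k roughly equal folds; for each fold, train on the
    other k-1 folds, evaluate on the held-out fold, and average the losses.
    `train_fn(X, y)` must return a fitted predictor, and `loss_fn(f, X, y)`
    a scalar loss for that predictor on a data set.
    """
    folds = np.array_split(np.arange(len(y)), k)  # k roughly equal index blocks
    losses = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predictor = train_fn(X[train_idx], y[train_idx])  # fit on k-1 folds
        losses.append(loss_fn(predictor, X[test_idx], y[test_idx]))
    return float(np.mean(losses))
```

In practice one would shuffle the indices before splitting; the sketch keeps the original order for simplicity.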
Since, except for Chapter 5, the first part of this book presents mathematical preliminaries, we have not spent time on it, and will not do so unless and until a need arises. We did however look briefly at section 3.1, entitled "Why Probability?", and took note of the different perspectives of frequentist probability and Bayesian probability, both of which are useful to us in different contexts. Then on to section 3.10, where we learned about the logistic and softplus functions. Next was section 3.13, on information theory, self-information, and the Kullback-Leibler divergence.
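The functions from sections 3.10 and 3.13 are easy to write down directly. A minimal NumPy sketch (with the KL divergence restricted, for simplicity, to discrete distributions with strictly positive probabilities):

```python
import numpy as np

def sigmoid(x):
    """Logistic function: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    """Softplus: zeta(x) = log(1 + exp(x)), a smoothed version of max(0, x)."""
    return np.log1p(np.exp(x))

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P || Q) between two discrete
    distributions, given as arrays of (positive) probabilities."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))
```

Note the identities tying these together: the derivative of the softplus is the logistic function, and softplus(x) - softplus(-x) = x.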
Finally, we returned to Géron's Chapter 2, in which he leads us through how to set up a machine-learning project. We just began this, and will continue with it next week.
Chapter 2 of the Hands-On book occupied us for all of the class on September 28. It deals in sometimes exhausting detail with the many steps needed for a machine-learning project. So far, we got through the preliminary steps, some of them in numerous alternative versions.
These steps are, as enumerated at the beginning of the chapter:
What remains is:
Only the first of these should take much time next week.
We began on October 5 with material from Chapter 2 of the Deep Learning book, on the singular value decomposition, generalised inverses of matrices (called "pseudo-inverses" in the book), and in particular the Moore-Penrose inverse, and Principal Component Analysis (PCA). I used a note, available here, to help explain this material.
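The relationships between the SVD, the Moore-Penrose inverse, and PCA can be illustrated in a few lines of NumPy; the matrix `A` below is just an arbitrary example, standing in for a data matrix with observations in rows.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Singular value decomposition: A = U diag(s) V^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Moore-Penrose inverse: A^+ = V diag(1/s) U^T. (In the rank-deficient
# case one zeroes, rather than inverts, the vanishing singular values.)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

# PCA: centre the data; the right singular vectors of the centred matrix
# are the principal directions, with variances s^2 / (n - 1).
Xc = A - A.mean(axis=0)
_, s2, Vt2 = np.linalg.svd(Xc, full_matrices=False)
components = Vt2                       # principal directions, one per row
explained_var = s2 ** 2 / (len(A) - 1) # variance along each direction
```

Since `A` here has full column rank, `A_pinv @ A` recovers the identity, and `A_pinv` agrees with NumPy's built-in `np.linalg.pinv`.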
After that, we went back to the Hands-on book to finish off Chapter 2, with the end-to-end project on house prices. Chapter 3 deals with a classical problem for machine learning, recognising handwritten digits, using the MNIST data set. Unlike the housing problem, which is a regression task, this is a classification task, best treated with different algorithms. We defined precision and recall as two measures of how accurately a classification is done when there are only two categories, and defined a confusion matrix in a more general context.
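A minimal sketch of these definitions, with class 1 taken as the "positive" class in the binary case:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """counts[i, j] = number of instances of true class i predicted as class j."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m

def precision_recall(y_true, y_pred):
    """Binary precision = TP / (TP + FP) and recall = TP / (TP + FN)."""
    m = confusion_matrix(y_true, y_pred, 2)
    tp, fp, fn = m[1, 1], m[0, 1], m[1, 0]
    return tp / (tp + fp), tp / (tp + fn)
```

These are the same quantities computed by scikit-learn's `precision_score`, `recall_score`, and `confusion_matrix` in `sklearn.metrics`.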
After a long break, we began on October 19 by looking at sections 5.9 and 5.10 of the Deep Learning book, thereby concluding what I wanted to cover in the first part of the book. We then embarked on Part II, and, in Chapter 6, on Deep Feedforward Networks, covered section 6.1, on the impossibility of learning the XOR pattern by a linear function, and how to do so with one hidden layer and the ReLU (rectified linear unit) activation function. We made progress with section 6.2, on Gradient-Based Learning, and got as far as subsection 6.2.2.1. We will resume by discussing subsection 6.2.2.2, on Sigmoid Units.
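The XOR construction of section 6.1 is easy to check numerically. The weights below are the ones given in that section: a hidden layer of two ReLU units followed by a linear output, computing f(x) = wᵀ max(0, Wᵀx + c) + b.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Weights from section 6.1 of the Deep Learning book.
W = np.array([[1.0, 1.0], [1.0, 1.0]])  # hidden-layer weights
c = np.array([0.0, -1.0])               # hidden-layer biases
w = np.array([1.0, -2.0])               # output weights
b = 0.0                                 # output bias

# All four XOR inputs, one per row.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
out = relu(X @ W + c) @ w + b  # yields 0, 1, 1, 0: exactly XOR
```

No linear function of the raw inputs can produce this output pattern; the single hidden layer bends the input space so that a linear output layer suffices.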
In the Hands-On book, we finished Chapter 3 on the handwritten digits, and, in Chapter 4, looked again at Stochastic Gradient Descent. Géron gives a comparison of different methods of working with linear regressions. The next topic is Polynomial regression, which allows us to see more about over- and under-fitting.
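As a reminder of what stochastic gradient descent does, here is a minimal sketch for linear regression, updating on one randomly chosen training instance at a time; the learning rate and epoch count are illustrative, not tuned.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, n_epochs=200, seed=0):
    """Fit y ~ X @ theta by stochastic gradient descent on squared error.

    Each update uses the gradient of the loss on a single instance,
    visiting the instances in a fresh random order every epoch.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            grad = 2.0 * (X[i] @ theta - y[i]) * X[i]  # single-instance gradient
            theta -= lr * grad
    return theta
```

With a constant learning rate the iterates bounce around the minimum on noisy data; Géron's comparison table contrasts this behaviour with batch and mini-batch gradient descent and with the closed-form normal equations.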
On October 26, we again started with the Deep Learning textbook, and completed the essential content of Chapter 6. We looked at a number of functions used as activation functions: sigmoid (or logistic), ReLU, softmax, softplus, hyperbolic tangent (tanh). We took a quick look at the universal approximation theorem, and then began our study of the back-propagation step in training a multilayer perceptron (MLP).
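To make the back-propagation step concrete, here is a sketch of the forward and backward pass for a one-hidden-layer MLP with sigmoid hidden units, a linear output layer, and squared-error loss; this particular architecture and loss are chosen for simplicity, not taken from the book's example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_gradients(x, y, W1, b1, W2, b2):
    """One forward and backward pass; returns the loss and all gradients."""
    # Forward pass, caching intermediate values needed by the backward pass.
    z1 = W1 @ x + b1
    h = sigmoid(z1)
    y_hat = W2 @ h + b2                    # linear output layer
    loss = 0.5 * np.sum((y_hat - y) ** 2)
    # Backward pass: propagate the gradient from the output back to the input.
    d_yhat = y_hat - y                     # dL/dy_hat
    dW2 = np.outer(d_yhat, h)
    db2 = d_yhat
    d_h = W2.T @ d_yhat                    # chain rule through W2
    d_z1 = d_h * h * (1.0 - h)             # sigmoid'(z) = h(1 - h)
    dW1 = np.outer(d_z1, x)
    db1 = d_z1
    return loss, dW1, db1, dW2, db2
```

Each backward step reuses a quantity cached on the forward pass, which is precisely why back-propagation is so much cheaper than differentiating each weight separately.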
Géron gave us a few more paragraphs on back-propagation. He then introduced us to the Fashion MNIST data set, which poses a classification task just like that with the handwritten digits. We were taken through a number of ways in which we could construct, compile, and run an MLP for the problem, using Keras. It turned out to be no harder to set up a model for a regression task, for which the California housing data set was used.
Recordings
The recording of the first class, on September 14, can be viewed by clicking here. There is also an audio-only version, available here.
For September 21, the recording can be viewed by clicking here. Audio only is here.
For September 28, the recording can be viewed by clicking here. Audio only is here.
For October 5, the recording can be viewed by clicking here. Audio only is here.
For October 19, the recording can be viewed by clicking here. Audio only is here.
For October 26, the recording can be viewed by clicking here. Audio only is here.
To send me email, click here or write directly to
Russell.Davidson@mcgill.ca.
URL: https://russell-davidson.arts.mcgill.ca/e706