Notes on “Mathematics for Machine Learning”



I’ve started taking more MOOCs again over the past 18 months, but haven’t been able to get into the habit of taking consistent notes. Today I started a new one on Coursera, Mathematics for Machine Learning, offered by Imperial College London, which is a tad outside my typical wheelhouse of code-centric ML courses. This seemed like a good opportunity to build the habit.

Course #1: Linear Algebra


Vector-on-vector addition and subtraction are straightforward, as are scalar-on-vector operations. In general, we’re doing elementwise operations: vector-on-vector operations combine each component of one vector with the component at the same position in the other, and scalar-on-vector operations apply the scalar to every component.

Thank you, years of R, pandas, numpy, etc. for making these basic ones second nature. ✌
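As a quick reminder of what those elementwise operations look like in NumPy, here is a minimal sketch. The vectors `r` and `s` and their values are made up for illustration and are not from the course.

```python
import numpy as np

# Two example vectors (values chosen arbitrarily for illustration)
r = np.array([1.0, 2.0, 3.0])
s = np.array([4.0, 5.0, 6.0])

total = r + s        # elementwise addition: [5., 7., 9.]
difference = r - s   # elementwise subtraction: [-3., -3., -3.]
scaled = 2 * r       # scalar multiplication: [2., 4., 6.]
```

NumPy applies `+`, `-`, and `*` component by component, which matches the pairwise definition above.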

The size, length, or magnitude of a vector $r$ is denoted as $|r|$ and can be calculated using the Pythagorean theorem: take the square root of the sum of the squares of the components, $|r| = \sqrt{r_1^2 + r_2^2 + \dots + r_n^2}$.

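The dot product has a similar definition: multiply the two vectors’ components pairwise and sum the results, with no square root, so $r \cdot s = r_1 s_1 + r_2 s_2 + \dots + r_n s_n$. Dotting a vector with itself gives the square of its magnitude, $r \cdot r = |r|^2$.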

It’s important to remember that both of these operations return scalar values.
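A small sketch of both calculations, written out by hand so the definitions are visible. The example vectors are my own, and `np.linalg.norm` and `np.dot` are the built-in NumPy equivalents.

```python
import numpy as np

r = np.array([3.0, 4.0])
s = np.array([1.0, 2.0])

# Magnitude: square root of the sum of squared components
magnitude = np.sqrt(np.sum(r ** 2))   # 5.0

# Dot product: sum of pairwise products, no square root
dot = np.sum(r * s)                   # 3*1 + 4*2 = 11.0

# The built-ins give the same scalars
same_magnitude = np.linalg.norm(r)
same_dot = np.dot(r, s)
```

Both results are plain scalars, and `np.dot(r, r)` equals `magnitude ** 2`.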

Scalar projections are the size of the “shadow” of a vector onto another vector, if we imagine the sun shining down perpendicular to the second vector. Vector projections are that shadow vector. As the names imply, scalar projections are scalars and vector projections are vectors.

Fig. 1: Vector s projected onto another vector r

The scalar projection of $s$ onto $r$ is calculated as $\frac{r \cdot s}{|r|}$, while the vector projection is calculated as $\frac{r \cdot s}{|r|^2} \, r$.
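Here is a minimal sketch of the two projections in NumPy. The function names `scalar_projection` and `vector_projection` and the example values are my own, not something from the course.

```python
import numpy as np

def scalar_projection(s, r):
    """Length of the shadow of s on r: (r . s) / |r|."""
    return np.dot(r, s) / np.linalg.norm(r)

def vector_projection(s, r):
    """The shadow itself, pointing along r: ((r . s) / (r . r)) * r."""
    return (np.dot(r, s) / np.dot(r, r)) * r

r = np.array([3.0, 4.0])
s = np.array([2.0, 1.0])

shadow_length = scalar_projection(s, r)   # 10 / 5 = 2.0
shadow_vector = vector_projection(s, r)   # approximately [1.2, 1.6]
```

The magnitude of `shadow_vector` equals `shadow_length` here because the dot product is positive. When the vectors point in roughly opposite directions, the scalar projection is negative.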

Two vectors are orthogonal if their dot product is zero.
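In code, the check is a one-liner, although with floating-point values it is safer to compare against a small tolerance than against exactly zero. The helper name `is_orthogonal` and the tolerance value are assumptions of mine.

```python
import numpy as np

def is_orthogonal(a, b, tol=1e-9):
    """True if the dot product of a and b is zero, within tolerance tol."""
    return bool(np.isclose(np.dot(a, b), 0.0, atol=tol))

a = np.array([1.0, 1.0])
b = np.array([1.0, -1.0])
c = np.array([1.0, 2.0])

a_b_orthogonal = is_orthogonal(a, b)   # 1*1 + 1*(-1) = 0, so True
a_c_orthogonal = is_orthogonal(a, c)   # 1*1 + 1*2 = 3, so False
```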

Basis vectors are the “reference points” we use to describe other vectors. They can be unit vectors or something else, but they should be orthogonal to each other for the following shortcut to work: we can calculate a vector’s coordinates under new basis vectors by calculating the vector’s projection onto each basis vector.

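To recalculate $r$ using orthogonal basis vectors $b_1$ and $b_2$: $r_b = \left( \frac{r \cdot b_1}{|b_1|^2}, \frac{r \cdot b_2}{|b_2|^2} \right)$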
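The projection approach to changing basis can be sketched as follows. The function name `change_basis` and the example vectors are assumptions of mine, and the shortcut is only valid when the new basis vectors are orthogonal to each other.

```python
import numpy as np

def change_basis(r, basis):
    """Coordinates of r in an orthogonal basis, via projection.

    Each new coordinate is (r . b) / (b . b) for a basis vector b.
    Assumes the basis vectors are mutually orthogonal.
    """
    return np.array([np.dot(r, b) / np.dot(b, b) for b in basis])

r = np.array([3.0, 4.0])
b1 = np.array([2.0, 1.0])
b2 = np.array([-2.0, 4.0])   # b1 . b2 = -4 + 4 = 0, so the basis is orthogonal

r_b = change_basis(r, [b1, b2])          # [2.0, 0.5]

# Sanity check: rebuilding r from the new coordinates recovers the original
rebuilt = r_b[0] * b1 + r_b[1] * b2      # [3.0, 4.0]
```

The rebuild step is a handy way to confirm the new coordinates: 2 copies of `b1` plus half of `b2` lands back on the original vector.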

Course #2: Multivariate Calculus

Course #3: PCA