Cristóbal Alcázar

How Gauss would compute a Confusion matrix for their classification model

xkcd confusion matrix comic
Source: xkcd.com

A confusion matrix is a practical and conceptually simple tool for evaluating a classification model, so it deserves an equally simple way to compute it. Gauss, without the magic of from sklearn.metrics import confusion_matrix, would have done it with simple linear algebra operations:

A confusion matrix is the matrix product of the true and predicted labels, both encoded as one-hot vectors.

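If we have the true labels of 4 observations in the vector $\mathbf{y} = [1, 0, 2, 1]$ and 3 different classes (i.e. 0, 1 and 2), their one-hot encoding will be:

$$
\mathbf{T} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \in \{0, 1\}^{4 \times 3}
$$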

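Some classification model gives us the predicted label for each observation in the vector $\hat{\mathbf{y}} = [2, 0, 2, 0]$. By the same logic as above, the one-hot encoding will be:

$$
\hat{\mathbf{T}} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \in \{0, 1\}^{4 \times 3}
$$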
We have everything we need to compute the confusion matrix, and it will be $\mathbf{T}^\top \hat{\mathbf{T}} \in \mathbb{Z}_{0+}^{3 \times 3}$, where $\mathbf{T}^\top$ is the transpose of $\mathbf{T}$. So again,

A confusion matrix is the matrix product of the true and predicted labels, both encoded as one-hot vectors.

$$
\mathbf{T}^\top \hat{\mathbf{T}} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix} \in \mathbb{Z}_{0+}^{3 \times 3}
$$

As you can see, the confusion matrix correctly summarizes the information in both vectors: each row is a true class, each column is a predicted class, and each entry counts the observations with that pair.

$$
\begin{aligned}
\mathbf{y} &= [1, 0, 2, 1] \\
\hat{\mathbf{y}} &= [2, 0, 2, 0]
\end{aligned}
$$
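
To see why the product counts what we want, it helps to look at a single entry. In the sketch below, $N$ (the number of observations) and the indices $n$, $i$, $j$ are symbols introduced here only for illustration, and $\mathbf{T}^\top$ denotes the transpose of the true-label matrix:

$$
\left(\mathbf{T}^\top \hat{\mathbf{T}}\right)_{ij} = \sum_{n=1}^{N} T_{ni} \, \hat{T}_{nj}
$$

Each term $T_{ni} \, \hat{T}_{nj}$ equals 1 only when observation $n$ has true class $i$ and predicted class $j$, and 0 otherwise, so the sum is the number of such observations. In the example above, the entry for true class 1 and predicted class 2 is 1 because only the first observation has that pair.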

Now with import numpy as np

We need two steps to compute our confusion matrix.

First, we need a way to transform a vector 𝐯 with k classes into its one-hot-encoded version, v_one_hot = one_hot_encoding(v):

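import numpy as np

def one_hot_encoding(v, num_classes=None):
  '''Return the one-hot encoding matrix for a vector of integer labels 0..k-1'''
  # np.unique(v).size undercounts when a class is absent from v, so use the largest label
  if num_classes is None:
    num_classes = np.max(v) + 1
  return np.eye(num_classes, dtype=int)[v]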
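
As a quick check, the encoding step can be run on the two vectors from the worked example. This is only a sketch: the helper name one_hot and its explicit num_classes argument are assumptions made for illustration, not part of the function above. Passing the number of classes explicitly matters here because the predictions never contain class 1, so counting the distinct labels in y_pred would give 2 instead of 3.

```python
import numpy as np

def one_hot(v, num_classes):
    """One-hot encode integer labels 0..num_classes-1 (hypothetical helper)."""
    v = np.asarray(v)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[v]

y = [1, 0, 2, 1]
y_pred = [2, 0, 2, 0]  # class 1 is never predicted

T = one_hot(y, num_classes=3)
T_hat = one_hot(y_pred, num_classes=3)

print(T)      # one row per observation, one column per class
print(T_hat)
print(np.unique(y_pred).size)  # 2: fewer distinct labels than classes
```

Both matrices come out with shape (4, 3), which is what the matrix product in the next step needs.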

Second, compute the confusion matrix, $\mathbf{T}^\top \hat{\mathbf{T}} \in \mathbb{Z}_{0+}^{K \times K}$, for $K$ classes. There are many ways of doing it with numpy, as you can see in the following code. Below I use the conventional names for the true labels (y) and the predicted ones (y_pred):

# 1st option: Using the matrix multiplication '@' operator
one_hot_encoding(y).T @ one_hot_encoding(y_pred)

# 2nd option: Using np.dot()
np.dot(one_hot_encoding(y).T, one_hot_encoding(y_pred))

# 3rd option: Using np.matmul()
np.matmul(one_hot_encoding(y).T, one_hot_encoding(y_pred))
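
Putting both steps together on the example vectors gives a small end-to-end check. The function name confusion_from_one_hot and its num_classes parameter are hypothetical, chosen here for illustration; the sketch assumes the labels are integers from 0 to num_classes - 1 and that both vectors share the same number of classes.

```python
import numpy as np

def confusion_from_one_hot(y, y_pred, num_classes):
    """Confusion matrix as T^T @ T_hat (rows: true class, columns: predicted class)."""
    eye = np.eye(num_classes, dtype=int)
    T = eye[np.asarray(y)]           # one-hot true labels, shape (N, K)
    T_hat = eye[np.asarray(y_pred)]  # one-hot predicted labels, shape (N, K)
    return T.T @ T_hat               # shape (K, K)

y = [1, 0, 2, 1]
y_pred = [2, 0, 2, 0]

C = confusion_from_one_hot(y, y_pred, num_classes=3)
print(C)
# [[1 0 0]
#  [1 0 1]
#  [0 0 1]]

# Sanity checks: every observation is counted exactly once,
# and the diagonal holds the correct predictions.
print(C.sum() == len(y))  # True
print(np.trace(C))        # 2
```

The output matches the matrix computed by hand earlier: two observations land on the diagonal, and the two misclassified ones appear in the row for true class 1.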

And we are done! Of course, you can always get your confusion matrix from your favourite store ;)

from sklearn.metrics import confusion_matrix
confusion_matrix(y, y_pred)



                         That's the way computers talk to each other.