From b92b5f7faf4d034af8c6db09726da71ffc55c51a Mon Sep 17 00:00:00 2001
From: Jonathan Sum <777Jonathansum@gmail.com>
Date: Sat, 11 Apr 2020 09:15:11 -0700
Subject: [PATCH] Update 05_pet_breeds.ipynb

Changing from "as you might want your model to sometimes tell you it
doesn't recognize any of the classes is has seen during training, and
not pick a class because it has a slightly bigger activation score."
to "as you might want your model to sometimes tell you it doesn't
recognize any of the classes that it has seen during training, and not
pick a class because it has a slightly bigger activation score."
---
 05_pet_breeds.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/05_pet_breeds.ipynb b/05_pet_breeds.ipynb
index 24baf9c..3e5bcdd 100644
--- a/05_pet_breeds.ipynb
+++ b/05_pet_breeds.ipynb
@@ -971,7 +971,7 @@
    "source": [
     "What does this function do in practice? Taking the exponential ensures all our numbers are positive, and then dividing by the sum ensures we are going to have a bunch of numbers that add up to one. The exponential also has a nice property: if one of the numbers in our activations `x` is slightly bigger than the others, the exponential will amplify this (since it grows, well... exponentially) which means that in the softmax, that number will be closer to 1. \n",
     "\n",
-    "Intuitively, the Softmax function *really* wants to pick one class among the others, so it's ideal for training a classifier when we know each picture has a definite label. (Note that it may be less ideal during inference, as you might want your model to sometimes tell you it doesn't recognize any of the classes is has seen during training, and not pick a class because it has a slightly bigger activation score. In this case, it might be better to train a model using multiple binary output columns, each using a sigmoid activation.)\n",
+    "Intuitively, the Softmax function *really* wants to pick one class among the others, so it's ideal for training a classifier when we know each picture has a definite label. (Note that it may be less ideal during inference, as you might want your model to sometimes tell you it doesn't recognize any of the classes that it has seen during training, and not pick a class because it has a slightly bigger activation score. In this case, it might be better to train a model using multiple binary output columns, each using a sigmoid activation.)\n",
     "\n",
     "Softmax is the first part of the cross entropy loss, the second part is log likeklihood. "
    ]
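
For reference (not part of the patch itself): a minimal sketch in plain
PyTorch of the softmax computation the edited cell describes. The tensor
values below are made up for illustration and do not come from the notebook.

    import torch

    def softmax(x):
        # Exponentiate so every activation becomes positive, then divide
        # by the sum so the outputs add up to 1 along the class dimension.
        exp_x = torch.exp(x)
        return exp_x / exp_x.sum(dim=-1, keepdim=True)

    # Made-up activations: the second is only slightly bigger than the
    # first, but the exponential amplifies the gap, so it gets a clearly
    # larger share of the probability mass.
    acts = torch.tensor([[0.9, 1.1, 0.2]])
    print(softmax(acts))                # approximately [[0.37, 0.45, 0.18]]
    print(torch.softmax(acts, dim=-1))  # agrees with PyTorch's built-in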