From 5e178476e69c2bdbf568367eb9c263b2b81c4e4f Mon Sep 17 00:00:00 2001
From: alvarotap <61787129+alvarotap@users.noreply.github.com>
Date: Wed, 18 Mar 2020 13:36:00 +0100
Subject: [PATCH] Some minor typos in chapter 04

---
 04_mnist_basics.ipynb | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/04_mnist_basics.ipynb b/04_mnist_basics.ipynb
index 9a38334..b04e904 100644
--- a/04_mnist_basics.ipynb
+++ b/04_mnist_basics.ipynb
@@ -35,7 +35,7 @@
     "\n",
     "To be exact, we'll discuss the role of arrays and tensors, and of broadcasting, a powerful technique for using them expressively. We'll explain stochastic gradient descent (SGD), the mechanism for learning by updating weights automatically. We'll discuss the choice of a loss function for our basic classification task, and the role of mini-batches. We'll also describe the math that a basic neural network is actually doing. Finally, we'll put all these pieces together to see them at work.\n",
     "\n",
-    "In future chapters we’ll do deep dives into other applications as well, and see how these concepts and tools generalize. But this chapter is about laying foundation stones. To be frank, that also makes this one of the harder chapters, because of how these concepts all depend on each other. Like an arch, all the stones need to be in place for the structure to stay up. Also like an arch, once that happens, it's a powerful structure that can support other things. But it requires some patience to assemble.\n",
+    "In future chapters we’ll do deep dives into other applications as well, and see how these concepts and tools generalize. But this chapter is about laying foundation stones. To be frank, that also makes this one of the hardest chapters, because of how these concepts all depend on each other. Like an arch, all the stones need to be in place for the structure to stay up. Also like an arch, once that happens, it's a powerful structure that can support other things. But it requires some patience to assemble.\n",
     "\n",
     "So let us begin. The first step is to consider how images are represented in a computer."
    ]
   },
   {
@@ -2908,7 +2908,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To understand SGD, it might be easiest to start with a simple, synthetic, example. Let's imagine you were measuring the speed of a roller coaster as it went over the top of a hump. It would start fast, and then get slower as it went up the hill, and then would be slowest at the top, and it would then speed up again as it goes downhill. If you're measuring the speed manually every second for 20 seconds, it might look something like this:"
+    "To understand SGD, it might be easier to start with a simple, synthetic example. Let's imagine you were measuring the speed of a roller coaster as it went over the top of a hump. It would start fast, and then get slower as it went up the hill, and then would be slowest at the top, and it would then speed up again as it went downhill. If you're measuring the speed manually every second for 20 seconds, it might look something like this:"
    ]
   },
   {
@@ -3449,7 +3449,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We're using a new function, `torch.where(a,b,c)`. This is the same as running the list comprehension `[b[i] if a[i] else c[i] for i in range(len(a))]`, except it works on tensors, at C/CUDA speed. In plain English, this function will measure how distant each prediction is from 1 if it should be 1, and how distant it is from from 0 if it should be 0, and then it will take the mean of all those distances.\n",
+    "We're using a new function, `torch.where(a,b,c)`. This is the same as running the list comprehension `[b[i] if a[i] else c[i] for i in range(len(a))]`, except it works on tensors, at C/CUDA speed. In plain English, this function will measure how distant each prediction is from 1 if it should be 1, and how distant it is from 0 if it should be 0, and then it will take the mean of all those distances.\n",
     "\n",
     "> note: It's important to learn about PyTorch functions like this, because looping over tensors in Python performs at Python speed, not C/CUDA speed!\n",
     "\n",
@@ -5378,7 +5378,7 @@
     "|**gradient** | The derivative of the loss with respect to some parameter of the model\n",
-    "|**backard pass** | Computing the gradients of the loss with respect to all model parameters\n",
+    "|**backward pass** | Computing the gradients of the loss with respect to all model parameters\n",
     "|**gradient descent** | Taking a step in the directions opposite to the gradients to make the model parameters a little bit better\n",
-    "|**learning rate** | The size of the step we take when applying SGD to update the paramters of the model\n",
+    "|**learning rate** | The size of the step we take when applying SGD to update the parameters of the model\n",
     "|=====\n",
     "```"
    ]
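A quick illustration for reviewers (not part of the patch itself): the paragraph corrected in the third hunk describes `torch.where(a,b,c)` and its list-comprehension equivalent. The sketch below shows both computing the same per-element distances; the target and prediction values are made up for the example, while `torch.where`, the comprehension, and the final mean come from the book's text.

```python
import torch

# Made-up labels and predictions, purely to illustrate the corrected paragraph.
targets = torch.tensor([1, 0, 1, 0])              # 1 where the answer should be 1
predictions = torch.tensor([0.9, 0.4, 0.2, 0.1])  # model outputs between 0 and 1

# torch.where(a, b, c): element-wise, pick from b where a is True, else from c.
# Here: distance from 1 where the target is 1, distance from 0 where it is 0.
distances = torch.where(targets == 1, 1 - predictions, predictions)

# The list-comprehension equivalent from the text (runs at Python speed).
slow = torch.stack([1 - predictions[i] if targets[i] else predictions[i]
                    for i in range(len(targets))])

assert torch.allclose(distances, slow)
print(distances.mean())  # the mean of those distances, as the paragraph says
```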