diff --git a/04_mnist_basics.ipynb b/04_mnist_basics.ipynb index 108acc1..167e4b2 100644 --- a/04_mnist_basics.ipynb +++ b/04_mnist_basics.ipynb @@ -60,7 +60,7 @@ "\n", "Geoff Hinton has told of how even academic papers showing dramatically better results than anything previously published would be rejected from top journals and conferences, just because they used a neural network. Yann Lecun's work on convolutional neural networks, which we will study in the next section, showed that these models could read hand-written text--something that had never been achieved before. However his breakthrough was ignored by most researchers, even as it was used commercially to read 10% of the checks in the US!\n", "\n", - "In addition to these three Turing Award winners, there are many other researchers who have battled to get us to where we are today. For instance, Jurgen Schmidhuber (who many believe should have shared in the Turing Award) pioneered many important ideas, including working on the *LSTM* architecture with his student Sepp Hochreiter (widely used for speech recognition and other text modeling tasks, and used in the IMDb example in <>). Perhaps most important of all, Werbos invented back-propagation for neural networks, the technique shown in this chapter and used universally for training neural networks. His development was almost entirely ignored for decades, but today it is the most important foundation of modern AI.\n", + "In addition to these three Turing Award winners, there are many other researchers who have battled to get us to where we are today. For instance, Jurgen Schmidhuber (who many believe should have shared in the Turing Award) pioneered many important ideas, including working on the *LSTM* architecture with his student Sepp Hochreiter (widely used for speech recognition and other text modeling tasks, and used in the IMDb example in <>). Perhaps most important of all, Paul Werbos in 1974 invented back-propagation for neural networks, the technique shown in this chapter and used universally for training neural networks ([Werbos 1994](https://books.google.com/books/about/The_Roots_of_Backpropagation.html?id=WdR3OOM2gBwC)). His development was almost entirely ignored for decades, but today it is the most important foundation of modern AI.\n", "\n", "There is a lesson here for all of us! On your deep learning journey you will face many obstacles, both technical, and (even more difficult) people around you who don't believe you'll be successful. There's one *guaranteed* way to fail, and that's to stop trying. We've seen that the only consistent trait amongst every fast.ai student that's gone on to be a world-class practitioner is that they are all very tenacious." ] @@ -129,7 +129,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The MNIST dataset shows is a very common layout for machine learning datasets: separate folders for the *training set*, which is used to train a model, and the *validation set* (and/or *test set*), which is used to evaluate the model (we'll be talking a lot of these concepts very soon!) Let's see what's inside the training set:" + "The MNIST dataset shows a very common layout for machine learning datasets: separate folders for the *training set*, which is used to train a model, and the *validation set* (and/or *test set*), which is used to evaluate the model (we'll be talking a lot of these concepts very soon!) Let's see what's inside the training set:" ] }, { @@ -1346,7 +1346,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "You can see that the background white pixels are stored as the number zero, black is the number 255, and shades of grey are between the two. This image contains 28 pixels across and 28 pixels down, for a total of 768 pixels. (This is much smaller than an image that you would get from a phone camera, which has millions of pixels, but is a convenient size for our initial learning and experiments. We will build up to bigger, full-colour images soon.)\n", + "You can see that the background white pixels are stored as the number zero, black is the number 255, and shades of grey are between the two. The entire image contains 28 pixels across and 28 pixels down, for a total of 768 pixels. (This is much smaller than an image that you would get from a phone camera, which has millions of pixels, but is a convenient size for our initial learning and experiments. We will build up to bigger, full-colour images soon.)\n", "\n", "So, now you've seen what an image looks like to a computer, let's recall our goal: create a model that can recognise “3”s and “7”s. How might you go about getting a computer to do that?\n", "\n", @@ -1380,7 +1380,9 @@ "source": [ "Step one for our simple model is to get the average of pixel values for each of our two groups. In the process of doing this, we will learn a lot of neat Python numeric programming tricks!\n", "\n", - "Let's create a tensor containing all of our threes stacked together. We already know how to create a tensor containing a single image. To create a tensor for every image in a directory, we can use a list comprehension. (Notice also that we use Jupyter to do some little checks of our work along the way; in this case, making sure that the number of returned items seems reasonable):" + "Let's create a tensor containing all of our threes stacked together. We already know how to create a tensor containing a single image. To create a tensor containing all the images in a directory, we will first use a Python list comprehension to create a plain list of the single image tensors.\n", + "\n", + "We will use Jupyter to do some little checks of our work along the way -- in this case, making sure that the number of returned items seems reasonable:" ] }, { @@ -1445,7 +1447,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We want to take the average of the pixels, which means we need to combine all of the items of this list into a single three-dimensional tensor. The most common way to describe such a tensor is to call it a *rank-3 tensor*. We often need to stack up individual tensors in a collection into a single tensor. Unsurprisingly, PyTorch comes with a function called `stack`. Some things in PyTorch, such as taking a mean, require us to cast our integer types to float types. Since we'll be needing this later, we'll cast our tensor to `float` now. Casting in PyTorch is as simple as typing the name of the type you wish to cast to, and treating it as a method. Generally when images are floats, the pixels are expected to be be zero and one, so we also divide by 255 here." + "For every pixel position, we want to compute the average over all the images of the intensity of that pixel. To do this we first combine all the images in this list into a single three-dimensional tensor. The most common way to describe such a tensor is to call it a *rank-3 tensor*. We often need to stack up individual tensors in a collection into a single tensor. Unsurprisingly, PyTorch comes with a function called `stack`.\n", + "\n", + "Some operations in PyTorch, such as taking a mean, require us to cast our integer types to float types. Since we'll be needing this later, we'll also cast our stcked tensor to `float` now. Casting in PyTorch is as simple as typing the name of the type you wish to cast to, and treating it as a method.\n", + "\n", + "Generally when images are floats, the pixels are expected to be be zero and one, so we will also divide by 255 here." ] }, { @@ -1474,7 +1480,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Perhaps the most important attribute of tensor is its shape. This tells you the length of each axis. In this case, we can see that we have 6131 images, each of size 28 x 28 pixels. There is nothing specifically about this tensor that says that the first axis is the number of images, the second is the height, and the third is the width — the semantics of a tensor are entirely up to us, and how we construct it. As far as PyTorch is concerned, it is just a bunch of numbers in memory.\n", + "Perhaps the most important attribute of a tensor is its shape. This tells you the length of each axis. In this case, we can see that we have 6131 images, each of size 28 x 28 pixels. There is nothing specifically about this tensor that says that the first axis is the number of images, the second is the height, and the third is the width — the semantics of a tensor are entirely up to us, and how we construct it. As far as PyTorch is concerned, it is just a bunch of numbers in memory.\n", "\n", "The length of a tensor's shape is its rank." ] @@ -1537,7 +1543,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we can calculate the mean across all of these tensors by taking the mean of dimension zero." + "Finally, we can compute what the ideal three looks like. We calculate the mean of all the image tensors, by taking the mean along dimension zero of our stacked, rank-3 tensor. This is the dimension which indexes over all the images.\n", + "\n", + "In other words, for every pixel position, this will compute the average of that pixel over all images. So the result will be one value for every pixel position -- in other words, a single image. Here it is:" ] }, { @@ -1633,7 +1641,9 @@ "source": [ "We can't just add up the differences between the pixels of this image and the ideal digit. Why not?...\n", "\n", - "Because some will be too high, some will be too low, but overall these will balance out. Instead, there's two main ways data scientists measure *distance* in this context:\n", + "Because some differences will be positive, some will be negative, and these differences cancel out, resulting in a situation where an image which is too dark in some places and too light in others might be shown as having zero total differences from the ideal. That would be misleading!\n", + "\n", + "To avoid this, there's two main ways data scientists measure *distance* in this context:\n", "\n", "- Take the mean of the *absolute value* of differences (_absolute value_ is the function that replaces negative values with positive values). This is called the *mean absolute difference* or *L1 norm*\n", "- Take the mean of the *square* of differences (which makes everything positive) and then take the *square root* (which *undoes* the squaring). This is called the *root mean squared error (RMSE)* or *L2 norm*.\n", @@ -2442,7 +2452,7 @@ "\n", "There are many different ways to do each of these seven steps, and we will be learning about them throughout the rest of this book. These are the details which make a big difference for deep learning practitioners. But it turns out that the general approach to each one generally follows some basic principles:\n", "\n", - "- **Initialize**: we initialise the weights to random values. This may sound surprising. There are certainly other choices we could make, such as initialising them to the percentage of times that that pixel is activated for that category. But since we already know that we have a routine to improve these weights, it turns out that just starting with random weights works perfectly well\n", + "- **Initialize**: we initialise the parameters to random values. This may sound surprising. There are certainly other choices we could make, such as initialising them to the percentage of times that that pixel is activated for that category. But since we already know that we have a routine to improve these weights, it turns out that just starting with random weights works perfectly well\n", "- **Loss**: This is the thing Arthur Samuel refered to: \"*testing the effectiveness of any current weight assignment in terms of actual performance*\". We need some function that will return a number that is small if the performance of the model is good, and vice versa (the standard approach is to treat a small loss as good, and a large loss as bad, although this is just a convention)\n", "- **Step**: A simple way to figure out whether a weight should be increased a bit, or decreased a bit, would be just to try it. Increase the weight by a small amount, and see if the loss goes up or down. Once you find the correct direction, you could then change that amount by a bit more, and a bit less, until you find an amount which works well. However, this is slow! As we will see, the magic of calculus allows us to directly figure out which direction, and roughly how much, to change each weight, without having to try all these small changes, by calculating *gradients*. This is just a performance optimisation, we would get exactly the same results by using the slower manual process as well\n", "- **Stop**: We have already discussed how to choose how many epochs to train a model for. This is where that decision is applied. For our digit classifier, we would keep training until the accuracy of the model started getting worse, or we ran out of time." @@ -2562,7 +2572,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## The gradient" + "### The gradient" ] }, { @@ -2766,14 +2776,562 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## The loss function" + "## Stepping with a learning rate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "As we've seen, if we are going to calculate gradients (which we need), then we need some *loss function* that represents how good our model is. The obvious approach would be to use the accuracy for this purpose. In this case, we would calculate our prediction for each image, and then calculate the overall accuracy (remember, at first we simply use random weights), and then calculate the gradients of each weight with respect to that accuracy calculation.\n", + "The gradient only tells us the slope of our function, it doesn't actually tell us how far to adjust the parameters. It gives us some idea of how far to adjust them; if the slope is very large, then that may suggest that we have more adjustments to do, whereas if the slope is very small, that may suggest that we are close to the optimal value.\n", + "\n", + "Deciding how to change our parameters based on the value of the gradients is an important part of the deep learning process. Nearly all approaches start with the basic idea of multiplying the gradient by some small number, called the *learning rate* (LR). The learning rate is often a number between 0.001 and 0.1, although it could be anything. Often, people select a learning rate just by trying a few, and finding which results in the best model after training (we'll show you a better approach later in this book, called the *learning rate finder*). Once you've picked a learning rate, you can adjust your parameters using this simple function:\n", + "\n", + "```\n", + "w -= gradient(w) * lr\n", + "```\n", + "\n", + "This is known as *stepping* your parameters, using a *optimiser step*.\n", + "\n", + "If you pick a learning rate that's too low, it can mean having to do for a lot of steps:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"An" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Although picking a learning rate that's too high is even worse--it can actually result in the loss getting *worse*!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"An" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the learning rate is too high, it may also \"bounce\" around, rather than actually diverging; this has the result of taking many steps to train successfully:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\"An" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### An end-to-end SGD example" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To understand SGD, it might be easiest to start with a simple, synthetic, example. Let's imagine you were measuring the speed of a roller coaster as it went over the top of a hump. It would start fast, and then get slower as it went up the hill, and then would be slowest at the top, and it would then speed up again as it goes downhill. If you're measuring the speed manually every second for 20 seconds, it might look something like this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14., 15., 16., 17., 18., 19.])" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "time = torch.arange(0,20).float(); time" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXMAAAD7CAYAAACYLnSTAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAWy0lEQVR4nO3dfYxcV3nH8e8vtpWsbC+u48XFW9luDLGpExw3i4KIAkhJa0FLcWMkTFIISMhAlKqo1IK0OLh5UQBD/yiEF0sp5LUNhrUhpGA1SlJICikbXMdaYVt1UkPWKawhXrz2OjHu0z/mTjKezM7c8cydlzu/jzSS59wzd56czD5z5txzz1FEYGZm3e2sdgdgZmaNczI3M8sBJ3MzsxxwMjczywEnczOzHJjZjjddsGBBLF26tB1vbWbWtZ544onDETFQ6VhbkvnSpUsZGRlpx1ubmXUtSQenO+ZhFjOzHHAyNzPLASdzM7MccDI3M8sBJ3Mzsxxoy2yWM7Vj1xhbdu7j0JEpFs3rY+Oa5axdPdjusMzM2q5rkvmOXWNcP7yHqZOnABg7MsX1w3sAnNDNrOd1zTDLlp37XkzkRVMnT7Fl5742RWRm1jm6JpkfOjJVV7mZWS/pmmS+aF5fXeVmZr2ka5L5xjXL6Zs147Syvlkz2LhmeZsiMjPrHF1zAbR4kdOzWczMXq5rkjkUErqTt5nZy3XNMIuZmU3PydzMLAeczM3McsDJ3MwsB2omc0mTZY9Tkj5fcvxySXslHZf0sKQl2YZsZmblaibziJhTfAALgSlgG4CkBcAwsAmYD4wA92UXrpmZVVLvMMs7gV8CP0ieXwmMRsS2iDgBbAZWSVrRvBDNzKyWepP5NcCdERHJ85XA7uLBiDgGHEjKTyNpg6QRSSPj4+NnGq+ZmVWQOplLWgy8GbijpHgOMFFWdQKYW/76iNgaEUMRMTQwMHAmsZqZ2TTq6Zm/F3g0Ip4uKZsE+svq9QNHGw3MzMzSqzeZ31FWNgqsKj6RNBtYlpSbmVmLpErmkt4IDJLMYimxHbhA0jpJ5wA3AE9GxN7mhmlmZtWkXWjrGmA4Ik4bPomIcUnrgC8AdwOPA+ubG6KZWffLeg/jVMk8Ij5Y5diDgKcimplNoxV7GPt2fjOzjLViD2MnczOzjLViD2MnczOzjLViD2MnczOzjLViD+Ou2jbOzKwbtWIPYydzM7MWyHoPYw+zmJnlgJO5mVkOOJmbmeWAk7mZWQ44mZuZ5YCTuZlZDjiZm5nlgJO5mVkOOJmbmeWAk7mZWQ6kTuaS1kv6qaRjkg5Iuiwpv1zSXknHJT0saUl24ZqZWSWp1maR9EfAp4F3Af8JvCopXwAMAx8A7gduAu4D3pBFsI3KetsmM7N2SbvQ1t8DN0bEj5LnYwCSNgCjEbEteb4ZOCxpRadt6tyKbZvMzNql5jCLpBnAEDAg6b8lPSPpC5L6gJXA7mLdiDgGHEjKy8+zQdKIpJHx8fHm/Rek1Iptm8zM2iXNmPlCYBbwTuAy4CJgNfAJYA4wUVZ/AphbfpKI2BoRQxExNDAw0FDQZ6IV2zaZmbVLmmRezHafj4hnI+Iw8A/A24BJoL+sfj9wtHkhNkcrtm0yM2uXmsk8Ip4DngGiwuFRYFXxiaTZwLKkvKO0YtsmM7N2SXsB9KvAX0r6HnAS+AjwHWA7sEXSOuAB4AbgyU67+Amt2bbJzPKr02fDpU3mNwELgP3ACeDrwC0RcSJJ5F8A7gYeB9ZnEWgzZL1tk5nlUzfMhkuVzCPiJHBt8ig/9iCwoslxmZl1jGqz4Tolmft2fjOzGrphNpyTuZlZDd0wG87J3Myshm6YDZf2AqiZWc/qhtlwTuZmZil0+mw4D7OYmeWAk7mZWQ44mZuZ5YCTuZlZDjiZm5nlgJO5mVkOOJmbmeWAk7mZWQ44mZuZ5YCTuZlZDjiZm5nlQKpkLukRSSckTSaPfSXHrpJ0UNIxSTskzc8uXDMzq6Senvl1ETEneSwHkLQS+ArwHmAhcBz4YvPDNDOzahpdNfFq4P6I+D6ApE3ATyXNjYijDUdnZmap1NMzv1XSYUmPSXpLUrYS2F2sEBEHgBeA88tfLGmDpBFJI+Pj443EbGZmZdIm848B5wGDwFbgfknLgDnARFndCWBu+QkiYmtEDEXE0MDAQAMhm5lZuVTJPCIej4ijEfF8RNwBPAa8DZgE+suq9wMeYjEza6EznZoYgIBRYFWxUNJ5wNnA/sZDMzOztGpeAJU0D7gE+Hfgt8C7gDcBH0le/0NJlwE/AW4Ehn3x08ystdLMZpkF3AysAE4Be4G1EbEPQNKHgHuAc4EHgfdnE6qZmU2nZjKPiHHg9VWO3wvc28ygzMysPr6d38wsBxq9aain7Ng1xpad+zh0ZIpF8/rYuGY5a1cPtjssMzMn87R27Brj+uE9TJ08BcDYkSmuH94D4IRuZm3nYZaUtuzc92IiL5o6eYotO/dN8wozs9ZxMk/p0JGpusrNzFrJyTylRfP66io3M2slJ/OUNq5ZTt+sGaeV9c2awcY1y9sUkZnZS3wBNKXiRU7PZjGzTuRkXoe1qwedvM2sI3mYxcwsB5zMzcxywMMsZtYT8n4Ht5O5meVeL9zB7WEWM8u9XriD28nczHKvF+7gdjI3s9zrhTu4nczNLPd64Q7uupK5pNdIOiHp7pKyqyQdlHRM0g5J85sfppnZmVu7epBbr7yQwXl9CBic18etV16Ym4ufUP9sltuAHxefSFoJfAX4EwobOm8Fvgisb1aAZmbNkPc7uFMnc0nrgSPAfwCvToqvBu6PiO8ndTYBP5U0NyKONjtYMzOrLNUwi6R+4Ebgo2WHVgK7i08i4gDwAnB+hXNskDQiaWR8fPzMIzYzs5dJO2Z+E3B7RPy8rHwOMFFWNgHMLT9BRGyNiKGIGBoYGKg/UjMzm1bNYRZJFwFXAKsrHJ4E+svK+gEPsZiZtVCaMfO3AEuBn0mCQm98hqQ/AL4HrCpWlHQecDawv9mBmpnZ9NIk863Av5Q8/xsKyf3DwCuBH0q6jMJslhuBYV/8NDNrrZrJPCKOA8eLzyVNAiciYhwYl/Qh4B7gXOBB4P0ZxWpmZtOoe9XEiNhc9vxe4N5mBWRmZvXz7fxmZjngZG5mlgNO5mZmOeBkbmaWA07mZmY54GRuZpYDTuZmZjngZG5mlgNO5mZmOeBkbmaWA07mZmY5UPfaLGZm7bBj1xhbdu7j0JEpFs3rY+Oa5bne07NeTuZm1vF27Brj+uE9TJ08BcDYkSmuH94D4ISe8DCLmXW8LTv3vZjIi6ZOnmLLzn1tiqjzOJmbWcc7dGSqrvJe5GRuZh1v0by+usp7UapkLuluSc9K+o2k/ZI+UHLsckl7JR2X9LCkJdmFa2a9aOOa5fTNmnFaWd+sGWxcs7xNEXWetD3zW4GlEdEP/Blws6SLJS0AhoFNwHxgBLgvk0jNrGetXT3IrVdeyOC8PgQMzuvj1isv9MXPEqlms0TEaOnT5LEMuBgYjYhtAJI2A4clrYiIvU2O1cx62NrVg07eVaQeM5f0RUnHgb3As8C/AiuB3cU6EXEMOJCUl79+g6QRSSPj4+MNB25mZi9Jncwj4lpgLnAZhaGV54E5wERZ1YmkXvnrt0bEUEQMDQwMnHnEZmb2MnXNZomIUxHxKPB7wIeBSaC/rFo/cLQ54ZmZWRpnOjVxJoUx81FgVbFQ0uyScjMza5GayVzSKyWtlzRH0gxJa4B3Aw8B24ELJK2TdA5wA/CkL36ambVWmp55UBhSeQZ4Dvgs8JGI+FZEjAPrgFuSY5cA6zOK1czMplFzamKSsN9c5fiDwIpmBpVXXvXNzLLiVRNbxKu+Wa9zZyZbXpulRbzqm/WyYmdm7MgUwUudmR27xtodWm44mbeIV32zXubOTPaczFvEq75ZL3NnJntO5i3iVd+sl7kzkz0n8xbxqm/Wy9yZyZ5ns7SQV32zXlX83Hs2S3aczM2sJdyZyZaHWczMcsDJ3MwsB5zMzcxywMnczCwHfAG0i3htCzObjpN5l/BCXWZWjYdZuoTXtjCzapzMu4TXtjCzatJsG3e2pNslHZR0VNIuSW8tOX65pL2Sjkt6WNKSbEPuTV7bwsyqSdMznwn8nMJuQ68ANgFfl7RU0gJgOCmbD4wA92UUa0/z2hZmVk2abeOOAZtLir4j6WngYuBcYDQitgFI2gwclrTCmzo3VzPWtvBsGLP8qns2i6SFwPnAKIWNnncXj0XEMUkHgJXA3rLXbQA2ACxevLiBkHtXI2tbeDaMWb7VdQFU0izgHuCOpOc9B5goqzYBzC1/bURsjYihiBgaGBg403jtDHk2jFm+pU7mks4C7gJeAK5LiieB/rKq/cDRpkRnTePZMGb5liqZSxJwO7AQWBcRJ5NDo8CqknqzgWVJuXUQz4Yxy7e0PfMvAa8F3h4RpV257cAFktZJOge4AXjSFz87j2fDmOVbmnnmS4APAhcB/ytpMnlcHRHjwDrgFuA54BJgfZYB25nxtnVm+aaIaPmbDg0NxcjISMvf18ysm0l6IiKGKh3z7fxmZjngZG5mlgNeAtfMUvEdxJ3NydzMavIdxJ3PwyxmVpPvIO58TuZmVpPvIO58TuZmVpPvIO58TuZmVpPvIO58vgBqZjU1Yz19y5aTuZml0sh6+pY9J3NLzfOMzTqXk7ml4nnGZp3NF0AtFc8zNutsTuaWiucZm3U2D7NYKovm9TFWIXHXM8/YY+5m2XHP3FJpdJ5xccx97MgUwUtj7jt2jWUQrVnvSbsH6HWSRiQ9L+lrZccul7RX0nFJDyc7E1nONLpTkcfc22/HrjEu/dRD/P7HH+DSTz3kL9KcSTvMcgi4GVgDvPi7WtICYBj4AHA/cBNwH/CG5oZpnaCRecYec28vz0bKv1Q984gYjogdwK/KDl0JjEbEtog4AWwGVkla0dwwrdt5bY/28i+j/Gt0zHwlsLv4JCKOAQeS8tNI2pAM1YyMj483+LbWbby2R3v5l1H+NZrM5wATZWUTwNzyihGxNSKGImJoYGCgwbe1btPomLs1xr+M8q/RqYmTQH9ZWT9wtMHzWg55bY/22bhm+Wlj5uBfRnnTaM98FFhVfCJpNrAsKTezDuFfRvmXqmcuaWZSdwYwQ9I5wG+B7cAWSeuAB4AbgCcjYm9G8ZrZGfIvo3xL2zP/BDAFfBz4i+Tfn4iIcWAdcAvwHHAJsD6DOM3MrIpUPfOI2Exh2mGlYw8CnopoZtZGvp3fzCwHnMzNzHLAydzMLAe8BK5Zl/ASwlaNk7lZF/BCWVaLh1nMuoAXyrJanMzNuoAXyrJaPMxiXaOXx4ybsW2f5Zt75tYV8rDtXCM7/XgJYavFydy6QrePGTf6ZeSFsqwWD7NYV+j2MeNqX0ZpE7IXyrJq3DO3rtDtmyt0+5eRdT4nc+sK3T5m3O1fRtb5nMytK3T7mHG3fxlZ5/OYuXWNbh4zLsbdq1MrLXtO5mYt0s1fRtb5mjLMImm+pO2Sjkk6KOmqZpzXzMzSaVbP/DbgBWAhcBHwgKTdEeGNnS03evkOVOt8DffMJc2msA/opoiYjIhHgW8D72n03GadIg93oFq+NWOY5XzgVETsLynbDaxswrnNmqaR2+m7/Q5Uy79mDLPMASbKyiaAuaUFkjYAGwAWL17chLc1S6/R9cB90491umb0zCeB/rKyfuBoaUFEbI2IoYgYGhgYaMLbmqXXaM/aN/1Yp2tGMt8PzJT0mpKyVYAvflrHaLRn7Zt+rNM1nMwj4hgwDNwoabakS4F3AHc1em6zZmm0Z93td6Ba/jVrauK1wD8BvwR+BXzY0xKtk2xcs/y0MXOov2ftm36skzUlmUfEr4G1zTiXWRZ8O73lnW/nt57hnrXlmVdNNDPLASdzM7MccDI3M8sBJ3MzsxxwMjczywFFROvfVBoHDjZwigXA4SaFkwXH1xjH1xjH15hOjm9JRFRcD6UtybxRkkYiYqjdcUzH8TXG8TXG8TWm0+ObjodZzMxywMnczCwHujWZb213ADU4vsY4vsY4vsZ0enwVdeWYuZmZna5be+ZmZlbCydzMLAeczM3McqAjk7mk+ZK2Szom6aCkq6apJ0mflvSr5PEZSco4trMl3Z7EdVTSLklvnabu+ySdkjRZ8nhLlvEl7/uIpBMl71lxo8s2td9k2eOUpM9PU7cl7SfpOkkjkp6X9LWyY5dL2ivpuKSHJS2pcp6lSZ3jyWuuyDI+SW+Q9G+Sfi1pXNI2Sa+qcp5Un4smxrdUUpT9/9tU5Tytbr+ry2I7nsR78TTnyaT9mqUjkzlwG/ACsBC4GviSpJUV6m2gsCnGKuB1wJ8CH8w4tpnAz4E3A68ANgFfl7R0mvo/jIg5JY9HMo6v6LqS95xuO52Wt19pW1D4/zsFbKvykla03yHgZgq7Zb1I0gIKWyJuAuYDI8B9Vc7zz8Au4Fzg74BvSGrG7uUV4wN+h8LMi6XAEgqbqH+1xrnSfC6aFV/RvJL3vKnKeVrafhFxT9nn8VrgKeAnVc6VRfs1Rcclc0mzgXXApoiYjIhHgW8D76lQ/RrgcxHxTESMAZ8D3pdlfBFxLCI2R8T/RMT/RcR3gKeBit/mHa7l7VfmnRS2GvxBC9/zZSJiOCJ2UNjysNSVwGhEbIuIE8BmYJWkFeXnkHQ+8IfAJyNiKiK+Ceyh8FnOJL6I+G4S228i4jjwBeDSRt+vWfHVox3tV8E1wJ3RpVP8Oi6ZA+cDpyJif0nZbqBSz3xlcqxWvcxIWkgh5un2PF0t6bCk/ZI2SWrV7k63Ju/7WJWhiXa3X5o/nna1H5S1T7J5+QGm/yw+FRFHS8pa3Z5vYvrPYVGaz0WzHZT0jKSvJr92Kmlr+yXDZ28C7qxRtR3tl0onJvM5wERZ2QQwN0XdCWBO1uO+RZJmAfcAd0TE3gpVvg9cALySQg/j3cDGFoT2MeA8YJDCz/D7JS2rUK9t7SdpMYWhqjuqVGtX+xU18lmsVrfpJL0OuIHq7ZP2c9Esh4HXUxgCuphCW9wzTd22th/wXuAHEfF0lTqtbr+6dGIynwT6y8r6KYwH1qrbD0y24meSpLOAuyiM7V9XqU5EPBURTyfDMXuAGykMLWQqIh6PiKMR8XxE3AE8BrytQtW2tR+FP55Hq/3xtKv9SjTyWaxWt6kkvRr4LvBXETHtkFUdn4umSIZJRyLitxHxCwp/J38sqbydoI3tl3gv1TsWLW+/enViMt8PzJT0mpKyVVT++TiaHKtVr6mSnuvtFC7grYuIkylfGkBLfjWkfN+2tF+i5h9PBa1uv9PaJ7mes4zpP4vnSSrtSWbensnwwIPATRFxV50vb3V7FjsJ030WW95+AJIuBRYB36jzpe36e66o45J5Mi45DNwoaXbS0O+g0Asudyfw15IGJS0CPgp8rQVhfgl4LfD2iJiarpKktyZj6iQXzTYB38oyMEnzJK2RdI6kmZKupjAWuLNC9ba0n6Q3UvipWm0WS8vaL2mnc4AZwIxi2wHbgQskrUuO3wA8WWlILbnG81/AJ5PX/zmFGULfzCo+SYPAQ8BtEfHlGueo53PRrPgukbRc0lmSzgX+EXgkIsqHU9rSfiVVrgG+WTZeX36OzNqvaSKi4x4UpoHtAI4BPwOuSsovozAMUKwn4DPAr5PHZ0jWm8kwtiUUvpFPUPhpWHxcDSxO/r04qftZ4BfJf8dTFIYJZmUc3wDwYwo/T48APwL+qFPaL3nfrwB3VShvS/tRmKUSZY/NybErgL0UplA+Aiwted2XgS+XPF+a1JkC9gFXZBkf8Mnk36Wfw9L/v38LfLfW5yLD+N5NYabXMeBZCp2H3+2U9kuOnZO0x+UVXteS9mvWwwttmZnlQMcNs5iZWf2czM3McsDJ3MwsB5zMzcxywMnczCwHnMzNzHLAydzMLAeczM3McuD/AdndnL7Vn+NhAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "speed = torch.randn(20)*3 + 0.75*(time-9.5)**2 + 1\n", + "plt.scatter(time,speed);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We've added a bit of random noise, since measuring things manually isn't precise. This means it's not that easy to answer the question: what was the roller coaster's lowest speed? Using SGD we can try to find a function that matches our observations. We can't consider every possible function, so let's use a guess that it will be quadratic, i.e. a function of the form `a*(time**2)+(b*time)+c`.\n", + "\n", + "We want to distinguish clearly between the function's input (the time when we are measuring the coaster's speed) and its parameters (the values that define *which* quadratic we're trying). So let us collect the parameters in one argument and separate the input, `t`, and the parameters, `params` in the function's signature: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def f(t, params):\n", + " a,b,c = params\n", + " return a*(t**2) + (b*t) + c" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In other words, we've restricted the problem of finding the best imaginable function that fits the data, to finding the best *quadratic* function. This greatly simplifies the problem, since every quadratic function is fully defined by the three parameters `a`, `b`, and `c`. So to find the best quadratic function, we only need to find the best values for `a`, `b`, and `c`.\n", + "\n", + "If we can solve this problem for the three parameters of a quadratic function, we'll be able to apply the same approach for other, more complex functions with more parameters--such as a neural net. So let's find the parameters for `f` first, and then we'll come back and do the same thing for the MNIST dataset with a neural net.\n", + "\n", + "We need to define first what we mean by \"best\". We can define this precisely by choosing a *loss function*, which will return a value based on a prediction and a target, where lower values of the function correspond to \"better\" predictions. For continuous data, it's common to use *mean squared error*:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def mse(preds, targets): return ((preds-targets)**2).mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's work through our 7 step process.\n", + "\n", + "Step 1--*Initialize* the parameters to random values, and tell PyTorch that we want to track their gradients, using `requires_grad_`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "params = torch.randn(3).requires_grad_()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#hide\n", + "orig_params = params.clone()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Step 2--Calculate the *predictions*:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "preds = f(time, params)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's create a little function to see how close our predictions are to our targets, and take a look:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def show_preds(preds, ax=None):\n", + " if ax is None: ax=plt.subplots()[1]\n", + " ax.scatter(time, speed)\n", + " ax.scatter(time, to_np(preds), color='red')\n", + " ax.set_ylim(-300,100)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEACAYAAABLfPrqAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAc1ElEQVR4nO3dfbBU9Z3n8fcHsYAAd8B4hyxMAaurMHV1iOtNOZuYxIxuTDKbLUf2D81NHGY2koxFaqusMjErRCtKaeLU/pHZyQOWBHWZncQUWuM86M5UxESzk9pmLExuBHeJXBKfckkQuYD4wHf/OKe1bbvvvX3POd2nuz+vqi66z+/Xp78cDv3t83s6igjMzMxaMavTAZiZWfdx8jAzs5Y5eZiZWcucPMzMrGVOHmZm1jInDzMza5mTh5mZtSzX5CFpg6SKpBOSttWVXSxpj6Rjkh6WtKKmbI6krZJekvS8pGvzjMvMzPKV95XHs8AtwNbajZJOB3YAm4DTgArwnZoqNwFnASuADwGfl/SRnGMzM7OcqIgZ5pJuAX4nItalr9cD6yLivenr+cBB4LyI2CPpGeBPIuJ/peU3A2dFxBW5B2dmZpnNbtPnDAG7qy8i4qikfcCQpBeApbXl6fPLGu0oTUTrAebPn3/+6tWrCwvazKwX7dq162BEDGbZR7uSxwJgvG7bYWBhWlZ9XV/2NhGxBdgCMDw8HJVKJd9Izcx6nKSxrPto12irCWCgbtsAcCQto668WmZmZiXUruQxCqypvkj7PM4ERiPiEPBcbXn6fLRNsZmZWYvyHqo7W9Jc4BTgFElzJc0G7gPOkbQ2Lf8S8ERE7EnfejewUdJiSauBq4FtecZmZmb5yfvKYyNwHLge+GT6fGNEjANrgc3AIeACoHYk1Y3APmAMeAS4PSIezDk2MzPLSSFDddvFHeZmZq2TtCsihrPsw8uTmJlZy5w8zMysZU4eZmbWMicPMzNrWbtmmJfS/Y8/w+0P7eXZF4+zdNE8rrt0FZedt6zTYZmZlV7fJo/7H3+GL+74CcdffR2AZ148zhd3/ATACcTMbAp922x1+0N730gcVcdffZ3bH9rboYjMzLpH3yaPZ1883tJ2MzN7U982Wy1dNI9nGiSKpYvmdSAaM7PGyto327dXHtdduop5p57ylm3zTj2F6y5d1aGIzMzeqto3+8yLxwne7Ju9//FnOh1a/yaPy85bxq2Xn8uyRfMQsGzRPG69/NxSZHQzMyh332zfNltBkkCcLMysrMrcN9vXySOrsrZFmllvKHPfbN82W2VV5rZIMyuX+x9/hvfd9n3+9fV/x/tu+/60vyfK3Dfr5DFDZW6LNLPyyPJDs8x9s262mqEyt0WaWXlM9kNzOkmgrH2zvvKYoWZtjmVoizSz8ujVH5ptTR6Sdkp6WdJE+thbU/YJSWOSjkq6X9Jp7YytVWVuizSz8ujVH5qduPLYEBEL0scqAElDwLeATwFLgGPA1zsQ27SVuS3SzMqjV39olqXPYwR4ICJ+ACBpE/CkpIURcaSzoTVX1rZIMyuP6ndErw3r70TyuFXSbcBe4IaI2AkMAT+qVoiIfZJeAc4GdnUgxsJ5johZ/+jFH5rtTh5fAH4GvAJcATwg6d3AAuBwXd3DwML6HUhaD6wHWL58eaHBFsX3EjGzbtfWPo+I+HFEHImIExFxF/AY8DFgAhioqz4AvK3JKiK2RMRwRAwPDg4WH3QBPEfErLvMdJJfL+t0n0cAAkaBNdWNks4A5gBPdSiuQvXq0D2zXuSWgsbaduUhaZGkSyXNlTRb0gjwAeAhYDvwcUnvlzQf+DKwo8yd5Vn06tA9s17kloLG2tlsdSpwCzAOHAQ+B1wWEXsjYhT4LEkS+RVJX8c1bYytrXp16J5ZL3JLQWNta7aKiHHgPZOU/xXwV+2Kp5N6deieWS8q88q2ndTpPo++1YtD98zKKsvQ+OsuXfWWPg9wSwE4eZhZj8va4e2WgsacPMysp2Vd1RbcUtCIk0eX8gx16zczPefd4V0MJ48u5HHn1m+ynPPu8C6G7+fRhTzu3PpNlnPeQ+OL4SuPLpTHZbibvaybZDnn3eFdDCePLpT1MjyPZi8nH2unrOe8O7zz52arLpT1Mjxrs1c1+Tzz4nGCN5OPF4uzorjpqXx85dGFsl6GZ232ymPoo/WfLFerbnoqHyePLpXlMjxrE4CHPlqr8mgqddNTubjZqg9lbQLIuiqw743QfzxCsPf4yqMPZW0CyLLWjzvru1eW4+6r1d7j5NGnsjQBZEk+WftLPEGyM7Ied0/U6z1OHjYjM00+7qzvTlmPu1em7T1OHtZW7qzvnE42O3m0VO9x8rC2yvoLNI/mj27uM5lp7GVodvJoqd7S36Ottm+HlSth1qzkz+3bOx1Rz7vsvGXcevm5LFs0DwHLFs3j1svPbamzPstIsW6e4Jgl9qyjnTxJz+qV5spD0mnAncCHSe5x/sX01rTF2L4d1q+HY8eS12NjyWuAkZHp7+OGG+DAAVi+HDZvnv57+1inOuur78vaYZ/lqiXL+7PE7mYny1tpkgfwl8ArwBLg3cDfSdodEaOFfNoNN7yZOKqOHUu2TycBZE0+TjwzliX5ZPkSzdr0k/X9WWJ3s5PlrRTNVpLmA2uBTRExERGPAn8DfKqwDz1woLXt9SZLPlOpJp6xMYh4M/G42axwWSY4Zm36yfr+LLG72cnyVorkAZwNvB4RT9Vs2w0M1VeUtF5SRVJlfHx85p+4fHlr2+tlST5ZEo9lkuVLNGvTT9b3Z4k9a1+TWb2yJI8FwOG6bYeBhfUVI2JLRAxHxPDg4ODMP3HzZnjHO9667R3vSLZPR5bkk/WqB9zZP0NZvkSzLsuS9f1ZE8Bl5y3jsev/gKdv+0Meu/4PnDgsk7L0eUwAA3XbBoAjhX1itX9hpv0Omze/tc8Dpp98li9PmqoabZ+OPDr7+9hM2+6zDjPOY6Kc+x2sLMpy5fEUMFvSWTXb1gDFdJZXjYzA/v1w8mTyZytfvCMjsGULrFgBUvLnli3T20fWqx43e3VEHr/83XRkvUIR0ekYAJD010AAnyYZbfX3wHsnG201PDwclUqlTRHmLMtoq1mzko72elKSCIv+fDPrapJ2RcRwln2UpdkK4BpgK/Ar4NfAnxU2TLcMRkZm/mXtZi8z67CyNFsREb+JiMsiYn5ELC90gmC3c7OXmXVYaZKHtSBLfwvkM9rLzPqak0e3ytLZn3WOC3iosFmfc/LoR1mbvTxD3qzvOXn0o6zNXu4zMet7pRmqOxNdPVS3m+UxVNjMOiaPobq+8rDW5dFnYmZdzcnDWpdHn4k72826mpOHtS5Ln4k72816gvs8rL1Wrmw8O37FimTIsZkVzn0e1n08QdGsJzh5WHu5s92sJzh5WHtl7Ww3s1Jw8rD2yjpBETxay6wEyrQku/WLLMvRezl5s1LwlYd1Fy+NYlYKTh7WXTxay6wUnDysu3i0llkptCV5SNop6WVJE+ljb135JySNSToq6X5Jp7UjLutCHq1lVgrtvPLYEBEL0seq6kZJQ8C3gE8BS4BjwNfbGJd1kzxGa5lZZmVothoBHoiIH0TEBLAJuFzSwg7HZWWV5S6K4KG+ZjloZ/K4VdJBSY9Juqhm+xCwu/oiIvYBrwBnN9qJpPWSKpIq4+PjhQZsPcgLM5rlol3J4wvAGcAyYAvwgKQz07IFwOG6+oeBhlceEbElIoYjYnhwcLCoeK1XeaivWS4yJ4+0MzyaPB4FiIgfR8SRiDgREXcBjwEfS3cxAQzU7XYAOJI1NrO38VBfs1xknmEeERfN5G2A0uejwJpqgaQzgDnAU1ljM3ub5csbLwnvob5mLSm82UrSIkmXSporabakEeADwENple3AxyW9X9J84MvAjojwlYflz0N9zXLRjj6PU4FbgHHgIPA54LKI2AsQEaPAZ0mSyK9I+jquaUNc1o881NcsF76ToFkrtm9POtcPHEiaujZvduKxrpPHnQS9qq7ZdHlFX7M3lGGSoFl38DBfszc4eZhNl4f5mr3BycNsuryir9kbnDzMpsvDfM3e4ORhNl0e5mv2Bo+2MmtFlvuvm/UQX3mYmVnLnDzM2sn3ErEe4WYrs3bxJEPrIb7yMGsXTzK0HuLkYdYunmRoPcTJw6xdPMnQeoiTh1m7eJKh9RAnD7N28SRD6yEebWXWTp5kaD3CVx5mZtayXJKHpA2SKpJOSNrWoPxiSXskHZP0sKQVNWVzJG2V9JKk5yVdm0dMZj3JkwytJPK68niW5D7lW+sLJJ0O7AA2AacBFeA7NVVuAs4CVgAfAj4v6SM5xWXWO6qTDMfGIOLNSYZOINYBuSSPiNgREfcDv25QfDkwGhH3RsTLJMlijaTVaflVwM0RcSgingTuANblEZdZT/EkQyuRdvR5DAG7qy8i4iiwDxiStBhYWluePh9qtjNJ69Mmssr4+HhBIZuVkCcZWom0I3ksAA7XbTsMLEzLqCuvljUUEVsiYjgihgcHB3MN1KzUPMnQSmTK5CFpp6Ro8nh0Gp8xAQzUbRsAjqRl1JVXy8yslicZWolMmTwi4qKIUJPHhdP4jFFgTfWFpPnAmST9IIeA52rL0+ejrf01zPqAJxlaieQ1VHe2pLnAKcApkuZKqk5AvA84R9LatM6XgCciYk9afjewUdLitBP9amBbHnGZ9ZyREdi/H06eTP504rAOyavPYyNwHLge+GT6fCNARIwDa4HNwCHgAuCKmvfeSNKBPgY8AtweEQ/mFJeZmRVAEdHpGGZseHg4KpVKp8MwM+sqknZFxHCWfXh5ErN+4dnpliMvjGjWD3wLXMuZrzzM+oFnp1vOnDzM+oFnp1vOnDzM+oFnp1vOnDzM+oFnp1vOnDzM+oFnp1vOPNrKrF/4FriWI195mJlZy5w8zMysZU4eZmbWMicPM5seL29iNdxhbmZT8/ImVsdXHmY2NS9vYnWcPMxsal7exOo4eZjZ1Ly8idVx8jCzqXl5E6uT1z3MN0iqSDohaVtd2UpJIWmi5rGppnyOpK2SXpL0vKRr84jJzHLk5U2sTl6jrZ4FbgEuBeY1qbMoIl5rsP0m4CxgBfAu4GFJP/N9zM1KxsubWI1crjwiYkdE3A/8egZvvwq4OSIORcSTwB3AujziMjOzYrSzz2NM0i8lfVvS6QCSFgNLgd019XYDQ812Iml92kRWGR8fLzZiMzNrqB3J4yDwHpJmqfOBhUB1auqC9M/DNfUPp3UaiogtETEcEcODg4MFhGtmZlOZMnlI2pl2eDd6PDrV+yNiIiIqEfFaRLwAbAA+LGkAmEirDdS8ZQA4MpO/jJmVmJc36SlTdphHxEU5f2akfyoiDkl6DlgD/GO6fQ0wmvNnmlkneXmTnpPXUN3ZkuYCpwCnSJoraXZadoGkVZJmSXon8DVgZ0RUm6ruBjZKWixpNXA1sC2PuMysJLy8Sc/Jq89jI3AcuB74ZPp8Y1p2BvAgSVPUT4ETwJU1770R2AeMAY8At3uYrlmP8fImPUcRMXWtkhoeHo5KpdLpMMxsKitXJk1V9VasgP372x1N35O0KyKGs+zDy5OYWfG8vEnPcfIws+J5eZOe45tBmVl7eHmTnuIrDzMza5mTh5mZtczJw8zMWubkYWZmLXPyMDOzljl5mFl38MKKpeKhumZWfl5YsXR85WFm5eeFFUvHycPMys8LK5aOk4eZld/y5a1tt8I5eZhZ+XlhxdJx8jCz8vPCiqXj0VZm1h28sGKp+MrDzMxaljl5SJoj6U5JY5KOSHpc0kfr6lwsaY+kY5IelrSi7v1bJb0k6XlJ12aNyczMipXHlcds4BfAB4HfAjYB35W0EkDS6cCOdPtpQAX4Ts37bwLOAlYAHwI+L+kjOcRlZmYFyZw8IuJoRNwUEfsj4mRE/C3wNHB+WuVyYDQi7o2Il0mSxRpJq9Pyq4CbI+JQRDwJ3AGsyxqXmZkVJ/c+D0lLgLOB0XTTELC7Wh4RR4F9wJCkxcDS2vL0+dAk+18vqSKpMj4+nnf4ZmY2DbkmD0mnAtuBuyJiT7p5AXC4ruphYGFaRl15tayhiNgSEcMRMTw4OJhP4GbW27yoYu6mTB6SdkqKJo9Ha+rNAu4BXgE21OxiAhio2+0AcCQto668WmZmll11UcWxMYh4c1FFJ5BMpkweEXFRRKjJ40IASQLuBJYAayPi1ZpdjAJrqi8kzQfOJOkHOQQ8V1uePh/FzCwPXlSxEHk1W30D+F3g4xFxvK7sPuAcSWslzQW+BDxR06x1N7BR0uK0E/1qYFtOcZlZv/OiioXIY57HCuAzwLuB5yVNpI8RgIgYB9YCm4FDwAXAFTW7uJGkA30MeAS4PSIezBqXmRngRRULknl5kogYAzRFnX8CVjcpOwH8afowM8vX5s1vvZEUeFHFHHh5EjPrbV5UsRBeGNHMep8XVcydrzzMzKxlTh5mZtYyJw8zM2uZk4eZmbXMycPMzFrm5GFmZi1z8jAzm4pX5X0bz/MwM5tMdVXe6gz16qq80NdzR3zlYWY2Ga/K25CTh5nZZLwqb0NOHmZmk/GqvA05eZiZTWbz5mQV3lpeldfJw8xsUl6VtyGPtjIzm4pX5X0bX3mYmVnL8rgN7RxJd0oak3RE0uOSPlpTvlJS1NyedkLSprr3b5X0kqTnJV2bNSYzMytWHs1Ws4FfAB8EDgAfA74r6dyI2F9Tb1FEvNbg/TcBZwErgHcBD0v6me9jbmZWXpmvPCLiaETcFBH7I+JkRPwt8DRw/jR3cRVwc0QciogngTuAdVnjMjOz4uTe5yFpCXA2MFpXNCbpl5K+Len0tO5iYCmwu6bebmAo77jMzCw/uSYPSacC24G7ImJPuvkg8B6SZqnzgYVpHYAF6Z+Ha3ZzOK3T7DPWS6pIqoyPj+cZvpmZTdOUyUPSzrTDu9Hj0Zp6s4B7gFeADdXtETEREZWIeC0iXkjLPixpAJhIqw3UfOQAcKRZPBGxJSKGI2J4cHCwpb+smZnlY8rkEREXRYSaPC4EkCTgTmAJsDYiXp1sl+mfiohDwHPAmpryNby9ycvMrHv14JLueU0S/Abwu8AlEXG8tkDSBcCLwP8FFgNfA3ZGRLWp6m5go6QKSfK5GviTnOIyM+usHl3SPY95HiuAzwDvBp6vmctRPSpnAA+SNEX9FDgBXFmzixuBfcAY8Ahwu4fpmlnP6NEl3RURU9cqqeHh4ahUKp0Ow8ysuVmzoNH3rAQnT7Y/HkDSrogYzrIPL09iZlakHl3S3cnDzKxIPbqku5OHmVmRenRJdy/JbmZWtB5c0t1XHmZm1jInDzMza5mTh5mZtczJw8zMWubkYWZmLXPyMDOzljl5mJlZy5w8zMysZU4eZmZlVtJ7gXiGuZlZWZX4XiC+8jAzK6sS3wvEycPMrKwOHGhtexs5eZiZlVWJ7wXi5GFmVlYlvhdILslD0v+Q9JyklyQ9JenTdeUXS9oj6Zikh9P7nlfL5kjamr73eUnX5hGTmVnXK/G9QHK5h7mkIeD/RcQJSauBncAfRsQuSacD+4BPAw8ANwPvj4jfT997K3Ah8B+BdwEPA+si4sGpPtf3MDcza11p7mEeEaMRcaL6Mn2cmb6+HBiNiHsj4mXgJmBNmmQArgJujohDEfEkcAewLo+4zMysGLnN85D0dZIv/XnA48Dfp0VDwO5qvYg4KmkfMCTpBWBpbXn6/LJJPmc9kA50ZkLS3hzCPx04mMN+ilDm2KDc8Tm2mSlzbFDu+LolthWTVZyO3JJHRFwj6XPAvwMuAqpXIguA8brqh4GFaVn1dX1Zs8/ZAmzJIeQ3SKpkvYQrSpljg3LH59hmpsyxQbnj66fYpmy2krRTUjR5PFpbNyJej4hHgd8B/izdPAEM1O12ADiSllFXXi0zM7OSmjJ5RMRFEaEmjwubvG02b/Z5jAJrqgWS5qdloxFxCHiutjx9PjqTv4yZmbVH5g5zSb8t6QpJCySdIulS4Erg+2mV+4BzJK2VNBf4EvBEROxJy+8GNkpanHaiXw1syxpXi3JtBstZmWODcsfn2GamzLFBuePrm9gyD9WVNAh8j+SKYRYwBnwtIu6oqXMJ8N9JOml+TDIUd39aNgf4BvCfgOPAVyLiv2UKyszMCpXLPA8zM+svXp7EzMxa5uRhZmYt64vkIek0SfdJOippTNInmtSTpK9I+nX6+KokFRzbHEl3pnEdkfS4pI82qbtO0uuSJmoeFxUc305JL9d8XsNJme0+dnXHYCI9Ln/RpG7hx03SBkkVSSckbasra7q2W4P9rEzrHEvfc0lRsUn6fUn/KOk3ksYl3SvpX02yn2mdCznGtzKdElD777Zpkv2089iN1MV1LI31/Cb7yf3YTfXdUfR51xfJA/hL4BVgCTACfEPJelz11pPMbl8D/B7wH4DPFBzbbOAXwAeB3wI2Ad+VtLJJ/f8dEQtqHjsLjg9gQ83nrWpSp63HrvYYkPy7HgfuneQtRR+3Z4FbgK21G5Ws7baD5N/1NKACfGeS/fxPkhUa3gncAHxPyaCU3GMDFpOMwFlJMpjlCPDtKfY1nXMhr/iqFtV85s2T7Kdtxy4ittedg9cAPwf+ZZJ95X3smn53tOW8i4iefgDzSRLH2TXb7gFua1D3R8D6mtf/GfjnDsT8BLC2wfZ1wKNtjmUn8Olp1OvYsQP+mOQ/rpqUt+24kXzRbKt5vR74Uc3r+SSJbnWD955NsjLDwpptPwQ+W0RsDcr/LXAk67mQ47FbSbJO3uxpvLfTx+5h4MZOHbuaz3kCWNuO864frjzOBl6PiKdqtu0mWXOr3lvW4ZqkXmEkLSGJudlEyfMkHVSy9P0mSe24D/2t6Wc+NklzTyeP3R8Dd0d61jfRieMGDdZ2I1llutn59/OIqF1hoZ3H8QNMPUF3OudC3sYk/VLSt9Nf1I107NilzUEfIJmzNplCj13dd0fh510/JI8FvHXtLGi+flZ93cPAgiLb7mtJOhXYDtwVb06irPUD4Bzgt0l+XVwJXFdwWF8AzgCWkTRxPCDpzAb1OnLsJC0nuWy/a5JqnThuVVnOv8nq5krS75FM4J3suEz3XMjLQeA9JE1q55Mch+1N6nbs2JGsDP7DiHh6kjqFHrsG3x2Fn3f9kDwmW1trqroDwMQUv2hzIWkWSXPaK8CGRnUi4ucR8XREnIyInwBfJplcWZiI+HFEHImIExFxF/AY8LEGVTt17K4iaZJq+h+3E8etRpbzb7K6uZH0b4B/AP5LRPywWb0WzoVcRMRERFQi4rWIeIHk/8WHJdUfI+jQsUtdxeQ/Xgo9dk2+Owo/7/oheTwFzJZ0Vs22ZutnvWUdrknq5Sr9dX4nScfv2oh4dZpvDaAtV0XT+MyOHDum8R+3gXYet6ZruzWpe4ak2l98hR7HtMnln0juqXNPi29v9/lX/SHS7Pxr67EDkPQ+kttKfK/Ft+Zy7Cb57ij+vCu6A6cMD+CvSUYTzAfeR3JJNtSg3meBJ0kuLZemBy+XDrcp4vsm8M/AginqfRRYkj5fDfyUSTrpcohrEXApMJdkZMcIcBRYVYZjB7w3jWfhFPUKP27p8ZkL3EryK7B6zAbT821tuu0rTDKQID0P/jyt+0fAi8BgQbEtI2kHvy7PcyHH+C4AVpH8yH0nyWihh8tw7GrKt5D0t3Xq2DX87mjHeZfbf54yP0iGqt2f/oMdAD6Rbn8/SdNKtZ6ArwK/SR9fpckInhxjW0HyK+RlksvH6mMEWJ4+X57W/XPghfTv8XOS5pdTC4xtEPg/JJevL6Yn2L8v0bH7FnBPg+1tP24kd8iMusdNadklwB6S0S47gZU17/sm8M2a1yvTOseBvcAlRcUG3Jg+rz3vav9N/yvwD1OdCwXGdyXwdPrv9hxJh/S7ynDs0rK56bG4uMH7Cj92TPLd0Y7zzmtbmZlZy/qhz8PMzHLm5GFmZi1z8jAzs5Y5eZiZWcucPMzMrGVOHmZm1jInDzMza5mTh5mZtez/A8bIaIXveo0oAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "show_preds(preds)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This doesn't look very close--our random parameters suggest that the roller coaster will end up going backwards, since we have negative speeds!\n", + "\n", + "Step 3--Calculate the *loss*:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor(25823.8086, grad_fn=)" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "loss = mse(preds, speed)\n", + "loss" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our goal is now to improve this. To do that, we'll need to know the gradients.\n", + "\n", + "Step 4--Calculate the *gradients*. In other words, calculate an approximation of how the parameters need to change." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([-53195.8594, -3419.7146, -253.8908])" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "loss.backward()\n", + "params.grad" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([-0.5320, -0.0342, -0.0025])" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "params.grad * 1e-5" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can use these gradients to improve our parameters. We'll need to pick a learning rate (we'll discuss how to do that in practice in the next chapter; for now we'll just pick `1.0`):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor([-0.7658, -0.7506, 1.3525], requires_grad=True)" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "params" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Step 5--*Step* the weights. In other words, update the parameters based on the gradients we just calculated." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "lr = 1e-5\n", + "params.data -= lr * params.grad.data\n", + "params.grad = None" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's see if the loss has improved:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor(5435.5366, grad_fn=)" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "preds = f(time,params)\n", + "mse(preds, speed)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "...and take a look at the plot:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEACAYAAABLfPrqAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAcYElEQVR4nO3df4wc5Z3n8ffHGGFje9YmzJKzVx4vHNgrwzqcJ+IO8sMJuZCwx4nF9wdkEta3G5wscnQSJxJy2AEFkElY3R/Zu/wwgpgfXjYhMmhJduGyG0yCcxvdEGSSCTaSA0P4mXFijMc25tf3/qjq0HS6Z6amqrpruj8vqTTd9TxV/Z1navrbz/NUVSsiMDMzy2JWpwMwM7OZx8nDzMwyc/IwM7PMnDzMzCwzJw8zM8vMycPMzDJz8jAzs8wKTR6SNkgalnRU0taGsnMl7ZZ0WNKDkgbqyo6TdKuklyW9IOmKIuMyM7NiFd3zeA64Hri1fqWkE4HtwCbgBGAY+FZdlWuBU4EB4APAZyV9pODYzMysICrjCnNJ1wN/FBHr0ufrgXURcXb6fB6wDzgzInZLehb4rxHxf9Ly64BTI+LiwoMzM7PcZrfpdVYCu2pPIuKQpL3ASkkvAovry9PHFzbbUZqI1gPMmzdv9YoVK0oL2sysGz3yyCP7IqI/zz7alTzmA2MN6w4AC9Ky2vPGst8TEVuALQCDg4MxPDxcbKRmZl1O0mjefbTrbKtxoK9hXR9wMC2jobxWZmZmFdSu5DECrKo9Sec8TgFGImI/8Hx9efp4pE2xmZlZRkWfqjtb0hzgGOAYSXMkzQbuAU6XtDYt/wLwWETsTje9HdgoaZGkFcBlwNYiYzMzs+IU3fPYCBwBrgI+nj7eGBFjwFrgBmA/cBZQfybVNcBeYBR4CLgpIu4vODYzMytIKafqtosnzM3MspP0SEQM5tmHb09iZmaZOXmYmVlmTh5mZpaZk4eZmWXWrivMK+neR5/lpgf28NxLR1i8cC5XnrecC89c0umwzMwqr2eTx72PPsvnt/+MI6+9AcCzLx3h89t/BuAEYmY2iZ4dtrrpgT2/Sxw1R157g5se2NOhiMzMZo6eTR7PvXQk03ozM3tLzw5bLV44l2ebJIrFC+d2IBozs+aqOjfbsz2PK89bztxjj3nburnHHsOV5y3vUERmZm9Xm5t99qUjBG/Nzd776LOdDq13k8eFZy5h80VnsGThXAQsWTiXzRedUYmMbmYG1Z6b7dlhK0gSiJOFmVVVledmezp55FXVsUgz6w5Vnpvt2WGrvKo8Fmlm1XLvo89yzo0/4I+v+h7n3PiDKb9PVHlu1sljmqo8Fmlm1ZHng2aV52Y9bDVNVR6LNLPqmOiD5lSSQFXnZt3zmKZWY45VGIs0s+ro1g+abU0eknZIekXSeLrsqSv7mKRRSYck3SvphHbGllWVxyLNrDq69YNmJ3oeGyJifrosB5C0EvgG8AngJOAw8NUOxDZlVR6LNLPq6NYPmlWZ8xgC7ouIHwJI2gQ8LmlBRBzsbGitVXUs0syqo/Ye0W2n9XcieWyWdCOwB7g6InYAK4Ef1ypExF5JrwKnAY90IMbS+RoRs97RjR802508Pgf8AngVuBi4T9K7gPnAgYa6B4AFjTuQtB5YD7B06dJSgy2Lv0vEzGa6ts55RMRPIuJgRByNiNuAncD5wDjQ11C9D/i9IauI2BIRgxEx2N/fX37QJfA1ImYzy3Qv8utmnZ7zCEDACLCqtlLSycBxwBMdiqtU3Xrqnlk38khBc23reUhaKOk8SXMkzZY0BLwPeADYBlwg6b2S5gFfBLZXebI8j249dc+sG3mkoLl2DlsdC1wPjAH7gM8AF0bEnogYAT5NkkR+TTLXcXkbY2urbj11z6wbeaSgubYNW0XEGPDuCcr/Dvi7dsXTSd166p5ZN6rynW07qdNzHj2rG0/dM6uqPKfGX3ne8rfNeYBHCsDJw8y6XN4Jb48UNOfkYWZdLe9dbcEjBc04ecxQvkLdes10j3lPeJfDyWMG8nnn1mvyHPOe8C6Hv89jBvJ559Zr8hzzPjW+HO55zEBFdMM97GUzSZ5j3hPe5XDymIHydsOLGPZy8rF2ynvMe8K7eB62moHydsPzDnvVks+zLx0heCv5+GZxVhYPPVWPex4zUN5ueN5hryJOfbTek6e36qGn6nHymKHydMPzDgH41EfLqoihUg89VYuHrXpQ3iGAvHcF9ncj9B6fIdh93PPoQXmHAPLc68eT9TNXnnZ3b7X7OHn0qDxDAHmST975El8g2Rl5290X6nUfJw+blukmH0/Wz0x52913pu0+Th7WVp6s75xODjv5bKnu4+RhbZX3E2gRwx8zec5kurFXYdjJZ0t1l94+22rbNli2DGbNSn5u29be7XvQhWcuYfNFZ7Bk4VwELFk4l80XnZFpsj7PmWIz+QLHPLHnPdvJF+lZo8r0PCSdANwCfJjkO84/n341bTm2bYP16+Hw4eT56GjyHGBoqD3bX301PP00LF0KN9wwte26QKcm62vb5Z2wz9NrybN9ntg97GRFU0R0OgYAJN1F0hP6K+BdwPeAsyNipNU2g4ODMTw8PL0XXLYsecNvNDAATz1V7vaNiQfg+ONhy5apJ5AeTj55/PFV36PZES/gyRv/bMJtG4d+IPn0PdWeU97t88R+zo0/aDrstGThXHZe9cFJX9u6i6RHImIwzz4qMWwlaR6wFtgUEeMR8TDwD8AnSnvRp5/Otr7I7a+++u2JA5LnV189tdeuJZ/RUYh4q9fjYbNJ5bnAMe/QT97t88TuYScrWiWSB3Aa8EZEPFG3bhewsrGipPWShiUNj42NTf8Vly7Ntr7I7fMmrrzJB3p2vibPm2jeoZ+82+eJPe9ck1mjqiSP+cCBhnUHgAWNFSNiS0QMRsRgf3//9F/xhhuSoaJ6xx+frC97+7yJK2/yKaLnMkOTT5430by3Zcm7fd4EcOGZS9h51Qd58sY/Y+dVH3TisFwqMech6UxgZ0QcX7fuvwNrIuKCVtvlmvOA/PMG090+75xHJ+droJg5mxmo03MeZkUpYs6DiOj4AswDXgVOrVt3O3DjRNutXr06Zqw774wYGIiQkp933plt2+OPj0j6Dcly/PFT34f09m1rizS17QcGmm8/MDD1+Kf7u3fYPT99Js7e/C+x7HPfjbM3/0vc89Nn2rq9WRGA4cj5vl2JngeApL8HAvgkydlW/0iZZ1vNdHl6TXl7HrNmJemikQRvvjnxtj3aazGrkq452yp1OTAX+DVwF/DXEyWOnjc0lLzRv/lm8jPLG2/e+Z48czae7DfrCpVJHhHx24i4MCLmRcTSKPMCwV43NJR80h8YSHoLAwPZPvnnST5VmOw3s9wqkzyszfL0XPIkn7xnmrnnYlYJTh42PdNNPnmHzNxzMasEJw9rr7xDZlXouZiZk4d1QCcn+4vouXjIy8zJw2aYTvZcPORl9juVuc5jOnr6Og+bnjzXmeS9PsasIrrtOg+z8uXpueQd8gIPe1nXqMyXQZm1zdDQ9K5mX7q0ec9jqpP1eb9AzKxC3PMwm6q8k/U+08u6iJOH2VTlnaz3sJd1EQ9bmWUx3SEv8LCXdRX3PMzaxcNe1kWcPMzapdPDXh7ysgJ52MqsnTo17OUhLyuYex5mM0WeYS8PeVnBnDzMZopOX+BoVsfJw2wmme5NJfPejRg8Z2Jv05bkIWmHpFckjafLnobyj0kalXRI0r2STmhHXGY9I++ZXr4ppDVoZ89jQ0TMT5fltZWSVgLfAD4BnAQcBr7axrjMul/eM708Z2INqjBsNQTcFxE/jIhxYBNwkaQFHY7LrLvk+R4VXx1vDdqZPDZL2idpp6Q1detXArtqTyJiL/AqcFqznUhaL2lY0vDY2FipAZtZKu+ciYe9uk67ksfngJOBJcAW4D5Jp6Rl84EDDfUPAE17HhGxJSIGI2Kwv7+/rHjNrJ6vjrcGuZNHOhkeLZaHASLiJxFxMCKORsRtwE7g/HQX40Bfw277gIN5YzOzgnT66nirnNzJIyLWRIRaLO9ptRmg9PEIsKpWIOlk4DjgibyxmVmB8syZ+FThrlP6sJWkhZLOkzRH0mxJQ8D7gAfSKtuACyS9V9I84IvA9ohwz8OsW/hU4a7TjjmPY4HrgTFgH/AZ4MKI2AMQESPAp0mSyK9J5joub0NcZtYuPlW46ygiOh3DtA0ODsbw8HCnwzCzss2alfQ4GknJMJplIumRiBjMs48qXOdhZjaxIuZMrFBOHmZWfUXMmXiyvVBOHmZWfXnmTDzZXgrPeZhZd1u2rPmXaA0MJKcc9yDPeZiZTcYXKJbCycPMupsvUCyFk4eZdTdfoFgKJw8z626+QLEUnjA3M5tIF16g6AlzM7Oy+QLFppw8zMwmknfOBLpywt3Jw8xsInnnTLp0wt1zHmZmZargRYqe8zAzq7ouvUjRycPMrExdOuHu5GFmVqYuvSOwk4eZWZm69I7AhSQPSRskDUs6Kmlrk/JzJe2WdFjSg5IG6sqOk3SrpJclvSDpiiJiMjOrjKGhZHL8zTeTn11wdXtRPY/nSL6n/NbGAkknAtuBTcAJwDDwrboq1wKnAgPAB4DPSvpIQXGZmc1cFZ5sLyR5RMT2iLgX+E2T4ouAkYi4OyJeIUkWqyStSMsvBa6LiP0R8ThwM7CuiLjMzGa0Ck+2t2POYyWwq/YkIg4Be4GVkhYBi+vL08crW+1M0vp0iGx4bGyspJDNzCqgiKvbS9KO5DEfONCw7gCwIC2jobxW1lREbImIwYgY7O/vLzRQM7NKyXt1e4lmT1ZB0g7g/S2Kd0bEeybZxTjQ17CuDziYltWev9JQZmZmQ0OVSBaNJu15RMSaiFCLZbLEATACrKo9kTQPOIVkHmQ/8Hx9efp4JNuvYWZm7VTUqbqzJc0BjgGOkTRHUq1Xcw9wuqS1aZ0vAI9FxO60/HZgo6RF6ST6ZcDWIuIyM7NyFDXnsRE4AlwFfDx9vBEgIsaAtcANwH7gLODium2vIZlAHwUeAm6KiPsLisvMzErgu+qamfUY31XXzMw6wsnDzMwyc/IwM7PMnDzMzCwzJw8zM8vMycPMzDJz8jAzs8ycPMzMLDMnDzMzy8zJw8zMMnPyMDOzzJw8zMwsMycPMzPLzMnDzMwyc/IwM7PMnDzMzCwzJw8zM8usqO8w3yBpWNJRSVsbypZJCknjdcumuvLjJN0q6WVJL0i6ooiYzMysPLML2s9zwPXAecDcFnUWRsTrTdZfC5wKDADvBB6U9At/j7mZWXUV0vOIiO0RcS/wm2lsfilwXUTsj4jHgZuBdUXEZWZm5WjnnMeopGckfVPSiQCSFgGLgV119XYBK1vtRNL6dIhseGxsrNyIzcysqXYkj33Au0mGpVYDC4Btadn89OeBuvoH0jpNRcSWiBiMiMH+/v4SwjUzs8lMmjwk7UgnvJstD0+2fUSMR8RwRLweES8CG4APS+oDxtNqfXWb9AEHp/PLmJlZe0w6YR4Rawp+zUh/KiL2S3oeWAV8P12/Chgp+DXNzKxARZ2qO1vSHOAY4BhJcyTNTsvOkrRc0ixJ7wC+AuyIiNpQ1e3ARkmLJK0ALgO2FhGXmZmVo6g5j43AEeAq4OPp441p2cnA/SRDUT8HjgKX1G17DbAXGAUeAm7yabpmZtWmiJi8VkUNDg7G8PBwp8MwM5tRJD0SEYN59uHbk5iZWWZOHmZmlpmTh5mZZebkYWZmmTl5mJlZZk4eZmaWmZOHmZll5uRhZmaZOXmYmVlmTh5mZpaZk4eZmWXm5GFmZpk5eZiZWWZOHmZmlpmTh5mZZebkYWZmmTl5mJlZZrmTh6TjJN0iaVTSQUmPSvpoQ51zJe2WdFjSg5IGGra/VdLLkl6QdEXemMzMrFxF9DxmA78C3g/8AbAJ+LakZQCSTgS2p+tPAIaBb9Vtfy1wKjAAfAD4rKSPFBCXmZmVJHfyiIhDEXFtRDwVEW9GxHeBJ4HVaZWLgJGIuDsiXiFJFqskrUjLLwWui4j9EfE4cDOwLm9cZmZWnsLnPCSdBJwGjKSrVgK7auURcQjYC6yUtAhYXF+ePl45wf7XSxqWNDw2NlZ0+GZmNgWFJg9JxwLbgNsiYne6ej5woKHqAWBBWkZDea2sqYjYEhGDETHY399fTOBmZpbJpMlD0g5J0WJ5uK7eLOAO4FVgQ90uxoG+ht32AQfTMhrKa2VmZlZRkyaPiFgTEWqxvAdAkoBbgJOAtRHxWt0uRoBVtSeS5gGnkMyD7Aeery9PH49gZmaVVdSw1deAPwEuiIgjDWX3AKdLWitpDvAF4LG6Ya3bgY2SFqWT6JcBWwuKy8zMSlDEdR4DwKeAdwEvSBpPlyGAiBgD1gI3APuBs4CL63ZxDckE+ijwEHBTRNyfNy4zMyvP7Lw7iIhRQJPU+WdgRYuyo8BfpouZmc0Avj2JmZll5uRhZmaZOXmYmVlmTh5mZpaZk4eZmWXm5GFmZpk5eZiZWWZOHmZmlpmTh5mZZebkYWZmmTl5mJlZZk4eZmaWmZOHmZll5uRhZmaZOXmYmVlmTh5mZpaZk4eZmWVWxNfQHifpFkmjkg5KelTSR+vKl0mKuq+nHZe0qWH7WyW9LOkFSVfkjcnMzMqV+2to0338Cng/8DRwPvBtSWdExFN19RZGxOtNtr8WOBUYAN4JPCjpF/4eczOz6srd84iIQxFxbUQ8FRFvRsR3gSeB1VPcxaXAdRGxPyIeB24G1uWNy8zMylP4nIekk4DTgJGGolFJz0j6pqQT07qLgMXArrp6u4CVRcdlZmbFKTR5SDoW2AbcFhG709X7gHeTDEutBhakdQDmpz8P1O3mQFqn1WuslzQsaXhsbKzI8M3MbIomTR6SdqQT3s2Wh+vqzQLuAF4FNtTWR8R4RAxHxOsR8WJa9mFJfcB4Wq2v7iX7gIOt4omILRExGBGD/f39mX5ZMzMrxqQT5hGxZrI6kgTcApwEnB8Rr020y9pmEbFf0vPAKuD76fpV/P6Ql5mZVUhRw1ZfA/4EuCAijtQXSDpL0nJJsyS9A/gKsCMiakNVtwMbJS2StAK4DNhaUFxmZlaCIq7zGAA+BbwLeKHuWo6htMrJwP0kQ1E/B44Cl9Tt4hpgLzAKPATc5NN0zcyqLfd1HhExCmiC8ruAuyYoPwr8ZbqYmdkM4NuTmJlZZk4eZmaWmZOHmZll5uRhZmaZOXmYmVlmTh5mZpaZk4eZmWXm5GFmZpk5eZiZWWZOHmZmlpmTh5mZZebkYWZmmTl5mJlZZk4eZmaWmZOHmZll5uRhZmaZOXmYmVlmTh5mZpZZIclD0p2Snpf0sqQnJH2yofxcSbslHZb0YPq957Wy4yTdmm77gqQriojJzMzKU1TPYzOwLCL6gP8MXC9pNYCkE4HtwCbgBGAY+FbdttcCpwIDwAeAz0r6SEFxmZlZCQpJHhExEhFHa0/T5ZT0+UXASETcHRGvkCSLVZJWpOWXAtdFxP6IeBy4GVhXRFxmZlaO2UXtSNJXSd705wKPAv+YFq0EdtXqRcQhSXuBlZJeBBbXl6ePL5zgddYD69On45L2FBD+icC+AvZThirHBtWOz7FNT5Vjg2rHN1NiG5io4lQUljwi4nJJnwH+A7AGqPVE5gNjDdUPAAvSstrzxrJWr7MF2FJAyL8jaTgiBovcZ1GqHBtUOz7HNj1Vjg2qHV8vxTbpsJWkHZKixfJwfd2IeCMiHgb+CPjrdPU40New2z7gYFpGQ3mtzMzMKmrS5BERayJCLZb3tNhsNm/NeYwAq2oFkualZSMRsR94vr48fTwynV/GzMzaI/eEuaQ/lHSxpPmSjpF0HnAJ8IO0yj3A6ZLWSpoDfAF4LCJ2p+W3AxslLUon0S8DtuaNK6NCh8EKVuXYoNrxObbpqXJsUO34eiY2RUS+HUj9wHdIegyzgFHgKxFxc12dDwH/i2SS5ifAuoh4Ki07Dvga8F+AI8CXIuJ/5grKzMxKlTt5mJlZ7/HtSczMLDMnDzMzy6wnkoekEyTdI+mQpFFJH2tRT5K+JOk36fJlSSo5tuMk3ZLGdVDSo5I+2qLuOklvSBqvW9aUHN8OSa/UvV7TizLb3XYNbTCetsvftqhbertJ2iBpWNJRSVsbylre263JfpaldQ6n23yorNgk/XtJ35f0W0ljku6W9G8m2M+UjoUC41uWXhJQ/3fbNMF+2tl2Qw1xHU5jXd1iP4W33WTvHWUfdz2RPID/DbwKnAQMAV+TtLJJvfUkV7evAv4U+E/Ap0qObTbwK+D9wB+Q3APs25KWtaj/fyNift2yo+T4ADbUvd7yFnXa2nb1bUDydz0C3D3BJmW323PA9cCt9Ss1+b3dGt1FcoeGdwBXA99RclJK4bEBi0jOwFlGcjLLQeCbk+xrKsdCUfHVLKx7zesm2E/b2i4itjUcg5cDvwR+OsG+im67lu8dbTnuIqKrF2AeSeI4rW7dHcCNTer+GFhf9/yvgH/tQMyPAWubrF8HPNzmWHYAn5xCvY61HfAXJP+4alHetnYjeaPZWvd8PfDjuufzSBLdiibbnkZyZ4YFdet+BHy6jNialP874GDeY6HAtltGcp+82VPYttNt9yBwTafaru51HgPWtuO464Wex2nAGxHxRN26XST33Gr0tvtwTVCvNJJOIom51YWSZ0rap+TW95skFXaLmQlsTl9z5wTDPZ1su78Abo/0qG+hE+0GTe7tBuyl9fH3y4iov8NCO9vxfUx+ge5UjoWijUp6RtI300/UzXSs7dLhoPeRXLM2kVLbruG9o/TjrheSx3zefu8saH3/rMa6B4D5ZY7d15N0LLANuC3euoiy3g+B04E/JPl0cQlwZclhfQ44GVhCMsRxn6RTmtTrSNtJWkrSbb9tgmqdaLeaPMffRHULJelPSS7gnahdpnosFGUf8G6SIbXVJO2wrUXdjrUdyZ3BfxQRT05Qp9S2a/LeUfpx1wvJY6J7a01Wtw8Yn+QTbSEkzSIZTnsV2NCsTkT8MiKejIg3I+JnwBdJLq4sTUT8JCIORsTRiLgN2Amc36Rqp9ruUpIhqZb/uJ1otzp5jr+J6hZG0r8F/gn4bxHxo1b1MhwLhYiI8YgYjojXI+JFkv+LD0tqbCPoUNulLmXiDy+ltl2L947Sj7teSB5PALMlnVq3rtX9s952H64J6hUq/XR+C8nE79qIeG2KmwbQll7RFF6zI23HFP5xm2hnu7W8t1uLuidLqv/EV2o7pkMu/0zynTp3ZNy83cdf7YNIq+OvrW0HIOkckq+V+E7GTQtpuwneO8o/7sqewKnCAvw9ydkE84BzSLpkK5vU+zTwOEnXcnHaeIVMuE0S39eBfwXmT1Lvo8BJ6eMVwM+ZYJKugLgWAucBc0jO7BgCDgHLq9B2wNlpPAsmqVd6u6XtM4fkWzXvqGuz/vR4W5uu+xITnEiQHgd/k9b9c+AloL+k2JaQjINfWeSxUGB8ZwHLST7kvoPkbKEHq9B2deVbSObbOtV2Td872nHcFfbPU+WF5FS1e9M/2NPAx9L17yUZWqnVE/Bl4Lfp8mVanMFTYGwDJJ9CXiHpPtaWIWBp+nhpWvdvgBfT3+OXJMMvx5YYWz/w/0i6ry+lB9h/rFDbfQO4o8n6trcbyTdkRsNybVr2IWA3ydkuO0i+srm23deBr9c9X5bWOQLsAT5UVmzANenj+uOu/m/6P4B/muxYKDG+S4An07/b8yQT0u+sQtulZXPStji3yXaltx0TvHe047jzva3MzCyzXpjzMDOzgjl5mJlZZk4eZmaWmZOHmZll5uRhZmaZOXmYmVlmTh5mZpaZk4eZmWX2/wExahUKOb/yYgAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "show_preds(preds)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We need to repeat this a few times, so we'll create a function to apply one step:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def apply_step(params, prn=True):\n", + " preds = f(time, params)\n", + " loss = mse(preds, speed)\n", + " loss.backward()\n", + " params.data -= lr * params.grad.data\n", + " params.grad = None\n", + " if prn: print(loss.item())\n", + " return preds" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "...now we're ready for step 6!\n", + "\n", + "Step 6--*Repeat* the process. By looping and performing many improvements, we hope to reach a good result." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "5435.53662109375\n", + "1577.4495849609375\n", + "847.3780517578125\n", + "709.22265625\n", + "683.0757446289062\n", + "678.12451171875\n", + "677.1839599609375\n", + "677.0025024414062\n", + "676.96435546875\n", + "676.9537353515625\n" + ] + } + ], + "source": [ + "for i in range(10): apply_step(params)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#hide\n", + "params = orig_params.detach().requires_grad_()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Looking only at these loss numbers disguises the fact that each iteration represents an entirely different quadratic function being tried, on the way to find the best possible quadratic function. We can see this process visually if, instead of printing out the loss function, we plot the function at every step. Then we can see how the shape is approaching the best possible quadratic function for our data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1QAAADMCAYAAAB0vOLuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3df4zc9Z3f8dfbLIe9BsemGFJMdn1CHD4Z5CA2ihroBUJVuOS4InxSaCbiSARbtUcvVw6DqY0hgIuJo2suIuW0BMQB2wuXE6Y6LgdVjx8VrlplCYHUxERK8RpsI0xrG/Can/vuH98ZdnY83+/MfOczM98fz4c0Ws/3M7P+7tf78nfe38/n+/mYuwsAAAAA0LkFg94BAAAAAMgrCioAAAAASImCCgAAAABSoqACAAAAgJQoqAAAAAAgJQoqAAAAAEiJggoAAAAAUgpaUJnZtWY2ZWbvm9kDDW0XmdlOM5sxs6fNbLSu7Tgzu9/M3jazN8zsupD7BeQReQLCIlNAWGQKiITuodor6Q5J99dvNLOTJD0q6WZJJ0qakvRI3UtulXSGpFFJF0q6wcwuCbxvQN6QJyAsMgWERaYASebu4b+p2R2STnP3q6rPxyVd5e5fqD5fLOktSee4+04z2yPpG+7+X6vtt0s6w92vCL5zQM6QJyAsMgWERaZQdv26h2q1pBdrT9z9sKRfS1ptZssknVrfXv3z6j7tG5A35AkIi0wBYZEplMpQn/6e4yXtb9h2SNIJ1bba88a2o1SveoxL0uLFi89dtWpV2D0Fmnj++effcvflg96PqmB5ksgUBqOomSJPGBQyBYTVSab6VVC9K2lJw7Ylkt6pttWev9fQdhR3n5A0IUljY2M+NTUVfGeBRmY2Peh9qBMsTxKZwmAUNVPkCYNCpoCwOslUv4b87ZC0pvakOpb2dEk73P2ApH317dU/7+jTvgF5Q56AsMgUEBaZQqmEnjZ9yMwWSjpG0jFmttDMhiRtk3SWma2ttm+S9JK776y+9UFJG81smZmtknSNpAdC7huQN+QJCItMAWGRKSASuodqo6QjktZL+nr1zxvdfb+ktZI2Szog6fOS6mdyuUXRzYrTkp6VtNXdnwi8b0DekCcgLDIFhEWmAPVo2vR+YSwt+sXMnnf3sUHvR6+RKfRLGTJFntBPZAoIq5NM9eseKgAAAAAonH7N8jcQj72wR1uffEV7Dx7RqUsXad3FZ+qyc1YMereA3CJTQDjkCQiLTGFQCltQPfbCHt306C905MOPJUl7Dh7RTY/+QpIIF5ACmQLCIU9AWGQKg1TYIX9bn3zlk1DVHPnwY2198pUB7RGQb2QKCIc8AWGRKQxSYXuo9h480tF2lAvDAjpHppCETHWGPKEVMtUZMoUkvc5TYXuoTl26qKPtKI/asIA9B4/INTcs4LEX9gx61zKNTCEOmeoceUISMtU5MoU4/chTYQuqdRefqUXHHjNv26Jjj9G6i88c0B4hKxgWkA6ZQhwy1TnyhCRkqnNkCnH6kafCDvmrdePFde/RlV5eDAtIh0whDpnqXKs8SWSqzMhU5zhHIU4/8lTYgkqKwtUsLMwEUw5x/3meunSR9jQJEcMCWiNT5UamworLk0SmyoJMhcU5qtwGmafCDvlLQld68SWNl2VYQHhkqvjIVH+RqeIjU/1Dnopv0HkqZUFFV3rxJf3nedk5K3Tn5WdrxdJFMkkrli7SnZefzVWqLpCp4iNT/UWmio9M9Q95Kr5B56nQQ/7i0JVefK3+80waaoPOkaniI1P9RaaKj0z1D3kqvkHnqZQ9VHSlFx/Tp/YXmSo+MtVfZKr4yFT/kKfiG3SeSllQter6e+yFPTpvy1P6zfV/p/O2PMW6DznEf579RaaKj0z1F5kqPjLVP+Sp+Aadp1IO+ZOYCabo2pmSGGGRqWIjU/1HpoqNTPUXeSq2QeeptAVVnFY3tSFbktaVYPx5NpCpfCFT2Uem8qPV2kdkavDIU75k9RxFQdWAmWDyg6tK+UCm8oNM5QOZygfylA/kKT+ynKlS3kOVZNA3taF9rCuRD2QqP8hUPpCpfCBP+UCe8iPLmaKgajDom9rQPq4q5QOZyg8ylQ9kKh/IUz6Qp/zIcqYY8tdg0De14Whx42VZVyIfyFT2kKl8I1PZQp7yjTxlTx4zRUHVBDeJZkfSeNl1F585r03iqlJWkansIFPFQKaygTwVA3nKjrxmioKqQ61m7EE6ccc1abzs9vVfksRVpbwjU+ElHVMyVXxkKjzOUeVGpsIrWqYoqDqQ5dlF8izpuLYaL8tVpXwjU+G1OqZkqtjIVHico8qNTIU3kExNTkobNki7d0sjI9LmzVKlku4HaIJJKTrQanYRVtpOJ+m4MvtOsZGp8FodUzJVbGQqPM5R5UamwutZpiYnpZUrpQULoq+Tk3Pbx8el6WnJPfo6Pj7XHgAFVQeSquZatb3n4BG55qrt+mARuuaSjiuz7xRbN5kiT821urpHpoqNTIXHOarcyFR4PclUUtG0YYM0MzP/9TMz0fZAKKg6kFQ1t3MFo1XBVVZJx/Wyc1bozsvP1oqli2SSVixdpDsvP5tu9oJImynyFK/V1T0yVWxkKjzOUeVGpsLrKlNxvVBJRdPu3c13JG57ChRUHUiqmltdFc7yYmT9EnelptXViMvOWaHt67+kV7d8RdvXf4kTVYGkzRR5Sp8niUwVGZlKj3MUmiFT6aXO1MvPaPtffFOvfudSbf+Lb+qyl5+JXpTUC5VUNI2MNG+L254Ck1J0IGmtgq1PvpI4N36rgqvoM8i0c1NnkX9+NJc2U+SJPKE5MpUOmUIcMpVOq0yt+Mk2febu23Xywf16c+lyvXb9zfrcOZfMFU213qZa0SQl90KNjESvbVSbgKL+e0rS8HC0PRBz92DfrN/GxsZ8ampq0Lsh6ehfHCmqtmvdlOdteapp6FZUQ5T03ryELmk/k37+2jSYWWZmz7v72KD3o9fykqm4k1g7eap97zxnKu95ksqRqSzlSepdpvKSJ4lM5R2Zyp6fbr5bn/luQ2G04Vqdt+Upnbv9J7rhvz+oU99+S3uXnKTv/M6Vev68L2v7Z/Y1L3AmJqLiqFlhNDoa9TY1q1vMpIceiv+elUqqWf46yRQ9VIG0uoKVtBhZq27hPEzX2e1UzUCjVplKk6faiSrvmSJPSKMXmWp8X1bzJJEphFeaTMUUIz/dfLfO+vb1WvTh+5KkTx98U5/69vX6qaSx/7lLdz5xt4Y/itpOe3u/tjxxt26SpF/8Tbr7nZJ6oWrFUVzRVKkEnSa9ET1UfRR3xeE31/+dmv0rmKJu46SrZv2+ipH26l7er/6V4cqflK9MpcnTq1u+0vJ3MQ+ZkpTrPEnlyFSe8iQV4xyV9HOQqfwjU11kKqmHpkXbR1dfo6H35vbpo4WLNPTDe/XGtdfp0wffPOqvemPpyZIU2/bpQ/vje5riiqbR0fihe7VeqMBy2UNlZidKuk/SP5f0lqSb3P0/d/VNe7yIV6fiFiOLC0+rMbi9utIeF9Zuru4l9dChN4qeqTR5ktqbAjfrmfqPX/0seRqA4JnKUJ6k/JyjJDJVBEU/R0npM/X7O54+aqjc366+UI+9sEfP3fbneuSpBz5p+97/ukra9K3o74n7+ZPuS5LmF0zT09FzSapUNLPuRg2/N39fh947opl1N+rkg/ub/twnH9wvs+bH5JRD+9Pf79SqF2qAsjTL3w8kfSDpFEkVSfeY2erU363VIl5x0y4OQNJsJ91O1R63/kFcW9I0n90sxMbUsgORnUz1MW+tZg8qQqbI08CEy1Q7C02mzVTgvPXqHCWRqZLLzzkqcBbXXXym/uCVZ/XcPd/Q/7nrUj13zzf0B688q3UXn6k/fHW7tjxxt057e78WyD8ZKveHr27Xz7f8QLc9/v15bbc9/n39fMsPPulJqv/5P7r6mpbrMM2su3Fe75M0VzBJ0sJ9e5se7oX79urNpcubtr25dLksZgY9qxVBw8PzG+qLpomJqEfKLPpa3wNVqUi7dkmzs9HXDBRTUkaG/JnZYkkHJJ3l7r+qbntI0h53Xx/3vsSu35Ur03cZDuAKR7tX26S5Gxf/3SM/j+0yjrvaduflZ0tqPq631U2Ue6snr07/viKckPI2lCJTmZL6nrekIRFkKhvKkKnUedq16+grylJ7mUpqa5W3hLa4G8/T5unVLV9JfK9EpjqVp0zl7hwVOotS7DC6mXU3anjf0WtVzfzjFfp/hz/QaW8f3Sv0+pLlOnHxb8S+b9Ebe2VNPu+7mdylBU1SMyvTAp/V6586Ofbv3HfDpnn3UEnSkWOP0/++5bv63Mplmfus3alOMpWVguocSf/D3RfVbbte0hfd/dK49yUGa8GCXIzPbEfo8eBJbUknoyyOle+XPJ2opIxlSsrcxQ0yNXhlyFTqPM3OJn84lMLnTUr9gTPVLF/Ve2vj2iXFtpGp5vKUqdyco3bt6k0Wk9oSZrKbTSh+pPi2N5cuj72f6aNZjy2YTjv0pm796r/XDY/+2SeTS0jSzNBx+s7l1+nWR/5DbP4l5aJoSpLHguqfSvqxu3+6bts1kirufkHDa8cljUvSyMjIudPNfiGl5AAkTbuYFLraVcOM/HKkvTIoKdXJqJ3pqIsqTycqKWOZknpzcaNHPVtkqj+Kmqkgedq1K/nDoRQ+b1LwIu2nuw7EX73ecK2+den182YBk6IPajddEn0Yi2ubOu/LscXWuovP1HO3/bn+pP4eky9dpfNb3WNSAHnKVG7OUbOzvcliUltCTmc++ChV79XW37kyNk/Lhn8jsWCq3bcVm6kC6yRTWbmH6l1JSxq2LZH0TuML3X3C3cfcfWz58uZjNyUlj89MWjE5abrGdsa891HSmO+kseJJbUlj5RljnivZyVTavCUt4NejLJIpJGgrU0HyJKXPVNq8pW1LyOnn7v3uvGJKkhZ9+L4+d+93JUk3PffQvA9xkjT80fu66bmHEtu+9/HLuuvJ+feY3PXk3frexy/rspefaXr/yWUvP9O7e2wydE92juTjHFX/tdP3pm1L+DmGt96ljxbOP998tHCRhrfepR9ecrVmho6b1zYzdJx+eMnVmjrvy1p/ybV6fclyzcr0+pLlWl+9OPHZ9X+kTb/3x/PaNv3eH+uz6/9IUnRePH/Tt/TVm36k02/8W331ph+VopjqmLsP/CFpsaIbE8+o2/agpC1J7zv33HM90cMPu4+OuptFXx9+eG778LB79F9q9Bgennt9/fbaY3Q0uS1jtv3sdV+18e999MbHP3ms2vj3vu1nrye21d77hTv/wVfe+Lh/4c5/+GR7mUma8gxkpd1HpjKVNm9mzdtqf3dSFuP2swtkKqwyZCp1nmptaTLVi/Nb2pwmtbn7bEz7rFliW09+jl4c71b/xu38DnTw/1ieMpWbc1Q37+3V701M27afve5/etk6f23Jcv9Y5q8tWe5/etk6zlFd6CRTAw/VJzsi/UjSX1VDdp6kQ5JWJ72nZbCSpAldi5NDLz7EdSMpIISnM3k6UdUemclUUlsviq1WJ6pW+5qATIVThkx1lSf39B+2Q3+o7NVFyH4XcWkv0vSiSOvm3yNG3jKVi3NUt+8NWDC3g3NUWHktqE6U9Jikw5J2S/paq/d0fbKKE/dL3s1/nMi1vJ2oPGuZStLvnmSymgllyNRA8tRKVoq0bt6bpZ62bnrSA4+IyVumcnOOQmnlsqBK8+h7sNL+B4/cy9uJKu0jcyerXvQkD2C4II5WhkxlLk+90u8r/1nqaUtbpLl3NVyyGTIFhEVB1Utx/8G3858fH9RyqwwnKs/bySpNT7J798MFEUQZMpWrPOVNVnraurnQWvIeqjQPMoV+oqAaBIYZFVoZTlSetUyl1Spr3Q4X5KJIEGXIVCHyVCb9LNJatZfgHqo0DzKFfqKgGoRuPsQh88pwovKsZaobrT78pBkuyEWRoMqQqcLkCen1arKDJsgUEFYnmcrEwr5pJa6YPQhJCwYmLQw3O9vf/UTH8rRgYjcyl6leictq0sKQUm4W/c6DMmSqNHlCJpApIKw8LuxbDJVK9MFqdjb6Wv9hqtXCcSwMCPRPXFaTFobM0aLfAACgfyio+iXpgxofxoBsqFSkiYmo18ks+joxEW1PuiiyYYM0MzN/+8xMtF3iggkAAAVGQdUvSR/UWn0YA9A/9F4BAIAOUFD1U9wHtaQPYwCyoVe9VwAAINcoqLKA+6uAfAjdeyWRbwAAco6CKgu4vwrIt7S9V+QbAIDco6DKAu6vAvIvTe8V+QYAIPcoqLKC+6uAYkq6YMJwQAAAcm9o0DuAFkZGmi8mGjeMCED2VCrNF/lNyndtOGCtB6s2HLD2/QAAQCbQQ5V1ScOFJK5gA3nWzXBAsg8AQCZQUGVd0nAhbmgH8i3tcECyDwBAZlBQ5UHc/VXc0A7kX1y+WdsKAIBcoKDKMyasAIqrm7WtAABA31BQ5VmrBYEB5Ffata0k7q8CAKCPKKjyrNWEFQDyLc3aVtxfBQBAX1FQ5VnSFWyJq9RAUbEYOAAAmUFBlXdxV7C5Sg0UWzeLgXOxBQCAYCioioqr1EA5tXN/FRdbAAAIhoKqqJgFDCinVvdWcrEFAICgKKiKihkAgXJqdW9lq4stDAcEAKAjFFRFxQyAQHnF3V8lJV9sYTggAAAdo6AqKmYABNBM0sUWhgMCANAxCqoiYwZAAI2SLrZw7yUAAB2joCojrkID5RZ3saWdGQLp2QYAYB4KqjLiKjSAZpKGA9KzDQBAUxRUZcQMgACaSRoOSM82AABNUVCVETMAAogTNxyQnm0AAJqioCqjVjMAAkAj7q8CAKCpIAWVmV1rZlNm9r6ZPdCk/SIz22lmM2b2tJmN1rUdZ2b3m9nbZvaGmV0XYp/QQtI6NXwwGjgyhczJ+f1VZAoIi0wBc0L1UO2VdIek+xsbzOwkSY9KulnSiZKmJD1S95JbJZ0haVTShZJuMLNLAu0XOpWDD0YlQaaQLfm/v4pMAWGRKaAqSEHl7o+6+2OS/m+T5ssl7XD3H7v7e4pCtMbMVlXbr5R0u7sfcPdfSrpX0lUh9gsp5OODUeGRKWRSju+vIlNAWGQKmNOPe6hWS3qx9sTdD0v6taTVZrZM0qn17dU/r477ZmY2Xu1intq/f3+PdrnEcvDBCGQKGZP/mUODZYo8AZLIFEqmHwXV8ZIONWw7JOmEapsa2mttTbn7hLuPufvY8uXLg+4oVIQPRmVAppAt+Z85NFimyBMgiUyhZFoWVGb2jJl5zOO5Nv6OdyUtadi2RNI71TY1tNfaMAj5/2CUeWQKhTPgmUPJFBAWmQI607KgcvcL3N1iHue38XfskLSm9sTMFks6XdHY2gOS9tW3V/+8o7MfA8EwpXrPkSkUUtLMoT1GpoCwyBTQmVDTpg+Z2UJJx0g6xswWmtlQtXmbpLPMbG31NZskveTuO6vtD0raaGbLqjcrXiPpgRD7hZQG+MEIETIFhEWmgLDIFDAn1D1UGyUdkbRe0terf94oSe6+X9JaSZslHZD0eUlX1L33FkU3Kk5LelbSVnd/ItB+ITTWqOoXMgWERaaAsMgUUGXuPuh9SG1sbMynpqYGvRvlUVujqn5a9eHhUgwJNLPn3X1s0PvRa2QK/VKGTJEn9BOZAsLqJFP9mOUPRcEaVQAAAMA8FFRoH2tUAQAAAPNQUKF9rFEFAAAAzENBhfaxRhUAAAAwDwUV2scaVQAAAMA8Q61fAtSpVCigAAAAgCp6qBAOa1QBAACgZOihQhiNa1RNT0fPJXq0AAAAUFj0UCEM1qgCAABACVFQIQzWqAIAAEAJUVAhDNaoAgAAQAlRUCEM1qgCAABACVFQIQzWqAIAAEAJMcsfwmGNKgAAAJQMPVQAAAAAkBIFFfqHhX8BAABQMAz5Q3+w8C8AAAAKiB4q9AcL/wIAAKCAKKjQHyz8CwAAgAKioEJ/sPAvAAAACoiCCv3Bwr8AAAAoIAoq9AcL/wIAAKCAmOUP/cPCvwAAACgYeqgAAAAAICUKKmQDi/4CAAAghxjyh8Fj0V8AAADkFD1UGDwW/QUAAEBOUVBh8Fj0FwAAADlFQYXBY9FfAAAA5BQFFQaPRX8BAACQUxRUGDwW/QUAAEBOdV1QmdlxZnafmU2b2Ttm9oKZ/W7Day4ys51mNmNmT5vZaMP77zezt83sDTO7rtt9Qg5VKtKuXdLsbPS1xMUUmQLCIlNAWGQKmC9ED9WQpNckfVHSpyTdLOmvzWylJJnZSZIerW4/UdKUpEfq3n+rpDMkjUq6UNINZnZJgP0C8opMAWGRKSAsMgXU6bqgcvfD7n6ru+9y91l3f1zSq5LOrb7kckk73P3H7v6eohCtMbNV1fYrJd3u7gfc/ZeS7pV0Vbf7BeQVmQLCIlNAWGQKmC/4PVRmdoqk35K0o7pptaQXa+3ufljSryWtNrNlkk6tb6/+eXXo/UKOTU5KK1dKCxZEXycnB71HfUWmgLDIFBAWmULZBS2ozOxYSZOS/tLdd1Y3Hy/pUMNLD0k6odqmhvZaW9zfMW5mU2Y2tX///jA7juyanJTGx6Xpack9+jo+XpqiikwBYfU6U+QJZUOmgDYKKjN7xsw85vFc3esWSHpI0geSrq37Fu9KWtLwbZdIeqfapob2WltT7j7h7mPuPrZ8+fJWu4+827BBmpmZv21mJtqeU2QKCCtLmSJPKAIyBXSmZUHl7he4u8U8zpckMzNJ90k6RdJad/+w7lvskLSm9sTMFks6XdHY2gOS9tW3V/+8Q4Ak7d7d2fYcIFNAWGQKCItMAZ0JNeTvHkm/LelSdz/S0LZN0llmttbMFkraJOmlum7hByVtNLNl1ZsVr5H0QKD9Qt6NjHS2vTjIFBAWmQLCIlNAVYh1qEYl/StJn5X0hpm9W31UJMnd90taK2mzpAOSPi/pirpvcYuiGxWnJT0raau7P9HtfqEgNm+WhofnbxsejrYXFJkCwiJTQFhkCphvqNtv4O7TkqzFa/6bpFUxbe9L+mb1AcxXW+B3w4ZomN/ISFRMFXjhXzIFhEWmgLDIFDBf1wUV0HOVSqELKAAAAORX8HWoAAAAAKAsKKgAAAAAICUKKgAAAABIiYIK+TY5Ka1cKS1YEH2dnBz0HgEAAKBEmJQC+TU5KY2PSzMz0fPp6ei5xCQWAAAA6At6qJBfGzbMFVM1MzPRdgAAAKAPKKiQX7t3d7YdAAAACIyCCvk1MtLZdgAAACAwCirk1+bN0vDw/G3Dw9F2AAAAoA8oqJBflYo0MSGNjkpm0deJCSakAAAAQN8wyx/yrVKhgAIAAMDA0EMFAAAAAClRUAEAAABAShRUAAAAAJASBRWKa3JSWrlSWrAg+jo5Oeg9AgAAQMEwKQWKaXJSGh+XZmai59PT0XOJSSwAAAAQDD1UKKYNG+aKqZqZmWg7AAAAEAgFFYpp9+7OtgMAAAApUFChmEZGOtsOAAAApEBBhWLavFkaHp6/bXg42g4AAAAEQkGFYqpUpIkJaXRUMou+TkwwIQUAAACCYpY/FFelQgEFAACAnqKHCgAAAABSoqACAAAAgJQoqAAAAAAgJQoqAAAAAEiJggrlNDkprVwpLVgQfZ2cHPQeAQAAIIeY5Q/lMzkpjY9LMzPR8+np6LnErIAAAADoCD1UKJ8NG+aKqZqZmWg7AAAA0AEKKpTP7t2dbQcAAABiBCmozOxhM9tnZm+b2a/M7OqG9ovMbKeZzZjZ02Y2Wtd2nJndX33vG2Z2XYh9AmKNjHS2fQDIFBAWmQLCIlPAnFA9VHdKWunuSyT9vqQ7zOxcSTKzkyQ9KulmSSdKmpL0SN17b5V0hqRRSRdKusHMLgm0X8DRNm+WhofnbxsejrZnB5kCwiJTQFhkCqgKUlC5+w53f7/2tPo4vfr8ckk73P3H7v6eohCtMbNV1fYrJd3u7gfc/ZeS7pV0VYj9ApqqVKSJCWl0VDKLvk5MZGpCCjIFhEWmgLDIFDAn2D1UZvafzGxG0k5J+yT9pNq0WtKLtde5+2FJv5a02syWSTq1vr3659Wh9gtoqlKRdu2SZmejrxkqpmrIFBAWmQLCIlNAJNi06e7+b8zs30r6J5IukFS7anG8pP0NLz8k6YRqW+15Y1tTZjYuqTrHtd41s1fa2L2TJL3VxuvKiGMTr/7YjCa9sBfIVG5xbOIVPlPkqSc4PvHIVHP8zsTj2CRLlamWBZWZPSPpizHN2939/NoTd/9Y0nNm9nVJ/1rS9yW9K2lJw/uWSHqn2lZ7/l5DW1PuPiFpotV+N/wMU+4+1sl7yoJjE69Xx4ZMFRvHJl4ZMkWewuP4xCNTsfvP70wMjk2ytMen5ZA/d7/A3S3mcX7M24Y0N452h6Q1dTu6uNq2w90PKOoiXlP33jXV9wCFRKaAsMgUEBaZAjrT9T1UZnaymV1hZseb2TFmdrGkfynpqepLtkk6y8zWmtlCSZskveTuO6vtD0raaGbLqjcrXiPpgW73C8grMgWERaaAsMgUMF+ISSlcURfv65IOSPqupD9x9/8iSe6+X9JaSZur7Z+XdEXd+29RdKPitKRnJW119ycC7Fe9jrqKS4ZjE29Qx4ZM5RvHJh6Zao7fmWQcn3hkqjl+Z+JxbJKlOj7m7qF3BAAAAABKIdi06QAAAABQNhRUAAAAAJBSoQsqMzvRzLaZ2WEzmzazrw16nwbFzK41sykze9/MHmhou8jMdprZjJk9bWZ9X8tikMzsODO7r/o78o6ZvWBmv1vXXurjU49MzSFTzZGn9pGnOeQpHplqH5maQ6bi9SJThS6oJP1A0geSTpFUkXSPmZV1Je69ku6QdH/9RjM7SdKjkm6WdKKkKUmP9H3vBmtI0muK1tz4lKJj8ddmtpLjcxQyNYdMNUee2kee5pCneGSqfWRqDpmKFzxThZ2UwqI1Dw5IOsvdf1Xd9pCkPe6+fqA7N0Bmdoek09z9qurzcUlXufsXqs8XK1oh+py66U1Lx8xekvRtSf9IHB9JZCoOmWqNPB2NPDVHntpDpo5GppojU+3pNlNF7qH6LUkf10JV9aKksl6piLNa0XGRJLn7YUVTmZb2OJnZKZhPqP0AAAH6SURBVIp+f3aI41OPTLWH35k65CkWeWoPvzMNyFQsMtUefmcahMhUkQuq4yUdath2SNIJA9iXLOM41TGzYyVNSvrL6pUIjs8cjkV7OE5V5CkRx6I9HKc6ZCoRx6I9HKc6oTJV5ILqXUlLGrYtkfTOAPYlyzhOVWa2QNJDisZfX1vdzPGZw7FoD8dJ5KkNHIv2cJyqyFRLHIv2cJyqQmaqyAXVryQNmdkZddvWKOrOw5wdio6LpE/Gip6ukh0nMzNJ9ym6kXWtu39YbeL4zCFT7Sn97wx5agt5ag+/MyJTbSJT7eF3RuEzVdiCqjrm8VFJt5nZYjM7T9K/UFSJlo6ZDZnZQknHSDrGzBaa2ZCkbZLOMrO11fZNkl4q4Y2J90j6bUmXuvuRuu0cnyoyNR+ZSkSeWiBP85GnlshUC2RqPjLVUthMuXthH4qmO3xM0mFJuyV9bdD7NMBjcaskb3jcWm37Z5J2Sjoi6RlJKwe9v30+NqPV4/Geoq7e2qPC8TnqWJGpuWNBppofF/LU/rEiT3PHgjzFHxsy1f6xIlNzx4JMxR+b4Jkq7LTpAAAAANBrhR3yBwAAAAC9RkEFAAAAAClRUAEAAABAShRUAAAAAJASBRUAAAAApERBBQAAAAApUVABAAAAQEoUVAAAAACQEgUVAAAAAKT0/wFx0txpoz32QAAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "_,axs = plt.subplots(1,4,figsize=(12,3))\n", + "for ax in axs: show_preds(apply_step(params, False), ax)\n", + "plt.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Step 7 is to *stop*. We just decided to stop after 10 epochs arbitrarily. In practice, we watch the training and validation losses and our metrics to decide when to stop, as we've discussed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Summarizing gradient descent" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To summarize, at the beginning, the weights of our model can be random (training *from scratch*) or come from of a pretrained model (*transfer learning*). In the first case, the output we will get from our inputs won't have anything to do with what we want, and even in the second case, it's very likely the pretrained model won't be very good at the speficic task we are targetting. So the model will need to *learn* better weights.\n", + "\n", + "To do this, we will compare the outputs the model gives us with our targets (we have labelled data, so we know what result the model should give) using a *loss function*, which returns a number that needs to be as low as possible. Our weights need to be improved. To do this, we take a few data items (such as images) that we feed to our model. After going through our model, we compare to the corresponding targets using our loss function. The score we get tells us how wrong our predictions were, and we will change the weights a little bit to make it slightly better.\n", + "\n", + "To find how to change the weights to make the loss a bit better, we use calculus to calculate the *gradient* (actually, we let PyTorch do it for us!) Let's imagine you are lost in the mountains with your car parked at the lowest point. To find your way, you might wander in a random direction but that probably won't help much. Since you know you your vehicle is at the lowest point, you would be better to go downhill. By always taking a step in the direction of the steepest slope, you should eventually arrive at your destination. We use the gradient to tell us how big a step to take; specifically, we multiply the gradient by a number we choose called the *learning rate* to decide on the step size." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## MNIST loss function" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's get back to our MNIST problem. As we've seen, if we are going to calculate gradients (which we need), then we need some *loss function* that represents how good our model is. The obvious approach would be to use the accuracy for this purpose. In this case, we would calculate our prediction for each image, and then calculate the overall accuracy (remember, at first we simply use random weights), and then calculate the gradients of each weight with respect to that accuracy calculation.\n", "\n", "Unfortunately, we have a significant technical problem here. The gradient of a function is its *slope*, or its steepness, which can be defined as *rise over run* -- that is, how much the value of function goes up or down, divided by how much you changed the input. We can write this in maths: `(y_new-y_old) / (x_new-x_old)`. Specifically, it is defined when x_new is very similar to x_old, meaning that their difference is very small. But accuracy only changes at all when a prediction changes from a 3 to a 7, or vice versa. So the problem is that a small change in weights from from x_old to x_new isn't likely to cause any prediction to change, so `(y_new - y_old)` will be zero. (In other words, the gradient is zero almost everywhere.) As a result, a very small change in the value of a weight will often not actually change the accuracy at all. This means it is not useful to use accuracy as a loss function. When we use accuracy as a loss function, most of the time our gradients will actually be zero, and the model will not be able to learn from that number. That is not much use at all!\n", "\n", @@ -2964,102 +3522,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Sidebar: loss versus metric" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We now have two terms which are somewhat similar: loss and metric. They are similar because they are both measures of how well your model is performing. The key difference, though, is that the loss must be a function which has a meaningful derivative. It can't have big flat sections, and large jumps, but instead must be reasonably smooth. Therefore, sometimes it does not really reflect exactly what we are trying to achieve, but is something that is a compromise between our real goal, and a function that can be optimised using its gradient. The loss function is calculated for each item in our dataset, and then at the end of an epoch these are all averaged, and the overall mean loss is reported for the epoch.\n", + "We now have two terms which are somewhat similar: *loss* and *metric*. They are similar because they are both measures of how well your model is performing. The key difference, though, is that the loss must be a function which has a meaningful derivative. It can't have big flat sections, and large jumps, but instead must be reasonably smooth. Therefore, sometimes it does not really reflect exactly what we are trying to achieve, but is something that is a compromise between our real goal, and a function that can be optimised using its gradient. The loss function is calculated for each item in our dataset, and then at the end of an epoch these are all averaged, and the overall mean loss is reported for the epoch.\n", "\n", "Metrics, on the other hand, are the numbers that we really care about. These are the things which are printed at the end of each epoch, and tell us how our model is really doing. It is important that we learn to focus on these metrics, rather than the loss, when judging the performance of a model." ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### End sidebar" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Stepping with a learning rate" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The gradient only tells us the slope of our function, it doesn't actually tell us how far to adjust the parameters. It gives us some idea of how far to adjust them; if the slope is very large, then that may suggest that we have more adjustments to do, whereas if the slope is very small, that may suggest that we are close to the optimal value.\n", - "\n", - "Deciding how to change our parameters based on the value of the gradients is an important part of the deep learning process. Nearly all approaches start with the basic idea of multiplying the gradient by some small number, called the *learning rate* (LR). The learning rate is often a number between 0.001 and 0.1, although it could be anything. Often, people select a learning rate just by trying a few, and finding which results in the best model after training (we'll show you a better approach later in this book, called the *learning rate finder*). Once you've picked a learning rate, you can adjust your parameters using this simple function:\n", - "\n", - "```\n", - "w -= gradient(w) * lr\n", - "```\n", - "\n", - "This is known as *stepping* your parameters, using a *optimiser step*.\n", - "\n", - "If you pick a learning rate that's too low, it can mean having to do for a lot of steps:" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"An" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Although picking a learning rate that's too high is even worse--it can actually result in the loss getting *worse*!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"An" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the learning rate is too high, it may also \"bounce\" around, rather than actually diverging; this has the result of taking many steps to train successfully:" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"An" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Summarizing gradient descent" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To summarize, at the beginning, the weights of our model can be random (training *from scratch*) or come from of a pretrained model (*transfer learning*). In the first case, the output we will get from our inputs won't have anything to do with what we want, and even in the second case, it's very likely the pretrained model won't be very good at the speficic task we are targetting. So the model will need to *learn* better weights.\n", - "\n", - "To do this, we will compare the outputs the model gives us with our targets (we have labelled data, so we know what result the model should give) using a *loss function*, which returns a number that needs to be as low as possible. Our weights need to be improved. To do this, we take a few data items (such as images) that we feed to our model. After going through our model, we compare to the corresponding targets using our loss function. The score we get tells us how wrong our predictions were, and we will change the weights a little bit to make it slightly better.\n", - "\n", - "To find how to change the weights to make the loss a bit better, we use calculus to calculate the *gradient* (actually, we let PyTorch do it for us!) Let's imagine you are lost in the mountains with your car parked at the lowest point. To find your way, you might wander in a random direction but that probably won't help much. Since you know you your vehicle is at the lowest point, you would be better to go downhill. By always taking a step in the direction of the steepest slope, you should eventually arrive at your destination. We use the gradient to tell us how big a step to take; specifically, we multiply the gradient by a number we choose called the *learning rate* to decide on the step size." - ] - }, { "cell_type": "markdown", "metadata": {}, @@ -4874,6 +5341,18 @@ "display_name": "Python 3", "language": "python", "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.6" } }, "nbformat": 4, diff --git a/10_nlp.ipynb b/10_nlp.ipynb index 58f04e1..b01c5fb 100644 --- a/10_nlp.ipynb +++ b/10_nlp.ipynb @@ -2,7 +2,7 @@ "cells": [ { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -113,18 +113,14 @@ }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Tokenization" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "When we said, *convert the text into a list of words*, we left out a lot of details. For instance, what do we do with punctuation? How do we deal with a word like \"don't\"? Is it one word, or two? What about long medical or chemical words? Should they be split into their separate pieces of meaning? How about hyphenated words? What about languages like German and Poland where we can create really long words from many, many pieces? What about languages like Japanese and Chinese which don't use bases at all, and don't really have a well-defined idea of *word*?\n", "\n", @@ -139,27 +135,21 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "> jargon: token: one element of a list created by the tokenisation process. It could be a word, part of a word (a _subword_), or a single character." ] }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Word tokenization with fastai" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Rather than providing its own tokenizers, fastai instead provides a consistent interface to a range of tokenisers in external libraries. Tokenization is an active field of research, and new and improved tokenizers are coming out all the time, so the defaults that fastai uses change too. However, the API and options shouldn't change too much, since fastai tries to maintain a consistent API even as the underlying technology changes.\n", "\n", @@ -168,10 +158,8 @@ }, { "cell_type": "code", - "execution_count": 3, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "from fastai2.text.all import *\n", @@ -180,19 +168,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "We'll need to grab the text files in order to try out a tokenizer. Just like `get_image_files`, which we've used many times already, gets all the image files in a path, `get_text_files` gets all the text files in a path. We can also optionally pass `folders` to restrict the search to a particular list of subfolders:" ] }, { "cell_type": "code", - "execution_count": 4, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "files = get_text_files(path, folders = ['train', 'test', 'unsup'])" @@ -200,19 +184,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Here's a review that we'll tokenize (we'll just print the start of it here to save space):" ] }, { "cell_type": "code", - "execution_count": 5, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -220,7 +200,7 @@ "'This movie, which I just discovered at the video store, has apparently sit '" ] }, - "execution_count": 5, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -231,9 +211,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "As we write this book, the default *English word tokenizer* for fastai uses a library called *spaCy*. This uses a sophisticated rules engine that has special rules for URLs, individual special English words, and much more. Rather than directly using `SpacyTokenizer`, however, we'll use `WordTokenizer`, since that will always point to fastai's current default word tokenizer (which may not always be Spacy, depending when you're reading this).\n", "\n", @@ -242,10 +220,8 @@ }, { "cell_type": "code", - "execution_count": 6, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "name": "stdout", @@ -263,19 +239,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "As you see, spaCy has mainly just separated out the words and punctuation. But it does something else here too: it has split \"it's\" into \"it\" and \"'s\". That makes intuitive sense--these are separate words, really. Tokenization is a surprisingly subtle task, when you think about all the little details that have to be handled. spaCy handles these for us, for instance, here we see that \".\" is separated when it terminates a sentence, but not in an acronym or number:" ] }, { "cell_type": "code", - "execution_count": 7, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -283,7 +255,7 @@ "(#9) ['The','U.S.','dollar','$','1','is','$','1.00','.']" ] }, - "execution_count": 7, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -294,19 +266,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "fastai then adds some additional functionality to the tokenization process with the `Tokenizer` class:" ] }, { "cell_type": "code", - "execution_count": 8, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "name": "stdout", @@ -323,9 +291,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "There are now some tokens added that start with the characters \"xx\", which is not a common word prefix in English. These are *special tokens*.\n", "\n", @@ -346,10 +312,8 @@ }, { "cell_type": "code", - "execution_count": 9, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -364,7 +328,7 @@ " ]" ] }, - "execution_count": 9, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -375,9 +339,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "As always, you can look at the source code of each of them in a notebook by typing\n", "\n", @@ -393,25 +355,21 @@ "- `spec_add_spaces`: add spaces around / and # ;\n", "- `rm_useless_spaces`: remove all repetitions of the space character ;\n", "- `replace_all_caps`: lowercase a word written in all caps and adds a special token for all caps (xxcap) in front of it ;\n", - "- `replace_maj`: lowercase a capilaized word and adds a special token for capitalized (xxmaj) in front of it ;\n", + "- `replace_maj`: lowercase a capitalized word and adds a special token for capitalized (xxmaj) in front of it ;\n", "- `lowercase`: lowercase all text and adds a special token at the beginning (xxbos) and/or the end (xxeos)." ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Let's take a look at a few of them in action:" ] }, { "cell_type": "code", - "execution_count": 10, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -419,7 +377,7 @@ "\"(#11) ['xxbos','©','xxmaj','fast.ai','xxrep','3','w','.fast.ai','/','xxup','index'...]\"" ] }, - "execution_count": 10, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -430,24 +388,20 @@ }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Subword tokenization" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ - "In addition to the *word tokenization* approach seen in the last section, another popular tokenization method is *subword tokenization*. Word tokenization relies on an assumption that spaces provide a useful separation of components of meaning in a sentence. However, this assumption is not always appropriate. For instance, consider this sentence: 我的名字是郝杰瑞 (which means \"My name is Jeremy Howard\" in Chinese). That's not going to work very well with a word tokenizer, because there are no spaces in it! Languages like Chinese and Japanese don't use spaces, and in fact they don't even have a well-defined concept of a \"word\". There are also \"agglutinative languages\", like Polish, which can add many morphemes together to create very long \"words\" which include a lot of separate pieces of information.\n", + "In addition to the *word tokenization* approach seen in the last section, another popular tokenization method is *subword tokenization*. Word tokenization relies on an assumption that spaces provide a useful separation of components of meaning in a sentence. However, this assumption is not always appropriate. For instance, consider this sentence: 我的名字是郝杰瑞 (which means \"My name is Jeremy Howard\" in Chinese). That's not going to work very well with a word tokenizer, because there are no spaces in it! Languages like Chinese and Japanese don't use spaces, and in fact they don't even have a well-defined concept of a \"word\". There are also languages, like Turkish and Hungarian, which can add many bits together without spaces, to create very long words which include a lot of separate pieces of information.\n", "\n", "To handle these cases, it's generally best to use subword tokenization. This proceeds in two steps:\n", "\n", - "1. Analyze a corpus of documents to find the most commonly occuring groups of letters. These become the vocab.\n", + "1. Analyze a corpus of documents to find the most commonly occurring groups of letters. These become the vocab.\n", "2. Tokenize the corpus using this vocab of *subword units*.\n", "\n", "Let's look at an example. For our corpus, we'll use the first 2000 movie reviews:" @@ -455,10 +409,8 @@ }, { "cell_type": "code", - "execution_count": 19, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "txts = L(o.open().read() for o in files[:2000])" @@ -466,19 +418,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "We instantiate our tokenizer, passing in the size of the vocab we want to create, and then we need to \"train\" it. That is, we need to have it read our documents, and find the common sequences of characters, to create the vocab. This is done with `setup`. As we'll see shortly, `setup` is a special fastai method that is called automatically in our usual data processing pipelines. Since we're doing everything manually at the moment, however, we have to call it ourselves. Here's a function that does these steps for a given vocab size, and shows an example output:" ] }, { "cell_type": "code", - "execution_count": 20, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "def subword(sz):\n", @@ -489,19 +437,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Let's try it out:" ] }, { "cell_type": "code", - "execution_count": 21, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -519,7 +463,7 @@ "'▁This ▁movie , ▁which ▁I ▁just ▁dis c over ed ▁at ▁the ▁video ▁st or e , ▁has ▁a p par ent ly ▁s it ▁around ▁for ▁a ▁couple ▁of ▁years ▁without ▁a ▁dis t ri but or . ▁It'" ] }, - "execution_count": 21, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -530,9 +474,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "When using fastai's subword tokenizer, the special character `▁` represents a space character in the original text.\n", "\n", @@ -541,10 +483,8 @@ }, { "cell_type": "code", - "execution_count": 22, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -562,7 +502,7 @@ "'▁ T h i s ▁movie , ▁w h i ch ▁I ▁ j us t ▁ d i s c o ver ed ▁a t ▁the ▁ v id e o ▁ st or e , ▁h a s'" ] }, - "execution_count": 22, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -573,19 +513,15 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "On the other hand, if we use a larger vocab, then most common English words will end up in the vocab themselves, and we will not need as many to represent a sentence:" ] }, { "cell_type": "code", - "execution_count": 23, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -603,7 +539,7 @@ "\"▁This ▁movie , ▁which ▁I ▁just ▁discover ed ▁at ▁the ▁video ▁store , ▁has ▁apparently ▁sit ▁around ▁for ▁a ▁couple ▁of ▁years ▁without ▁a ▁distributor . ▁It ' s ▁easy ▁to ▁see ▁why . ▁The ▁story ▁of ▁two ▁friends ▁living\"" ] }, - "execution_count": 23, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -614,9 +550,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Picking a subword vocab size represents a compromise: a larger vocab means more fewer tokens per sentence, which means faster training, less memory, and less state for the model to remember; but on the downside, it means larger embedding matrices, which require more data to learn.\n", "\n", @@ -644,7 +578,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -669,7 +603,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -678,7 +612,7 @@ "(#228) ['xxbos','xxmaj','this','movie',',','which','i','just','discovered','at'...]" ] }, - "execution_count": 36, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -697,7 +631,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -706,7 +640,7 @@ "\"(#2000) ['xxunk','xxpad','xxbos','xxeos','xxfld','xxrep','xxwrep','xxup','xxmaj','the','.',',','a','and','of','to','is','in','i','it'...]\"" ] }, - "execution_count": 37, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -730,7 +664,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -739,7 +673,7 @@ "tensor([ 2, 8, 21, 28, 11, 90, 18, 59, 0, 45, 9, 351, 499, 11, 72, 533, 584, 146, 29, 12])" ] }, - "execution_count": 40, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -757,7 +691,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -766,7 +700,7 @@ "'xxbos xxmaj this movie , which i just xxunk at the video store , has apparently sit around for a'" ] }, - "execution_count": 42, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -801,7 +735,7 @@ }, { "cell_type": "code", - "execution_count": 68, + "execution_count": null, "metadata": { "hide_input": true }, @@ -1189,18 +1123,18 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Going back to our dataset, the first step is to transform the individual texts into a stream by concatenating them together. As with images, it's best to randomize the order in which the inputs come, so at the beginnaing of each epoch we will shuffle the entries to make a new stream (we shuffle the order of the documents, not the order of the words inside, otherwise the text would not make sense anymore).\n", + "Going back to our dataset, the first step is to transform the individual texts into a stream by concatenating them together. As with images, it's best to randomize the order in which the inputs come, so at the beginning of each epoch we will shuffle the entries to make a new stream (we shuffle the order of the documents, not the order of the words inside, otherwise the text would not make sense anymore).\n", "\n", "We will then cut this stream into a certain number of batches (which is our *batch size*). For instance, if the stream has 50,000 tokens and we set a batch size of 10, this will give us 10 mini-streams of 5,000 tokens. What is important is that we preserve the order of the tokens (so from 1 to 5,000 for the first mini-stream, then from 5,001 to 10,000...) because we want the model to read continuous rows of text (as in our example above). This is why each text has been added a `xxbos` token during preprocessing, so that the model knows when it reads the stream we are beginning a new entry.\n", "\n", - "So to recap, at every epoch we shuffle our collection of documents to pick one docment, and then we transform that one into a stream of tokens. We then cut that stream into a batch of fixed-size consecutive mini-streams. Our model will then read the mini-streams in order, and thanks to an inner state, it will produce the same activation whatever sequence length you picked.\n", + "So to recap, at every epoch we shuffle our collection of documents to pick one document, and then we transform that one into a stream of tokens. We then cut that stream into a batch of fixed-size consecutive mini-streams. Our model will then read the mini-streams in order, and thanks to an inner state, it will produce the same activation whatever sequence length you picked.\n", "\n", "This is all done behind the scenes by the fastai library when we create a `LMDataLoader`. We can create one by first applying our `Numericalize` object to the tokenized texts:" ] }, { "cell_type": "code", - "execution_count": 44, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -1216,7 +1150,7 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -1232,7 +1166,7 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1241,7 +1175,7 @@ "(torch.Size([64, 72]), torch.Size([64, 72]))" ] }, - "execution_count": 51, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -1260,7 +1194,7 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1269,7 +1203,7 @@ "'xxbos xxmaj this movie , which i just xxunk at the video store , has apparently sit around for a'" ] }, - "execution_count": 49, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -1287,7 +1221,7 @@ }, { "cell_type": "code", - "execution_count": 50, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1296,7 +1230,7 @@ "'xxmaj this movie , which i just xxunk at the video store , has apparently sit around for a couple'" ] }, - "execution_count": 50, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -1314,18 +1248,14 @@ }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Language model using DataBlock" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "fastai handles tokenization and numericalization automatically when `TextBlock` is passed to `DataBlock`. All of the arguments that can be passed to `Tokenize` and `Numericalize` can also be passed to `TextBlock`. In the next chapter we'll discuss the easiest ways to run each of these steps separately, to ease debugging--but you can always just debug by running them manually on a subset of your data as shown in the previous sections. And don't forget about `DataBlock`'s handy `summary` method, which is very useful for debugging data issues.\n", "\n", @@ -1334,10 +1264,8 @@ }, { "cell_type": "code", - "execution_count": 54, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "get_imdb = partial(get_text_files, folders=['train', 'test', 'unsup'])\n", @@ -1350,9 +1278,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "One thing that's different to previous types used in `DataBlock` is that we're not just using the class directly (i.e. `TextBlock(...)`, but instead are calling a *class method*. A class method is a Python method which, as the name suggests, belongs to a *class* rather than an *object*. (Be sure to search online for more information about class methods if you're not familiar with them, since they're commonly used in many Python libraries and applications; we've used them a few times previously in the book, but haven't called attention to them.) The reason that `TextBlock` is special is that setting up the numericalizer's vocab can take a long time (we have to read every document and tokenize it to get the vocab); to be as efficient as possible fastai does things such as: \n", "\n", @@ -1366,10 +1292,8 @@ }, { "cell_type": "code", - "execution_count": 58, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [ { "data": { @@ -1410,28 +1334,22 @@ }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Fine tuning the language model" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "For converting the integer word indices into activations that we can use for our neural network, we will use embeddings, just like we did for collaborative filtering and tabular modelling. Then those embeddings are fed in a *Recurrent Neural Network* (RNN), using an architecture called *AWD_LSTM* (we will show how to write such a model from scratch in <>). As we discussed earlier, the embeddings in the pretrained model are merged with random embeddings added for words that weren't in the pretraining vocabulary. This is handled automatically inside `language_model_learner`:" ] }, { "cell_type": "code", - "execution_count": 59, - "metadata": { - "hidden": true - }, + "execution_count": null, + "metadata": {}, "outputs": [], "source": [ "learn = language_model_learner(\n", @@ -1441,9 +1359,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "The loss function used by default is cross entropy loss, since we essentially have a classification problem (the different categories being the words in our vocab). A metric often used in NLP for language models is called *perplexity*. It is the exponential of the loss (i.e. `torch.exp(cross_entropy)`). We will also add accuracy, to see how many times our model is right when trying to predict the next word, since cross entropy (as we've seen) is both hard to interpret, and also tells you more about the model's confidence, rather than just its accuracy\n", "\n", @@ -1452,18 +1368,14 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "\"Diagram" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll just use `fit_one_cycle`. Just like `cnn_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (which is the only part of the model that contains randomly initialized weights--i.e. embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):" ] @@ -1471,9 +1383,7 @@ { "cell_type": "code", "execution_count": null, - "metadata": { - "hidden": true - }, + "metadata": {}, "outputs": [ { "data": { @@ -1679,7 +1589,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Once this is done, we save all of our model except the final layer that converts activations to probabilities of picking each token in our vocabulary. The model not without the final layer is called the *encoder*. We can save it with `save_encoder`:" + "Once this is done, we save all of our model except the final layer that converts activations to probabilities of picking each token in our vocabulary. The model not including the final layer is called the *encoder*. We can save it with `save_encoder`:" ] }, { @@ -1707,18 +1617,14 @@ }, { "cell_type": "markdown", - "metadata": { - "heading_collapsed": true - }, + "metadata": {}, "source": [ "### Text generation" ] }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "Before using this to fine-tune a classifier on the review, we can use our model to generate random reviews: since it's trained to guess what the next word of the sentence is, we can use it to write new reviews:" ] @@ -1726,9 +1632,7 @@ { "cell_type": "code", "execution_count": null, - "metadata": { - "hidden": true - }, + "metadata": {}, "outputs": [ { "data": { @@ -1761,9 +1665,7 @@ { "cell_type": "code", "execution_count": null, - "metadata": { - "hidden": true - }, + "metadata": {}, "outputs": [ { "name": "stdout", @@ -1780,9 +1682,7 @@ }, { "cell_type": "markdown", - "metadata": { - "hidden": true - }, + "metadata": {}, "source": [ "As you can see, we add some randomness (we pick a random word based on the probabilities returned by the model) so you don't get exactly the same review twice. Our model doesn't have any programmed knowledge of the structure of a sentence or grammar rules, yet it has clearly learned a lot about English sentences: we can see it capitalized properly (I is just transformed to i with our rules -- they require two characters or more to consider a word is capitalized -- so it's normal to see it lowercased), and is using consistent tense. The general review make sense at first glance, and it's only if you read carefully you can notice something is a bit off. Not bad for a model trained in a couple of hours! \n", "\n", @@ -1829,9 +1729,7 @@ { "cell_type": "code", "execution_count": null, - "metadata": { - "scrolled": true - }, + "metadata": {}, "outputs": [ { "data": { @@ -1891,7 +1789,7 @@ }, { "cell_type": "code", - "execution_count": 66, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -1907,7 +1805,7 @@ }, { "cell_type": "code", - "execution_count": 67, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1916,7 +1814,7 @@ "(#10) [228,238,121,290,196,194,533,124,581,155]" ] }, - "execution_count": 67, + "execution_count": null, "metadata": {}, "output_type": "execute_result" } @@ -2340,10 +2238,7 @@ }, "toc": { "base_numbering": 1, - "nav_menu": { - "height": "367.997px", - "width": "278.999px" - }, + "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true,