Mirror of https://github.com/fastai/fastbook.git (synced 2025-04-04 01:40:44 +00:00)

Commit 9f9e59790c (parent b7f756b49d): vision_learner

This commit renames `cnn_learner` to `vision_learner` throughout the notebooks, matching the function's current name in fastai.
@@ -576,7 +576,7 @@
 " path, get_image_files(path), valid_pct=0.2, seed=42,\n",
 " label_func=is_cat, item_tfms=Resize(224))\n",
 "\n",
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1)"
 ]
 },
@@ -1580,7 +1580,7 @@
 "The fifth line of the code training our image recognizer tells fastai to create a *convolutional neural network* (CNN) and specifies what *architecture* to use (i.e. what kind of model to create), what data we want to train it on, and what *metric* to use:\n",
 "\n",
 "```python\n",
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "```\n",
 "\n",
 "Why a CNN? It's the current state-of-the-art approach to creating computer vision models. We'll be learning all about how CNNs work in this book. Their structure is inspired by how the human vision system works.\n",
@@ -1596,9 +1596,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"`cnn_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
+"`vision_learner` also has a parameter `pretrained`, which defaults to `True` (so it's used in this case, even though we haven't specified it), which sets the weights in your model to values that have already been trained by experts to recognize a thousand different categories across 1.3 million photos (using the famous [*ImageNet* dataset](http://www.image-net.org/)). A model that has weights that have already been trained on some other dataset is called a *pretrained model*. You should nearly always use a pretrained model, because it means that your model, before you've even shown it any of your data, is already very capable. And, as you'll see, in a deep learning model many of these capabilities are things you'll need, almost regardless of the details of your project. For instance, parts of pretrained models will handle edge, gradient, and color detection, which are needed for many tasks.\n",
 "\n",
-"When using a pretrained model, `cnn_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
+"When using a pretrained model, `vision_learner` will remove the last layer, since that is always specifically customized to the original training task (i.e. ImageNet dataset classification), and replace it with one or more new layers with randomized weights, of an appropriate size for the dataset you are working with. This last part of the model is known as the *head*.\n",
 "\n",
 "Using pretrained models is the *most* important method we have to allow us to train more accurate models, more quickly, with less data, and less time and money. You might think that would mean that using pretrained models would be the most studied area in academic deep learning... but you'd be very, very wrong! The importance of pretrained models is generally not recognized or discussed in most courses, books, or software library features, and is rarely considered in academic papers. As we write this at the start of 2020, things are just starting to change, but it's likely to take a while. So be careful: most people you speak to will probably greatly underestimate what you can do in deep learning with few resources, because they probably won't deeply understand how to use pretrained models.\n",
 "\n",
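Since this hunk is about the `pretrained` parameter, a minimal sketch of toggling it may help. This assumes the `dls` built earlier in the chapter is in scope and simply exercises the renamed API:

```python
from fastai.vision.all import *

# Default: pretrained=True downloads ImageNet weights and replaces the head
# with new randomly initialized layers sized for this dataset.
learn = vision_learner(dls, resnet34, metrics=error_rate)

# With pretrained=False the same architecture is built, but every layer
# starts from random weights (training from scratch).
learn_scratch = vision_learner(dls, resnet34, pretrained=False,
                               metrics=error_rate)
```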
@@ -2914,9 +2914,21 @@
 "split_at_heading": true
 },
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
+"language_info": {
+"codemirror_mode": {
+"name": "ipython",
+"version": 3
+},
+"file_extension": ".py",
+"mimetype": "text/x-python",
+"name": "python",
+"nbconvert_exporter": "python",
+"pygments_lexer": "ipython3",
+"version": "3.9.5"
+}
 },
 "nbformat": 4,
@@ -1063,7 +1063,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet18, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet18, metrics=error_rate)\n",
 "learn.fine_tune(4)"
 ]
 },
@@ -4984,7 +4984,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"To create a `Learner` without using an application (such as `cnn_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
+"To create a `Learner` without using an application (such as `vision_learner`) we need to pass in all the elements that we've created in this chapter: the `DataLoaders`, the model, the optimization function (which will be passed the parameters), the loss function, and optionally any metrics to print:"
 ]
 },
 {
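As a reference for the paragraph above, assembling a `Learner` by hand looks roughly like this; `simple_net`, `mnist_loss`, and `batch_accuracy` are the objects the chapter builds, shown here from memory as a sketch rather than the exact cell:

```python
# Everything vision_learner normally wires up, passed explicitly:
learn = Learner(dls,                     # the DataLoaders
                simple_net,              # the model
                opt_func=SGD,            # the optimization function
                loss_func=mnist_loss,    # the loss function
                metrics=batch_accuracy)  # optional metrics to print
learn.fit(10, lr=0.1)
```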
@@ -5706,7 +5706,7 @@
 ],
 "source": [
 "dls = ImageDataLoaders.from_folder(path)\n",
-"learn = cnn_learner(dls, resnet18, pretrained=False,\n",
+"learn = vision_learner(dls, resnet18, pretrained=False,\n",
 " loss_func=F.cross_entropy, metrics=accuracy)\n",
 "learn.fit_one_cycle(1, 0.1)"
 ]
@@ -610,7 +610,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(2)"
 ]
 },
@@ -1774,7 +1774,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1, base_lr=0.1)"
 ]
 },
@@ -1821,7 +1821,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "lr_min,lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))"
 ]
 },
@@ -1927,7 +1927,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(2, base_lr=3e-3)"
 ]
 },
@@ -2053,7 +2053,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fit_one_cycle(3, 3e-3)"
 ]
 },
@@ -2406,7 +2406,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fit_one_cycle(3, 3e-3)\n",
 "learn.unfreeze()\n",
 "learn.fit_one_cycle(12, lr_max=slice(1e-6,1e-4))"
@@ -2626,7 +2626,7 @@
 ],
 "source": [
 "from fastai.callback.fp16 import *\n",
-"learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
+"learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
 "learn.fine_tune(6, freeze_epochs=3)"
 ]
 },
@@ -885,7 +885,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `cnn_learner` to create a `Learner`, so we can look at its activations:"
+"Now we'll create our `Learner`. We saw in <<chapter_mnist_basics>> that a `Learner` object contains four main things: the model, a `DataLoaders` object, an `Optimizer`, and the loss function to use. We already have our `DataLoaders`, we can leverage fastai's `resnet` models (which we'll learn how to create from scratch later), and we know how to create an `SGD` optimizer. So let's focus on ensuring we have a suitable loss function. To do this, let's use `vision_learner` to create a `Learner`, so we can look at its activations:"
 ]
 },
 {
@@ -894,7 +894,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet18)"
+"learn = vision_learner(dls, resnet18)"
 ]
 },
 {
@@ -1225,7 +1225,7 @@
 }
 ],
 "source": [
-"learn = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
+"learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
 "learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)"
 ]
 },
@@ -1782,7 +1782,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"As usual, we can use `cnn_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
+"As usual, we can use `vision_learner` to create our `Learner`. Remember way back in <<chapter_intro>> how we used `y_range` to tell fastai the range of our targets? We'll do the same here (coordinates in fastai and PyTorch are always rescaled between -1 and +1):"
 ]
 },
 {
@@ -1791,7 +1791,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet18, y_range=(-1,1))"
+"learn = vision_learner(dls, resnet18, y_range=(-1,1))"
 ]
 },
 {
@@ -1880,7 +1880,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `cnn_learner` using the `loss_func` parameter.\n",
+"This makes sense, since when coordinates are used as the dependent variable, most of the time we're likely to be trying to predict something as close as possible; that's basically what `MSELoss` (mean squared error loss) does. If you want to use a different loss function, you can pass it to `vision_learner` using the `loss_func` parameter.\n",
 "\n",
 "Note also that we didn't specify any metrics. That's because the MSE is already a useful metric for this task (although it's probably more interpretable after we take the square root). \n",
 "\n",
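To make the `loss_func` hook concrete, here is a sketch, assuming the point-regression `dls` from this section; `MSELossFlat` is fastai's flattened MSE wrapper:

```python
# Explicitly choosing the loss instead of relying on fastai's default pick:
learn = vision_learner(dls, resnet18, y_range=(-1,1),
                       loss_func=MSELossFlat())
```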
@@ -371,7 +371,7 @@
 "\n",
 "This means that when you distribute a model, you need to also distribute the statistics used for normalization, since anyone using it for inference, or transfer learning, will need to use the same statistics. By the same token, if you're using a model that someone else has trained, make sure you find out what normalization statistics they used, and match them.\n",
 "\n",
-"We didn't have to handle normalization in previous chapters because when using a pretrained model through `cnn_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
+"We didn't have to handle normalization in previous chapters because when using a pretrained model through `vision_learner`, the fastai library automatically adds the proper `Normalize` transform; the model has been pretrained with certain statistics in `Normalize` (usually coming from the ImageNet dataset), so the library can fill those in for you. Note that this only applies with pretrained models, which is why we need to add this information manually here, when training from scratch.\n",
 "\n",
 "All our training up until now has been done at size 224. We could have begun training at a smaller size before going to that. This is called *progressive resizing*."
 ]
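A sketch of supplying the statistics by hand when no pretrained model fills them in (the chapter's own code builds its `DataBlock` slightly differently; `imagenet_stats` is fastai's stored ImageNet mean/std):

```python
# Training from scratch: add Normalize explicitly as a batch transform.
dls = ImageDataLoaders.from_folder(
    path, valid_pct=0.2, item_tfms=Resize(224),
    batch_tfms=Normalize.from_stats(*imagenet_stats))
```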
@@ -919,7 +919,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `cnn_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
+"Now that we have defined our architecture, and created our parameter matrices, we need to create a `Learner` to optimize our model. In the past we have used special functions, such as `vision_learner`, which set up everything for us for a particular application. Since we are doing things from scratch here, we will use the plain `Learner` class:"
 ]
 },
 {
10_nlp.ipynb (16 changed lines)
@@ -1424,7 +1424,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `cnn_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
+"It takes quite a while to train each epoch, so we'll be saving the intermediate model results during the training process. Since `fine_tune` doesn't do that for us, we'll use `fit_one_cycle`. Just like `vision_learner`, `language_model_learner` automatically calls `freeze` when using a pretrained model (which is the default), so this will only train the embeddings (the only part of the model that contains randomly initialized weights—i.e., embeddings for words that are in our IMDb vocab, but aren't in the pretrained model vocab):"
 ]
 },
 {
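The workflow this cell describes looks roughly like the following (a sketch from memory of the chapter's code; `dls_lm` is the IMDb language-model `DataLoaders`):

```python
learn = language_model_learner(
    dls_lm, AWD_LSTM, drop_mult=0.3,
    metrics=[accuracy, Perplexity()]).to_fp16()
learn.fit_one_cycle(1, 2e-2)  # frozen, so only the new embeddings train
learn.save('1epoch')          # checkpoint by hand, since fine_tune won't
```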
@@ -2276,9 +2276,21 @@
 "split_at_heading": true
 },
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
+"language_info": {
+"codemirror_mode": {
+"name": "ipython",
+"version": 3
+},
+"file_extension": ".py",
+"mimetype": "text/x-python",
+"name": "python",
+"nbconvert_exporter": "python",
+"pygments_lexer": "ipython3",
+"version": "3.9.5"
+}
 },
 "nbformat": 4,
@@ -1209,7 +1209,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `cnn_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
+"We can now train a model using this `DataLoaders`. It will need a bit more customization than the usual model provided by `vision_learner` since it has to take two images instead of one, but we will see how to create such a model and train it in <<chapter_arch_dtails>>."
 ]
 },
 {
@@ -58,21 +58,21 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"For computer vision application we use the functions `cnn_learner` and `unet_learner` to build our models, depending on the task. In this section we'll explore how to build the `Learner` objects we used in Parts 1 and 2 of this book."
+"For computer vision application we use the functions `vision_learner` and `unet_learner` to build our models, depending on the task. In this section we'll explore how to build the `Learner` objects we used in Parts 1 and 2 of this book."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### cnn_learner"
+"### vision_learner"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Let's take a look at what happens when we use the `cnn_learner` function. We begin by passing this function an architecture to use for the *body* of the network. Most of the time we use a ResNet, which you already know how to create, so we don't need to delve into that any further. Pretrained weights are downloaded as required and loaded into the ResNet.\n",
+"Let's take a look at what happens when we use the `vision_learner` function. We begin by passing this function an architecture to use for the *body* of the network. Most of the time we use a ResNet, which you already know how to create, so we don't need to delve into that any further. Pretrained weights are downloaded as required and loaded into the ResNet.\n",
 "\n",
 "Then, for transfer learning, the network needs to be *cut*. This refers to slicing off the final layer, which is only responsible for ImageNet-specific categorization. In fact, we do not slice off only this layer, but everything from the adaptive average pooling layer onwards. The reason for this will become clear in just a moment. Since different architectures might use different types of pooling layers, or even completely different kinds of *heads*, we don't just search for the adaptive pooling layer to decide where to cut the pretrained model. Instead, we have a dictionary of information that is used for each model to determine where its body ends, and its head starts. We call this `model_meta`—here it is for resnet-50:"
 ]
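To see the cut point described above, you can inspect `model_meta` directly; the output below is paraphrased and may vary by fastai version:

```python
from fastai.vision.all import *

model_meta[resnet50]
# e.g. {'cut': -2, 'split': <function _resnet_split>, 'stats': imagenet_stats}
# 'cut' marks where the body ends (the adaptive pooling layer onwards is
# dropped); 'stats' are the normalization statistics used in pretraining.
```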
@@ -418,7 +418,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Then we can define our `Learner` by passing the data, model, loss function, splitter, and any metric we want. Since we are not using a convenience function from fastai for transfer learning (like `cnn_learner`), we have to call `learn.freeze` manually. This will make sure only the last parameter group (in this case, the head) is trained:"
+"Then we can define our `Learner` by passing the data, model, loss function, splitter, and any metric we want. Since we are not using a convenience function from fastai for transfer learning (like `vision_learner`), we have to call `learn.freeze` manually. This will make sure only the last parameter group (in this case, the head) is trained:"
 ]
 },
 {
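A sketch of that manual setup (names follow the chapter; `default_split` is fastai's body/head parameter splitter, and the loss choice here is an assumption):

```python
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(),
                splitter=default_split, metrics=accuracy)
learn.freeze()  # only the last parameter group, the head, will train
```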
@@ -596,7 +596,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Converting an AWD-LSTM language model into a transfer learning classifier, as we did in <<chapter_nlp>>, follows a very similar process to what we did with `cnn_learner` in the first section of this chapter. We do not need a \"meta\" dictionary in this case, because we do not have such a variety of architectures to support in the body. All we need to do is select the stacked RNN for the encoder in the language model, which is a single PyTorch module. This encoder will provide an activation for every word of the input, because a language model needs to output a prediction for every next word.\n",
+"Converting an AWD-LSTM language model into a transfer learning classifier, as we did in <<chapter_nlp>>, follows a very similar process to what we did with `vision_learner` in the first section of this chapter. We do not need a \"meta\" dictionary in this case, because we do not have such a variety of architectures to support in the body. All we need to do is select the stacked RNN for the encoder in the language model, which is a single PyTorch module. This encoder will provide an activation for every word of the input, because a language model needs to output a prediction for every next word.\n",
 "\n",
 "To create a classifier from this we use an approach described in the [ULMFiT paper](https://arxiv.org/abs/1801.06146) as \"BPTT for Text Classification (BPT3C)\":"
 ]
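At the API level this conversion is wrapped by `text_classifier_learner`; a sketch, assuming a classification `DataLoaders` named `dls_clas` and an encoder saved as 'finetuned' as in the NLP chapter:

```python
learn = text_classifier_learner(dls_clas, AWD_LSTM, drop_mult=0.5,
                                metrics=accuracy)
learn.load_encoder('finetuned')  # reuse the fine-tuned language-model encoder
```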
@@ -761,7 +761,7 @@
 "1. What is `model_meta`? Try printing it to see what's inside.\n",
 "1. Read the source code for `create_head` and make sure you understand what each line does.\n",
 "1. Look at the output of `create_head` and make sure you understand why each layer is there, and how the `create_head` source created it.\n",
-"1. Figure out how to change the dropout, layer size, and number of layers created by `cnn_learner`, and see if you can find values that result in better accuracy from the pet recognizer.\n",
+"1. Figure out how to change the dropout, layer size, and number of layers created by `vision_learner`, and see if you can find values that result in better accuracy from the pet recognizer.\n",
 "1. What does `AdaptiveConcatPool2d` do?\n",
 "1. What is \"nearest neighbor interpolation\"? How can it be used to upsample convolutional activations?\n",
 "1. What is a \"transposed convolution\"? What is another name for it?\n",
@@ -110,7 +110,7 @@
 "outputs": [],
 "source": [
 "def get_learner(**kwargs):\n",
-" return cnn_learner(dls, resnet34, pretrained=False,\n",
+" return vision_learner(dls, resnet34, pretrained=False,\n",
 " metrics=accuracy, **kwargs).to_fp16()"
 ]
 },
@@ -181,7 +181,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now let's try plain SGD. We can pass `opt_func` (optimization function) to `cnn_learner` to get fastai to use any optimizer:"
+"Now let's try plain SGD. We can pass `opt_func` (optimization function) to `vision_learner` to get fastai to use any optimizer:"
 ]
 },
 {
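A sketch of that experiment (using the `get_learner` helper defined earlier in this notebook; the zeroed `moms` follows how the chapter disables momentum to get truly plain SGD):

```python
learn = get_learner(opt_func=SGD)
learn.fit_one_cycle(3, 0.03, moms=(0,0,0))  # no momentum: plain SGD steps
```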
@@ -1238,7 +1238,7 @@
 "metadata": {},
 "source": [
 "1. What is the equation for a step of SGD, in math or code (as you prefer)?\n",
-"1. What do we pass to `cnn_learner` to use a non-default optimizer?\n",
+"1. What do we pass to `vision_learner` to use a non-default optimizer?\n",
 "1. What are optimizer callbacks?\n",
 "1. What does `zero_grad` do in an optimizer?\n",
 "1. What does `step` do in an optimizer? How is it implemented in the general optimizer?\n",
@@ -141,7 +141,7 @@
 "dls = ImageDataLoaders.from_name_func(\n",
 " path, get_image_files(path), valid_pct=0.2, seed=21,\n",
 " label_func=is_cat, item_tfms=Resize(224))\n",
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1)"
 ]
 },
@@ -107,7 +107,7 @@
 " path, get_image_files(path), valid_pct=0.2, seed=42,\n",
 " label_func=is_cat, item_tfms=Resize(224))\n",
 "\n",
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1)"
 ]
 },
@@ -358,7 +358,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet18, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet18, metrics=error_rate)\n",
 "learn.fine_tune(4)"
 ]
 },
@@ -1553,7 +1553,7 @@
 "outputs": [],
 "source": [
 "dls = ImageDataLoaders.from_folder(path)\n",
-"learn = cnn_learner(dls, resnet18, pretrained=False,\n",
+"learn = vision_learner(dls, resnet18, pretrained=False,\n",
 " loss_func=F.cross_entropy, metrics=accuracy)\n",
 "learn.fit_one_cycle(1, 0.1)"
 ]
@@ -178,7 +178,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(2)"
 ]
 },
@@ -499,7 +499,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1, base_lr=0.1)"
 ]
 },
@@ -509,7 +509,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "lr_min,lr_steep = learn.lr_find(suggest_funcs=(minimum, steep))"
 ]
 },
@@ -528,7 +528,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(2, base_lr=3e-3)"
 ]
 },
@@ -554,7 +554,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fit_one_cycle(3, 3e-3)"
 ]
 },
@@ -598,7 +598,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fit_one_cycle(3, 3e-3)\n",
 "learn.unfreeze()\n",
 "learn.fit_one_cycle(12, lr_max=slice(1e-6,1e-4))"
@@ -634,7 +634,7 @@
 "outputs": [],
 "source": [
 "from fastai.callback.fp16 import *\n",
-"learn = cnn_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
+"learn = vision_learner(dls, resnet50, metrics=error_rate).to_fp16()\n",
 "learn.fine_tune(6, freeze_epochs=3)"
 ]
 },
@@ -295,7 +295,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet18)"
+"learn = vision_learner(dls, resnet18)"
 ]
 },
 {
@@ -366,7 +366,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
+"learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
 "learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)"
 ]
 },
@@ -580,7 +580,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"learn = cnn_learner(dls, resnet18, y_range=(-1,1))"
+"learn = vision_learner(dls, resnet18, y_range=(-1,1))"
 ]
 },
 {
@@ -40,7 +40,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### cnn_learner"
+"### vision_learner"
 ]
 },
 {
@@ -256,7 +256,7 @@
 "1. What is `model_meta`? Try printing it to see what's inside.\n",
 "1. Read the source code for `create_head` and make sure you understand what each line does.\n",
 "1. Look at the output of `create_head` and make sure you understand why each layer is there, and how the `create_head` source created it.\n",
-"1. Figure out how to change the dropout, layer size, and number of layers created by `cnn_learner`, and see if you can find values that result in better accuracy from the pet recognizer.\n",
+"1. Figure out how to change the dropout, layer size, and number of layers created by `vision_learner`, and see if you can find values that result in better accuracy from the pet recognizer.\n",
 "1. What does `AdaptiveConcatPool2d` do?\n",
 "1. What is \"nearest neighbor interpolation\"? How can it be used to upsample convolutional activations?\n",
 "1. What is a \"transposed convolution\"? What is another name for it?\n",
@@ -69,7 +69,7 @@
 "outputs": [],
 "source": [
 "def get_learner(**kwargs):\n",
-" return cnn_learner(dls, resnet34, pretrained=False,\n",
+" return vision_learner(dls, resnet34, pretrained=False,\n",
 " metrics=accuracy, **kwargs).to_fp16()"
 ]
 },
@@ -386,7 +386,7 @@
 "metadata": {},
 "source": [
 "1. What is the equation for a step of SGD, in math or code (as you prefer)?\n",
-"1. What do we pass to `cnn_learner` to use a non-default optimizer?\n",
+"1. What do we pass to `vision_learner` to use a non-default optimizer?\n",
 "1. What are optimizer callbacks?\n",
 "1. What does `zero_grad` do in an optimizer?\n",
 "1. What does `step` do in an optimizer? How is it implemented in the general optimizer?\n",
@@ -47,7 +47,7 @@
 "dls = ImageDataLoaders.from_name_func(\n",
 " path, get_image_files(path), valid_pct=0.2, seed=21,\n",
 " label_func=is_cat, item_tfms=Resize(224))\n",
-"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
+"learn = vision_learner(dls, resnet34, metrics=error_rate)\n",
 "learn.fine_tune(1)"
 ]
 },