2020-03-06 18:19:03 +00:00
{
"cells": [
2020-09-03 22:51:00 +00:00
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
2022-04-25 05:43:24 +00:00
"! [ -e /content ] && pip install -Uqq fastai # upgrade fastai on colab\n",
2020-09-03 22:51:00 +00:00
"import fastbook\n",
"fastbook.setup_book()"
]
},
2020-03-06 18:19:03 +00:00
{
"cell_type": "code",
"execution_count": null,
2020-09-03 22:58:27 +00:00
"metadata": {},
2020-03-06 18:19:03 +00:00
"outputs": [],
"source": [
"#hide\n",
2020-09-03 22:51:00 +00:00
"from fastbook import *"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-18 21:18:08 +00:00
"# ResNets"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-14 12:18:31 +00:00
"## Going Back to Imagenette"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_data(url, presize, resize):\n",
" path = untar_data(url)\n",
" return DataBlock(\n",
" blocks=(ImageBlock, CategoryBlock), get_items=get_image_files, \n",
" splitter=GrandparentSplitter(valid_name='val'),\n",
" get_y=parent_label, item_tfms=Resize(presize),\n",
" batch_tfms=[*aug_transforms(min_scale=0.5, size=resize),\n",
" Normalize.from_stats(*imagenet_stats)],\n",
" ).dataloaders(path, bs=128)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = get_data(URLs.IMAGENETTE_160, 160, 128)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"dls.show_batch(max_n=4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def avg_pool(x): return x.mean((2,3))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def block(ni, nf): return ConvLayer(ni, nf, stride=2)\n",
"def get_model():\n",
" return nn.Sequential(\n",
" block(3, 16),\n",
" block(16, 32),\n",
" block(32, 64),\n",
" block(64, 128),\n",
" block(128, 256),\n",
" nn.AdaptiveAvgPool2d(1),\n",
" Flatten(),\n",
" nn.Linear(256, dls.c))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_learner(m):\n",
" return Learner(dls, m, loss_func=nn.CrossEntropyLoss(), metrics=accuracy\n",
" ).to_fp16()\n",
"\n",
"learn = get_learner(get_model())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn.fit_one_cycle(5, 3e-3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-14 12:18:31 +00:00
"## Building a Modern CNN: ResNet"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-18 21:18:08 +00:00
"### Skip Connections"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ResBlock(Module):\n",
" def __init__(self, ni, nf):\n",
" self.convs = nn.Sequential(\n",
" ConvLayer(ni,nf),\n",
" ConvLayer(nf,nf, norm_type=NormType.BatchZero))\n",
" \n",
" def forward(self, x): return x + self.convs(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _conv_block(ni,nf,stride):\n",
" return nn.Sequential(\n",
" ConvLayer(ni, nf, stride=stride),\n",
" ConvLayer(nf, nf, act_cls=None, norm_type=NormType.BatchZero))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ResBlock(Module):\n",
" def __init__(self, ni, nf, stride=1):\n",
" self.convs = _conv_block(ni,nf,stride)\n",
" self.idconv = noop if ni==nf else ConvLayer(ni, nf, 1, act_cls=None)\n",
" self.pool = noop if stride==1 else nn.AvgPool2d(2, ceil_mode=True)\n",
"\n",
" def forward(self, x):\n",
" return F.relu(self.convs(x) + self.idconv(self.pool(x)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def block(ni,nf): return ResBlock(ni, nf, stride=2)\n",
"learn = get_learner(get_model())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn.fit_one_cycle(5, 3e-3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def block(ni, nf):\n",
" return nn.Sequential(ResBlock(ni, nf, stride=2), ResBlock(nf, nf))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn = get_learner(get_model())\n",
"learn.fit_one_cycle(5, 3e-3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-14 12:18:31 +00:00
"### A State-of-the-Art ResNet"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _resnet_stem(*sizes):\n",
" return [\n",
" ConvLayer(sizes[i], sizes[i+1], 3, stride = 2 if i==0 else 1)\n",
" for i in range(len(sizes)-1)\n",
" ] + [nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"_resnet_stem(3,32,32,64)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ResNet(nn.Sequential):\n",
" def __init__(self, n_out, layers, expansion=1):\n",
" stem = _resnet_stem(3,32,32,64)\n",
" self.block_szs = [64, 64, 128, 256, 512]\n",
" for i in range(1,5): self.block_szs[i] *= expansion\n",
" blocks = [self._make_layer(*o) for o in enumerate(layers)]\n",
" super().__init__(*stem, *blocks,\n",
" nn.AdaptiveAvgPool2d(1), Flatten(),\n",
" nn.Linear(self.block_szs[-1], n_out))\n",
" \n",
" def _make_layer(self, idx, n_layers):\n",
" stride = 1 if idx==0 else 2\n",
" ch_in,ch_out = self.block_szs[idx:idx+2]\n",
" return nn.Sequential(*[\n",
" ResBlock(ch_in if i==0 else ch_out, ch_out, stride if i==0 else 1)\n",
" for i in range(n_layers)\n",
" ])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rn = ResNet(dls.c, [2,2,2,2])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn = get_learner(rn)\n",
"learn.fit_one_cycle(5, 3e-3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-14 12:18:31 +00:00
"### Bottleneck Layers"
2020-03-06 18:19:03 +00:00
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _conv_block(ni,nf,stride):\n",
" return nn.Sequential(\n",
" ConvLayer(ni, nf//4, 1),\n",
" ConvLayer(nf//4, nf//4, stride=stride), \n",
" ConvLayer(nf//4, nf, 1, act_cls=None, norm_type=NormType.BatchZero))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = get_data(URLs.IMAGENETTE_320, presize=320, resize=224)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rn = ResNet(dls.c, [3,4,6,3], 4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
2020-09-03 22:58:27 +00:00
"outputs": [],
2020-03-06 18:19:03 +00:00
"source": [
"learn = get_learner(rn)\n",
"learn.fit_one_cycle(20, 3e-3)"
]
},
2020-04-23 18:24:16 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion"
]
},
2020-03-06 18:19:03 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
2020-03-18 00:34:07 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-18 21:18:08 +00:00
"1. How did we get to a single vector of activations in the CNNs used for MNIST in previous chapters? Why isn't that suitable for Imagenette?\n",
2020-03-18 00:34:07 +00:00
"1. What do we do for Imagenette instead?\n",
2020-05-18 21:18:08 +00:00
"1. What is \"adaptive pooling\"?\n",
"1. What is \"average pooling\"?\n",
2020-03-18 00:34:07 +00:00
"1. Why do we need `Flatten` after an adaptive average pooling layer?\n",
2020-05-18 21:18:08 +00:00
"1. What is a \"skip connection\"?\n",
2020-03-18 00:34:07 +00:00
"1. Why do skip connections allow us to train deeper models?\n",
"1. What does <<resnet_depth>> show? How did that lead to the idea of skip connections?\n",
2020-05-18 21:18:08 +00:00
"1. What is \"identity mapping\"?\n",
"1. What is the basic equation for a ResNet block (ignoring batchnorm and ReLU layers)?\n",
"1. What do ResNets have to do with residuals?\n",
"1. How do we deal with the skip connection when there is a stride-2 convolution? How about when the number of filters changes?\n",
2020-05-19 23:56:41 +00:00
"1. How can we express a 1× 1 convolution in terms of a vector dot product?\n",
2020-05-18 21:18:08 +00:00
"1. Create a `1x1 convolution` with `F.conv2d` or `nn.Conv2d` and apply it to an image. What happens to the `shape` of the image?\n",
2020-03-18 00:34:07 +00:00
"1. What does the `noop` function return?\n",
"1. Explain what is shown in <<resnet_surface>>.\n",
"1. When is top-5 accuracy a better metric than top-1 accuracy?\n",
2020-05-18 21:18:08 +00:00
"1. What is the \"stem\" of a CNN?\n",
"1. Why do we use plain convolutions in the CNN stem, instead of ResNet blocks?\n",
2020-03-18 00:34:07 +00:00
"1. How does a bottleneck block differ from a plain ResNet block?\n",
"1. Why is a bottleneck block faster?\n",
2020-05-18 21:18:08 +00:00
"1. How do fully convolutional nets (and nets with adaptive pooling in general) allow for progressive resizing?"
2020-03-18 00:34:07 +00:00
]
},
2020-03-06 18:19:03 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-14 12:18:31 +00:00
"### Further Research"
2020-03-06 18:19:03 +00:00
]
},
2020-03-18 00:34:07 +00:00
{
"cell_type": "markdown",
"metadata": {},
"source": [
2020-05-18 21:18:08 +00:00
"1. Try creating a fully convolutional net with adaptive average pooling for MNIST (note that you'll need fewer stride-2 layers). How does it compare to a network without such a pooling layer?\n",
2020-05-19 23:56:41 +00:00
"1. In <<chapter_foundations>> we introduce *Einstein summation notation*. Skip ahead to see how this works, and then write an implementation of the 1× 1 convolution operation using `torch.einsum`. Compare it to the same operation using `torch.conv2d`.\n",
2020-05-18 21:18:08 +00:00
"1. Write a \"top-5 accuracy\" function using plain PyTorch or plain Python.\n",
2020-03-18 00:34:07 +00:00
"1. Train a model on Imagenette for more epochs, with and without label smoothing. Take a look at the Imagenette leaderboards and see how close you can get to the best results shown. Read the linked pages describing the leading approaches."
]
},
2020-03-06 18:19:03 +00:00
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
2020-03-17 19:15:55 +00:00
"source": []
2020-03-06 18:19:03 +00:00
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
2020-03-17 19:15:55 +00:00
"nbformat_minor": 4
2020-03-06 18:19:03 +00:00
}