fastbook/clean/18_CAM.ipynb

{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"!pip install -Uqq fastbook\n",
"import fastbook\n",
"fastbook.setup_book()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from fastbook import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CNN Interpretation with CAM"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## CAM and Hooks"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = untar_data(URLs.PETS)/'images'\n",
"def is_cat(x): return x[0].isupper()\n",
"dls = ImageDataLoaders.from_name_func(\n",
"    path, get_image_files(path), valid_pct=0.2, seed=21,\n",
"    label_func=is_cat, item_tfms=Resize(224))\n",
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
"learn.fine_tune(1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"img = PILImage.create(image_cat())\n",
"x, = first(dls.test_dl([img]))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class Hook():\n",
"    # Forward-hook callback: keep a detached copy of the module's output\n",
"    def hook_func(self, m, i, o): self.stored = o.detach().clone()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hook_output = Hook()\n",
"hook = learn.model[0].register_forward_hook(hook_output.hook_func)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with torch.no_grad(): output = learn.model.eval()(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"act = hook_output.stored[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"F.softmax(output, dim=-1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls.vocab"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
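"# Weight each of the k activation channels by the last linear layer's weights for each class c\n",
"cam_map = torch.einsum('ck,kij->cij', learn.model[1][-1].weight, act)\n",
"cam_map.shape"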
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x_dec = TensorImage(dls.train.decode((x,))[0][0])\n",
"_,ax = plt.subplots()\n",
"x_dec.show(ctx=ax)\n",
"ax.imshow(cam_map[1].detach().cpu(), alpha=0.6, extent=(0,224,224,0),\n",
"          interpolation='bilinear', cmap='magma');"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hook.remove()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class Hook():\n",
"    # Context manager that registers a forward hook and removes it on exit\n",
"    def __init__(self, m):\n",
"        self.hook = m.register_forward_hook(self.hook_func)\n",
"    def hook_func(self, m, i, o): self.stored = o.detach().clone()\n",
"    def __enter__(self, *args): return self\n",
"    def __exit__(self, *args): self.hook.remove()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with Hook(learn.model[0]) as hook:\n",
"    with torch.no_grad(): output = learn.model.eval()(x.cuda())\n",
"    act = hook.stored"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Gradient CAM"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
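"class HookBwd():\n",
"    # Backward hook: stores the gradient with respect to the module's output.\n",
"    # Recent PyTorch releases deprecate register_backward_hook in favor of register_full_backward_hook.\n",
"    def __init__(self, m):\n",
"        self.hook = m.register_backward_hook(self.hook_func)\n",
"    def hook_func(self, m, gi, go): self.stored = go[0].detach().clone()\n",
"    def __enter__(self, *args): return self\n",
"    def __exit__(self, *args): self.hook.remove()"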
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cls = 1\n",
"with HookBwd(learn.model[0]) as hookg:\n",
"    with Hook(learn.model[0]) as hook:\n",
"        output = learn.model.eval()(x.cuda())\n",
"        act = hook.stored\n",
"    output[0,cls].backward()\n",
"    grad = hookg.stored"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Average each channel's gradient over the spatial dims to get per-channel weights\n",
"w = grad[0].mean(dim=[1,2], keepdim=True)\n",
"cam_map = (w * act[0]).sum(0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"_,ax = plt.subplots()\n",
"x_dec.show(ctx=ax)\n",
"ax.imshow(cam_map.detach().cpu(), alpha=0.6, extent=(0,224,224,0),\n",
"          interpolation='bilinear', cmap='magma');"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with HookBwd(learn.model[0][-2]) as hookg:\n",
"    with Hook(learn.model[0][-2]) as hook:\n",
"        output = learn.model.eval()(x.cuda())\n",
"        act = hook.stored\n",
"    output[0,cls].backward()\n",
"    grad = hookg.stored"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"w = grad[0].mean(dim=[1,2], keepdim=True)\n",
"cam_map = (w * act[0]).sum(0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"_,ax = plt.subplots()\n",
"x_dec.show(ctx=ax)\n",
"ax.imshow(cam_map.detach().cpu(), alpha=0.6, extent=(0,224,224,0),\n",
"          interpolation='bilinear', cmap='magma');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. What is a \"hook\" in PyTorch?\n",
"1. Which layer's outputs does CAM use?\n",
"1. Why does CAM require a hook?\n",
"1. Look at the source code of the `ActivationStats` class and see how it uses hooks.\n",
"1. Write a hook that stores the activations of a given layer in a model (without peeking, if possible).\n",
"1. Why do we call `eval` before getting the activations? Why do we use `no_grad`?\n",
"1. Use `torch.einsum` to compute the \"dog\" or \"cat\" score of each of the locations in the last activation of the body of the model.\n",
"1. How do you check which order the categories are in (i.e., the correspondence of index->category)?\n",
"1. Why are we using `decode` when displaying the input image?\n",
"1. What is a \"context manager\"? What special methods need to be defined to create one?\n",
"1. Why can't we use plain CAM for the inner layers of a network?\n",
"1. Why do we need to register a hook on the backward pass in order to do Grad-CAM?\n",
"1. Why can't we call `output.backward()` when `output` is a rank-2 tensor of output activations per image per class?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Further Research"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Try removing `keepdim` and see what happens. Look up this parameter in the PyTorch docs. Why do we need it in this notebook?\n",
"1. Create a notebook like this one, but for NLP, and use it to find which words in a movie review are most significant in assessing its sentiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}