{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"! [ -e /content ] && pip install -Uqq fastbook\n",
"import fastbook\n",
"fastbook.setup_book()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from fastbook import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Other Computer Vision Problems"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multi-Label Classification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fastai.vision.all import *\n",
"path = untar_data(URLs.PASCAL_2007)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(path/'train.csv')\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sidebar: Pandas and DataFrames"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.iloc[:,0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.iloc[0,:]\n",
"# Trailing :s are always optional (in numpy, pytorch, pandas, etc.),\n",
"# so this is equivalent:\n",
"df.iloc[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df['fname']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp_df = pd.DataFrame({'a':[1,2], 'b':[3,4]})\n",
"tmp_df"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp_df['c'] = tmp_df['a']+tmp_df['b']\n",
"tmp_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### End sidebar"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Constructing a DataBlock"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dblock = DataBlock()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dsets = dblock.datasets(df)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(dsets.train),len(dsets.valid)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x,y = dsets.train[0]\n",
"x,y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x['fname']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dblock = DataBlock(get_x = lambda r: r['fname'], get_y = lambda r: r['labels'])\n",
"dsets = dblock.datasets(df)\n",
"dsets.train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_x(r): return r['fname']\n",
"def get_y(r): return r['labels']\n",
"dblock = DataBlock(get_x = get_x, get_y = get_y)\n",
"dsets = dblock.datasets(df)\n",
"dsets.train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_x(r): return path/'train'/r['fname']\n",
"def get_y(r): return r['labels'].split(' ')\n",
"dblock = DataBlock(get_x = get_x, get_y = get_y)\n",
"dsets = dblock.datasets(df)\n",
"dsets.train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),\n",
"                   get_x = get_x, get_y = get_y)\n",
"dsets = dblock.datasets(df)\n",
"dsets.train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"idxs = torch.where(dsets.train[0][1]==1.)[0]\n",
"dsets.train.vocab[idxs]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def splitter(df):\n",
"    train = df.index[~df['is_valid']].tolist()\n",
"    valid = df.index[df['is_valid']].tolist()\n",
"    return train,valid\n",
"\n",
"dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),\n",
"                   splitter=splitter,\n",
"                   get_x=get_x,\n",
"                   get_y=get_y)\n",
"\n",
"dsets = dblock.datasets(df)\n",
"dsets.train[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),\n",
"                   splitter=splitter,\n",
"                   get_x=get_x,\n",
"                   get_y=get_y,\n",
"                   item_tfms = RandomResizedCrop(128, min_scale=0.35))\n",
"dls = dblock.dataloaders(df)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls.show_batch(nrows=1, ncols=3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Binary Cross-Entropy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = vision_learner(dls, resnet18)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x,y = to_cpu(dls.train.one_batch())\n",
"activs = learn.model(x)\n",
"activs.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"activs[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def binary_cross_entropy(inputs, targets):\n",
"    inputs = inputs.sigmoid()\n",
"    return -torch.where(targets==1, inputs, 1-inputs).log().mean()"
]
},
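{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell is not part of the original chapter: it is a minimal sanity check, assuming only PyTorch, that the hand-written `binary_cross_entropy` above agrees with `F.binary_cross_entropy_with_logits` on random activations and 0/1 targets. The toy tensors (`fake_acts`, `fake_targs`) are made up purely for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch, not from the book: compare the manual loss above with PyTorch's built-in version.\n",
"import torch.nn.functional as F\n",
"fake_acts  = torch.randn(4, 20)                 # pretend activations for 4 images, 20 classes\n",
"fake_targs = (torch.rand(4, 20) > 0.5).float()  # pretend one-hot-encoded targets\n",
"binary_cross_entropy(fake_acts, fake_targs), F.binary_cross_entropy_with_logits(fake_acts, fake_targs)"
]
},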
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loss_func = nn.BCEWithLogitsLoss()\n",
"loss = loss_func(activs, y)\n",
"loss"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def say_hello(name, say_what=\"Hello\"): return f\"{say_what} {name}.\"\n",
"say_hello('Jeremy'),say_hello('Jeremy', 'Ahoy!')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"f = partial(say_hello, say_what=\"Bonjour\")\n",
"f(\"Jeremy\"),f(\"Sylvain\")"
]
},
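{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell is not part of the original chapter. fastai's `accuracy_multi` (used with `partial` below) is, roughly, a threshold-then-compare metric; this hedged sketch of the idea shows why it takes `thresh` and `sigmoid` arguments. The name `accuracy_multi_sketch` is ours, chosen so it does not shadow the real metric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of the idea behind accuracy_multi (not the library source verbatim):\n",
"# optionally apply a sigmoid, threshold the activations, then compare elementwise with the 0/1 targets.\n",
"def accuracy_multi_sketch(inp, targ, thresh=0.5, sigmoid=True):\n",
"    if sigmoid: inp = inp.sigmoid()\n",
"    return ((inp > thresh) == targ.bool()).float().mean()"
]
},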
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = vision_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))\n",
"learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.metrics = partial(accuracy_multi, thresh=0.1)\n",
"learn.validate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.metrics = partial(accuracy_multi, thresh=0.99)\n",
"learn.validate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"preds,targs = learn.get_preds()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"accuracy_multi(preds, targs, thresh=0.9, sigmoid=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"xs = torch.linspace(0.05,0.95,29)\n",
"accs = [accuracy_multi(preds, targs, thresh=i, sigmoid=False) for i in xs]\n",
"plt.plot(xs,accs);"
]
},
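{
"cell_type": "markdown",
"metadata": {},
"source": [
"Not part of the original notebook: assuming `xs` and `accs` from the cell above, this reads off the threshold with the highest validation accuracy. Because accuracy changes smoothly with the threshold (as the plot suggests), picking it on the validation set this way is unlikely to latch onto an unrepresentative outlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch, not from the book: pick the best threshold from the sweep above.\n",
"best = int(torch.stack(accs).argmax())\n",
"xs[best], accs[best]"
]
},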
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Regression"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Assemble the Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = untar_data(URLs.BIWI_HEAD_POSE)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"Path.BASE_PATH = path"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path.ls().sorted()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(path/'01').ls().sorted()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"img_files = get_image_files(path)\n",
"def img2pose(x): return Path(f'{str(x)[:-7]}pose.txt')\n",
"img2pose(img_files[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"im = PILImage.create(img_files[0])\n",
"im.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"im.to_thumb(160)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cal = np.genfromtxt(path/'01'/'rgb.cal', skip_footer=6)  # camera calibration for this sequence\n",
"def get_ctr(f):\n",
"    # read the head center (a 3D point in the camera's frame) from the pose file,\n",
"    # then project it to 2D image coordinates using the calibration values\n",
"    ctr = np.genfromtxt(img2pose(f), skip_header=3)\n",
"    c1 = ctr[0] * cal[0][0]/ctr[2] + cal[0][2]\n",
"    c2 = ctr[1] * cal[1][1]/ctr[2] + cal[1][2]\n",
"    return tensor([c1,c2])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"get_ctr(img_files[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"biwi = DataBlock(\n",
"    blocks=(ImageBlock, PointBlock),\n",
"    get_items=get_image_files,\n",
"    get_y=get_ctr,\n",
"    splitter=FuncSplitter(lambda o: o.parent.name=='13'),\n",
"    batch_tfms=aug_transforms(size=(240,320)),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls = biwi.dataloaders(path)\n",
"dls.show_batch(max_n=9, figsize=(8,6))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"xb,yb = dls.one_batch()\n",
"xb.shape,yb.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"yb[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training a Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn = vision_learner(dls, resnet18, y_range=(-1,1))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def sigmoid_range(x, lo, hi): return torch.sigmoid(x) * (hi-lo) + lo"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot_function(partial(sigmoid_range,lo=-1,hi=1), min=-4, max=4)"
]
},
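{
"cell_type": "markdown",
"metadata": {},
"source": [
"Not part of the original chapter: a quick numeric check that `sigmoid_range` (defined two cells above) squashes arbitrary activations strictly into the requested range, here (-1, 1) to match the `y_range` passed to the learner."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch, not from the book: every output lands strictly between lo=-1 and hi=1.\n",
"sigmoid_range(torch.linspace(-4, 4, 9), -1, 1)"
]
},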
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dls.loss_func"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lr = 1e-2\n",
"learn.fine_tune(3, lr)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"math.sqrt(0.0001)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learn.show_results(ds_idx=1, nrows=3, figsize=(6,8))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. How could multi-label classification improve the usability of the bear classifier?\n",
"1. How do we encode the dependent variable in a multi-label classification problem?\n",
"1. How do you access the rows and columns of a DataFrame as if it were a matrix?\n",
"1. How do you get a column by name from a DataFrame?\n",
"1. What is the difference between a `Dataset` and a `DataLoader`?\n",
"1. What does a `Datasets` object normally contain?\n",
"1. What does a `DataLoaders` object normally contain?\n",
"1. What does `lambda` do in Python?\n",
"1. What are the methods to customize how the independent and dependent variables are created with the data block API?\n",
"1. Why is softmax not an appropriate output activation function when using a one-hot-encoded target?\n",
"1. Why is `nll_loss` not an appropriate loss function when using a one-hot-encoded target?\n",
"1. What is the difference between `nn.BCELoss` and `nn.BCEWithLogitsLoss`?\n",
"1. Why can't we use regular accuracy in a multi-label problem?\n",
"1. When is it okay to tune a hyperparameter on the validation set?\n",
"1. How is `y_range` implemented in fastai? (See if you can implement it yourself and test it without peeking!)\n",
"1. What is a regression problem? What loss function should you use for such a problem?\n",
"1. What do you need to do to make sure the fastai library applies the same data augmentation to your input images and your target point coordinates?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Further Research"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Read a tutorial about Pandas DataFrames and experiment with a few methods that look interesting to you. See the book's website for recommended tutorials.\n",
"1. Retrain the bear classifier using multi-label classification. See if you can make it work effectively with images that don't contain any bears, including showing that information in the web application. Try an image with two different kinds of bears. Check whether the accuracy on the single-label dataset is affected by using multi-label classification."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}