Fix tabular example

2025-04-04 18:00:48 +00:00 · 2020-03-27 06:21:49 -07:00 · 5a2aafdfa2
commit 5a2aafdfa2
parent c640215af0
2 changed files with 16 additions and 66 deletions
--- a/01_intro.ipynb
+++ b/01_intro.ipynb
@@ -2,7 +2,7 @@
 "cells": [
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -2188,11 +2188,7 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   },
+   "metadata": {},
   "source": [
    "This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis](https://ai.stanford.edu/~amaas/data/sentiment/). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see how it does its thing:"
   ]
@@ -2200,11 +2196,7 @@
  {
   "cell_type": "code",
   "execution_count": null,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
   "outputs": [
    {
     "data": {
@@ -2302,29 +2294,21 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   },
+   "metadata": {},
   "source": [
    "> jargon: Tabular: Data that is in the form of a table, such as from a spreadsheet, database, or CSV file. A tabular model is a model which tries to predict one column of a table based on information in other columns of a table."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "execution_count": 2,
+   "metadata": {},
   "outputs": [],
   "source": [
    "from fastai2.tabular.all import *\n",
    "path = untar_data(URLs.ADULT_SAMPLE)\n",
    "\n",
-    "dls = TabularDataLoaders.from_csv(path/'adult.csv', path, y_names=\"salary\",\n",
+    "dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names=\"salary\",\n",
    "    cat_names = ['workclass', 'education', 'marital-status', 'occupation',\n",
    "                 'relationship', 'race'],\n",
    "    cont_names = ['age', 'fnlwgt', 'education-num'],\n",
@@ -2335,11 +2319,7 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   },
+   "metadata": {},
   "source": [
    "As you see, we had to tell fastai which columns are *categorical* (that is, they contain values that are one of a discrete set of choices, such as `occupation`), versus *continuous* (that is, they contain a number that represents a quantity, such as `age`).\n",
    "\n",
@@ -2349,11 +2329,7 @@
  {
   "cell_type": "code",
   "execution_count": null,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
   "outputs": [
    {
     "data": {
@@ -2407,11 +2383,7 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   },
+   "metadata": {},
   "source": [
    "This model is using the *adult* dataset, from the paper [Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid](https://archive.ics.uci.edu/ml/datasets/adult), which contains some data regarding individuals (like their education, marital status, race, sex, etc.) and whether or not they have an annual income greater than \\$50k. The model is over 80\\% accurate, and took around 30 seconds to train.\n",
    "\n",
@@ -2421,11 +2393,7 @@
  {
   "cell_type": "code",
   "execution_count": null,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
   "outputs": [
    {
     "data": {
@@ -2550,22 +2518,13 @@
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "This model is predicting movie ratings on a scale of 0.5 to 5.0 to within around 0.6 average error. Since we're predicting a continuous number, rather than a category, we have to tell fastai what range our target has, using the `y_range` parameter.\n",
    "\n",
    "Although we're not actually using a pretrained model (for the same reason that we didn't for the tabular model), this example shows that fastai lets us use `fine_tune` even in this case (we'll learn how and why this works later in <<chapter_pet_breeds>>). Sometimes it's best to experiment with `fine_tune` versus `fit_one_cycle` to see which works best for your dataset.\n",
    "\n",
    "We can use the same `show_results` call we saw earlier to view a few examples of user and movie IDs, actual ratings, and predictions:"
-   ],
-   "metadata": {
-    "collapsed": false
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "learn.show_results()"
   ]
  },
  {
@@ -2969,17 +2928,8 @@
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
-  },
-  "pycharm": {
-   "stem_cell": {
-    "cell_type": "raw",
-    "source": [],
-    "metadata": {
-     "collapsed": false
-    }
-   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
-}
+}
--- a/clean/01_intro.ipynb
+++ b/clean/01_intro.ipynb
@@ -2,7 +2,7 @@
 "cells": [
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -1135,14 +1135,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fastai2.tabular.all import *\n",
    "path = untar_data(URLs.ADULT_SAMPLE)\n",
    "\n",
-    "dls = TabularDataLoaders.from_csv(path/'adult.csv', path, y_names=\"salary\",\n",
+    "dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names=\"salary\",\n",
    "    cat_names = ['workclass', 'education', 'marital-status', 'occupation',\n",
    "                 'relationship', 'race'],\n",
    "    cont_names = ['age', 'fnlwgt', 'education-num'],\n",