diff --git a/01_intro.ipynb b/01_intro.ipynb
index 695dcdb..e8eebe5 100644
--- a/01_intro.ipynb
+++ b/01_intro.ipynb
@@ -2,7 +2,7 @@
  "cells": [
  {
  "cell_type": "code",
- "execution_count": null,
+ "execution_count": 1,
  "metadata": {},
  "outputs": [],
  "source": [
@@ -2188,11 +2188,7 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {
- "pycharm": {
- "name": "#%% md\n"
- }
- },
+ "metadata": {},
  "source": [
  "This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis]((https://ai.stanford.edu/~amaas/data/sentiment/)). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see it does its thing:"
  ]
@@ -2200,11 +2196,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "metadata": {
- "pycharm": {
- "name": "#%%\n"
- }
- },
+ "metadata": {},
  "outputs": [
  {
  "data": {
@@ -2302,29 +2294,21 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {
- "pycharm": {
- "name": "#%% md\n"
- }
- },
+ "metadata": {},
  "source": [
  "> jargon: Tabular: Data that is in the form of a table, such as from a spreadsheet, database, or CSV file. A tabular model is a model which tries to predict one column of a table based on information in other columns of a table."
  ]
  },
  {
  "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "pycharm": {
- "name": "#%%\n"
- }
- },
+ "execution_count": 2,
+ "metadata": {},
  "outputs": [],
  "source": [
  "from fastai2.tabular.all import *\n",
  "path = untar_data(URLs.ADULT_SAMPLE)\n",
  "\n",
- "dls = TabularDataLoaders.from_csv(path/'adult.csv', path, y_names=\"salary\",\n",
+ "dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names=\"salary\",\n",
  " cat_names = ['workclass', 'education', 'marital-status', 'occupation',\n",
  " 'relationship', 'race'],\n",
  " cont_names = ['age', 'fnlwgt', 'education-num'],\n",
@@ -2335,11 +2319,7 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {
- "pycharm": {
- "name": "#%% md\n"
- }
- },
+ "metadata": {},
  "source": [
  "As you see, we had to tell fastai which columns are *categorical* (that is, they contain values that are one of a discrete set of choices, such as `occupation`), versus *continuous* (that is, they contain a number that represents a quantity, such as `age`).\n",
  "\n",
@@ -2349,11 +2329,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "metadata": {
- "pycharm": {
- "name": "#%%\n"
- }
- },
+ "metadata": {},
  "outputs": [
  {
  "data": {
@@ -2407,11 +2383,7 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {
- "pycharm": {
- "name": "#%% md\n"
- }
- },
+ "metadata": {},
  "source": [
  "This model is using the *adult* dataset, from the paper [Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid](https://archive.ics.uci.edu/ml/datasets/adult), which contains some data regarding individuals (like their education, marital status, race, sex, etc.) and whether or not they have an annual income greater than \\$50k. The model is over 80\\% accurate, and took around 30 seconds to train.\n",
  "\n",
@@ -2421,11 +2393,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "metadata": {
- "pycharm": {
- "name": "#%%\n"
- }
- },
+ "metadata": {},
  "outputs": [
  {
  "data": {
@@ -2550,22 +2518,13 @@
  },
  {
  "cell_type": "markdown",
+ "metadata": {},
  "source": [
  "This model is predicting movie ratings on a scale of 0.5 to 5.0 to within around 0.6 average error. Since we're predicting a continuous number, rather than a category, we have to tell fastai what range our target has, using the `y_range` parameter.\n",
  "\n",
  "Although we're not actually using a pretrained model (for the same reason that we didn't for the tabular model), this example shows that fastai lets us use `fine_tune` even in this case (we'll learn how and why this works later in <>). Sometimes it's best to experiment with `fine_tune` versus `fit_one_cycle` to see which works best for your dataset.\n",
  "\n",
  "We can use the same `show_results` call we saw earlier to view a few examples of user and movie IDs, actual ratings, and predictions:"
- ],
- "metadata": {
- "collapsed": false
- }
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "learn.show_results()"
  ]
  },
  {
@@ -2969,17 +2928,8 @@
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
  "version": "3.7.4"
- },
- "pycharm": {
- "stem_cell": {
- "cell_type": "raw",
- "source": [],
- "metadata": {
- "collapsed": false
- }
- }
  }
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
\ No newline at end of file
+}
diff --git a/clean/01_intro.ipynb b/clean/01_intro.ipynb
index c9ad32e..b3f94c3 100644
--- a/clean/01_intro.ipynb
+++ b/clean/01_intro.ipynb
@@ -2,7 +2,7 @@
  "cells": [
  {
  "cell_type": "code",
- "execution_count": null,
+ "execution_count": 1,
  "metadata": {},
  "outputs": [],
  "source": [
@@ -1135,14 +1135,14 @@
  },
  {
  "cell_type": "code",
- "execution_count": null,
+ "execution_count": 2,
  "metadata": {},
  "outputs": [],
  "source": [
  "from fastai2.tabular.all import *\n",
  "path = untar_data(URLs.ADULT_SAMPLE)\n",
  "\n",
- "dls = TabularDataLoaders.from_csv(path/'adult.csv', path, y_names=\"salary\",\n",
+ "dls = TabularDataLoaders.from_csv(path/'adult.csv', path=path, y_names=\"salary\",\n",
  " cat_names = ['workclass', 'education', 'marital-status', 'occupation',\n",
  " 'relationship', 'race'],\n",
  " cont_names = ['age', 'fnlwgt', 'education-num'],\n",