mirror of
https://github.com/fastai/fastbook.git
synced 2025-04-05 02:10:48 +00:00
Fix ch09 typos. (#297)
* Fix one typo. * Fix second typo. * Fix additional typos.
This commit is contained in:
parent
b0fd63334b
commit
6e6d913662
@ -8567,7 +8567,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Thinking back to our bear detector, this mirrors the advice that we provided in <<chapter_production>>—it is often a good idea to build a model first and then do your data cleaning, rather than vice versa. The model can help you identify potentially problematic data issues.\n",
|
"Thinking back to our bear detector, this mirrors the advice that we provided in <<chapter_production>>—it is often a good idea to build a model first and then do your data cleaning, rather than vice versa. The model can help you identify potentially problematic data issues.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"It can also help you identifyt which factors influence specific predictions, with tree interpreters."
|
"It can also help you identify which factors influence specific predictions, with tree interpreters."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -8887,7 +8887,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"Remember, a random forest just averages the predictions of a number of trees. And a tree simply predicts the average value of the rows in a leaf. Therefore, a tree and a random forest can never predict values outside of the range of the training data. This is particularly problematic for data where there is a trend over time, such as inflation, and you wish to make predictions for a future time.. Your predictions will be systematically too low.\n",
|
"Remember, a random forest just averages the predictions of a number of trees. And a tree simply predicts the average value of the rows in a leaf. Therefore, a tree and a random forest can never predict values outside of the range of the training data. This is particularly problematic for data where there is a trend over time, such as inflation, and you wish to make predictions for a future time.. Your predictions will be systematically too low.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"But the problem iextends beyond time variables. Random forests are not able to extrapolate outside of the types of data they have seen, in a more general sense. That's why we need to make sure our validation set does not contain out-of-domain data."
|
"But the problem extends beyond time variables. Random forests are not able to extrapolate outside of the types of data they have seen, in a more general sense. That's why we need to make sure our validation set does not contain out-of-domain data."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -9064,7 +9064,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Removing these variables has slightly improved the model's accuracy; but more importantly, it should make it more resilient over time, and easier to maintain and understand. We recommend that for all datasets you try building a model where your dependent variable is `is_valid`, like we did heree. It can often uncover subtle *domain shift* issues that you may otherwise miss.\n",
|
"Removing these variables has slightly improved the model's accuracy; but more importantly, it should make it more resilient over time, and easier to maintain and understand. We recommend that for all datasets you try building a model where your dependent variable is `is_valid`, like we did here. It can often uncover subtle *domain shift* issues that you may otherwise miss.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"One thing that might help in our case is to simply avoid using old data. Often, old data shows relationships that just aren't valid any more. Let's try just using the most recent few years of the data:"
|
"One thing that might help in our case is to simply avoid using old data. Often, old data shows relationships that just aren't valid any more. Let's try just using the most recent few years of the data:"
|
||||||
]
|
]
|
||||||
@ -9810,7 +9810,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"1. Pick a competition on Kaggle with tabular data (current or past) and try to adapt the techniques seen in this chapter to get the best possible results. Compare your results to the private leaderboard.\n",
|
"1. Pick a competition on Kaggle with tabular data (current or past) and try to adapt the techniques seen in this chapter to get the best possible results. Compare your results to the private leaderboard.\n",
|
||||||
"1. Implement the decision tree algorithm in this chapter from scratch yourself, and try it on the datase you used in the first exercise.\n",
|
"1. Implement the decision tree algorithm in this chapter from scratch yourself, and try it on the dataset you used in the first exercise.\n",
|
||||||
"1. Use the embeddings from the neural net in this chapter in a random forest, and see if you can improve on the random forest results we saw.\n",
|
"1. Use the embeddings from the neural net in this chapter in a random forest, and see if you can improve on the random forest results we saw.\n",
|
||||||
"1. Explain what each line of the source of `TabularModel` does (with the exception of the `BatchNorm1d` and `Dropout` layers)."
|
"1. Explain what each line of the source of `TabularModel` does (with the exception of the `BatchNorm1d` and `Dropout` layers)."
|
||||||
]
|
]
|
||||||
|
Loading…
Reference in New Issue
Block a user