mirror of
https://github.com/fastai/fastbook.git
synced 2025-04-04 18:00:48 +00:00
Update 09_tabular to fastai v2.2.7 (#413)
saleElapsed is now detected as a continuous variable right away.
parent c3ceea7996
commit 8be580737e
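For readers skimming this commit, the heuristic behind the change can be sketched in plain Python. This is an illustrative simplification, not fastai's implementation: `split_cont_cat`, its dict-of-lists input, and the toy column values are all assumptions made for the example. It assumes the split treats float columns, and integer columns with more than `max_card` distinct values, as continuous, and everything else as categorical.

```python
def split_cont_cat(columns, max_card=20, dep_var=None):
    """Hypothetical helper: split column names into (continuous, categorical).

    `columns` maps a column name to a list of values. Assumed rule:
    all-float columns are continuous; all-int columns are continuous only
    when they have more than `max_card` distinct values; the rest are
    categorical. The dependent variable is left out of both lists.
    """
    cont_names, cat_names = [], []
    for name, values in columns.items():
        if name == dep_var:
            continue
        is_float = all(type(v) is float for v in values)
        is_int = all(type(v) is int for v in values)
        if is_float or (is_int and len(set(values)) > max_card):
            cont_names.append(name)
        else:
            cat_names.append(name)
    return cont_names, cat_names


# Toy data (made up for illustration): 50 rows per column.
columns = {
    "saleElapsed": [1_100_000_000 + 86_400 * i for i in range(50)],  # 50 distinct ints
    "ProductSize": ["Small", "Large"] * 25,                          # strings
    "YearMade": [1995, 2000, 2005] * 16 + [1995, 2000],              # 3 distinct ints
    "SalePrice": [10.5] * 50,                                        # dependent variable
}

# A small max_card keeps the toy example readable; the notebook's value may differ.
cont_nn, cat_nn = split_cont_cat(columns, max_card=9, dep_var="SalePrice")
print(cont_nn)
print(cat_nn)
```

Under these assumptions, an integer-typed `saleElapsed` whose cardinality exceeds `max_card` lands in the continuous list without a manual `append`/`remove`, which is what the updated notebook cell checks by displaying `cont_nn`.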
@@ -9366,33 +9366,27 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In this case, however, there's one variable that we absolutely do not want to treat as categorical: the `saleElapsed` variable. A categorical variable cannot, by definition, extrapolate outside the range of values that it has seen, but we want to be able to predict auction sale prices in the future. Therefore, we need to make this a continuous variable:"
+"In this case, there's one variable that we absolutely do not want to treat as categorical: the `saleElapsed` variable. A categorical variable cannot, by definition, extrapolate outside the range of values that it has seen, but we want to be able to predict auction sale prices in the future. Let's verify that `cont_cat_split` did the correct thing."
 ]
 },
 {
 "cell_type": "code",
 "execution_count": 98,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"data": {
+"text/plain": [
+"['saleElapsed']"
+]
+},
+"execution_count": 98,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
 "source": [
-"cont_nn.append('saleElapsed')\n",
-"cat_nn.remove('saleElapsed')"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Also, to use this as a continuous variable, we have to ensure it's of a numeric type:"
-]
-},
-{
-"cell_type": "code",
-"execution_count": 106,
-"metadata": {},
-"outputs": [],
-"source": [
-"df_nn['saleElapsed'] = df_nn['saleElapsed'].astype(int)"
+"cont_nn"
 ]
 },
 {
@@ -9975,7 +9969,7 @@
 "1. What's a good type of plot for showing tree interpreter results?\n",
 "1. What is the \"extrapolation problem\"?\n",
 "1. How can you tell if your test or validation set is distributed in a different way than your training set?\n",
-"1. Why do we make `saleElapsed` a continuous variable, even although it has less than 9,000 distinct values?\n",
+"1. Why do we ensure `saleElapsed` is a continuous variable, even although it has less than 9,000 distinct values?\n",
 "1. What is \"boosting\"?\n",
 "1. How could we use embeddings with a random forest? Would we expect this to help?\n",
 "1. Why might we not always use a neural net for tabular modeling?"
@@ -1153,17 +1153,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"cont_nn.append('saleElapsed')\n",
-"cat_nn.remove('saleElapsed')"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"df_nn['saleElapsed'] = df_nn['saleElapsed'].astype(int)"
+"cont_nn"
 ]
 },
 {
@@ -1375,7 +1365,7 @@
 "1. What's a good type of plot for showing tree interpreter results?\n",
 "1. What is the \"extrapolation problem\"?\n",
 "1. How can you tell if your test or validation set is distributed in a different way than your training set?\n",
-"1. Why do we make `saleElapsed` a continuous variable, even although it has less than 9,000 distinct values?\n",
+"1. Why do we ensure `saleElapsed` is a continuous variable, even although it has less than 9,000 distinct values?\n",
 "1. What is \"boosting\"?\n",
 "1. How could we use embeddings with a random forest? Would we expect this to help?\n",
 "1. Why might we not always use a neural net for tabular modeling?"