cosmetic fixes 02

This commit is contained in:
Brad S 2020-03-02 22:14:35 +08:00
parent ea37cbbdda
commit f20dca5f45

View File

@ -29,7 +29,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The five lines of code we've seen in <<chaptter_intro>> are just one small part of the process of using deep learning in practice. In this chapter, we're going to use a computer vision example to look at the end-to-end process of creating a deep learning application. More specifically: we're going to build a bear classifier! In the process, we'll discuss the capabilities and constraints of deep learning, learn about how to create datasets, look at possible gotchas when using deep learning in practice, and more. Let's start with how you should frame your problem.\n",
"The five lines of code we've seen in <<chaptter_intro>> are just one small part of the process of using deep learning in practice. In this chapter, we're going to use a computer vision example to look at the end-to-end process of creating a deep learning application. More specifically: we're going to build a bear classifier! We want you to pick a different problem and follow the same steps to develop your own classifier. In the process, we'll discuss the capabilities and constraints of deep learning, learn about how to create datasets, look at possible gotchas when using deep learning in practice, and more.\n",
"\n",
"Many of the key points in the previous paragraph apply to an image classifier as much as they apply to the other types of machine learning problem we showed in <<chaptter_intro>>. If you work through a problem similar in key respects to our example problems, we expect you to get excellent results with little code, quickly. \n",
"\n",
"Let's start with how you should frame your problem.\n",
"\n",
"TK: the next section title seems a bit inadequate, let's double check"
]
@ -45,7 +49,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We've seen that deep learning can solve a lot of challenging problems quickly and with little code. However, deep learning isn't magic! We often talk to people who overestimate both the constraints, and the capabilities of deep learning. Both of these can be problems: underestimating the capabilities means that you might not even try things which could be very beneficial; underestimating the constraints might mean that you fail to consider and react to important issues.\n",
"We've seen that deep learning can solve a lot of challenging problems quickly and with little code. As a beginner there's a sweet spot of problems that are similar enough to our example problems that you can very quickly get extremely useful results. However, deep learning isn't magic! The same 5 lines of code won't work on every problem anyone can think of today. Underestimating the constraints and overestimating the capabilities of deep learning may lead to frustratingly poor results. At least until you gain some experience to solve the problems that arise. Overestimating the constraints and underestimating the capabilities of deep learning may mean you do not attempt a solvable problem because you talk yourself out of it. \n",
"\n",
"We often talk to people who overestimate both the constraints, and the capabilities of deep learning. Both of these can be problems: underestimating the capabilities means that you might not even try things which could be very beneficial; underestimating the constraints might mean that you fail to consider and react to important issues.\n",
"\n",
"The best thing to do is to keep an open mind. If you remain open to the possibility that deep learning might solve part of your problem with less data or complexity than you expect, then it is possible to design a process where you can find the specific capabilities and constraints related to your particular problem as you work through the process. This doesn't mean making any risky bets — we will show you how you can gradually roll out models so that they don't create significant risks, and can even backtest them prior to putting them in production."
]
@ -89,7 +95,7 @@
"\n",
"Sometimes, you have to get a bit creative. Maybe you can find some previous machine learning project, such as a Kaggle competition, that is related to your field of interest. Sometimes, you have to compromize. Maybe you can't find the exact data you need for the precise project you have in mind; but you might be able to find something from a similar domain, or measured in a different way, tackling a slightly different problem. Working on these kinds of similar projects will still give you a good understanding of the overall process, and may help you identify other shortcuts, data sources, and so forth.\n",
"\n",
"Especially when you are just starting out with deep learning it's not a good idea to branch out into very different areas to places that deep learning has not been applied to before. That's because if your model does not work at first, you will not know whether it is because you have made a mistake, or if the very problem you are trying to solve is simply not solvable with deep learning. And you won't know where to look to get help. Therefore, it is best at first to start with something where you can find an example online of somebody who has had good results with something that is at least somewhat similar to what you are trying to achieve, or where you can convert your data into a format similar what someone else has used before (such as creating an image from your data). Let's have a look at the state of deep learning, jsut so you know what kinds of things deep learning is good at right now."
"Especially when you are just starting out with deep learning it's not a good idea to branch out into very different areas to places that deep learning has not been applied to before. That's because if your model does not work at first, you will not know whether it is because you have made a mistake, or if the very problem you are trying to solve is simply not solvable with deep learning. And you won't know where to look to get help. Therefore, it is best at first to start with something where you can find an example online of somebody who has had good results with something that is at least somewhat similar to what you are trying to achieve, or where you can convert your data into a format similar what someone else has used before (such as creating an image from your data). Let's have a look at the state of deep learning, specifically so you know what kinds of things deep learning is good at right now."
]
},
{
@ -103,7 +109,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"First things first, let's make sure that deep learning cn be any good at the problem you are considering. In general, here is a summary of the state of deep learning is at the start of 2020. However, things move very fast, and by the time you read this some of these constraints may no longer exist. We will try to keep the book website up-to-date; in addition, a Google search for \"what can AI do now\" there is likely to provide some up-to-date information."
"First things first, let's make sure if deep learning is known to be good at the problem you are considering. In general, here is a summary of the state of deep learning is at the start of 2020. However, things move very fast, and by the time you read this some of these constraints may no longer exist. We will try to keep the book website up-to-date; in addition, a Google search for \"what can AI do now\" is likely to provide some up-to-date information."
]
},
{
@ -117,9 +123,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"There are many domains in which deep learning has not been used to analyse images yet, but those where it has been tried have nearly universally shown that computers can recognise what items are in an image at least as well as people can — even specially trained people, such as radiologists. This is known as *object recognition*. Deep learning is also good at recognizing whereabouts objects in an image are, and can highlight their location and name each found object. This is known as *object detection* (there is also a variant of this we saw in <<chapter_intro>>, where every pixel is categorized based on what kind of object it is part of--this is called *segmentation*). Deep learning algorithms are generally not good at recognizing images that are significantly different in structure or style to those used to train the model. For instance, if there were no black-and-white images in the training data, the model may well do poorly on black-and-white images. If the training data did not contain hand-drawn images then the model will probably do poorly on hand-drawn images. There is no general way to check what types of image are missing in your training set, but we will show in this chapter some ways to try to recognize when unexpected image types arise in the data when the model is being used in production (this is known as checking for *out of domain* data).\n",
"There are many domains in which deep learning has not been used to analyse images yet, but those where it has been tried have nearly universally shown that computers can recognise what items are in an image at least as well as people can — even specially trained people, such as radiologists. This is known as *object recognition*. Deep learning is also good at recognizing whereabouts objects in an image are, and can highlight their location and name each found object. This is known as *object detection* (there is also a variant of this we saw in <<chapter_intro>>, where every pixel is categorized based on what kind of object it is part of--this is called *segmentation*). Deep learning algorithms are generally not good at recognizing images that are significantly different in structure or style to those used to train the model. For instance, if there were no black-and-white images in the training data, the model may do poorly on black-and-white images. If the training data did not contain hand-drawn images then the model will probably do poorly on hand-drawn images. There is no general way to check what types of image are missing in your training set, but we will show in this chapter some ways to try to recognize when unexpected image types arise in the data when the model is being used in production (this is known as checking for *out of domain* data). TK previous chapter showed a parabola overfitting. possibly a good place to show out of domain data outside the sampled parabola?\n",
"\n",
"One major challenge for object detection systems is that image labelling can be slow and expensive. There is a lot of work at the moment going into tools to try to make this labelling faster and more easy, and require less handcrafted labels to train accurate object detection models. One approach which is particularly helpful is to synthetically generate variations of input images, such as by rotating them, or changing their brightness and contrast; this is called *data augmentation* and also works well for text and other types of model. We will be discussing it in detail in this chapter.\n",
"One major challenge for object detection systems is that image labelling can be slow and expensive. There is a lot of work at the moment going into tools to try to make this labelling faster and easier, and require less handcrafted labels to train accurate object detection models. One approach which is particularly helpful is to synthetically generate variations of input images, such as by rotating them, or changing their brightness and contrast; this is called *data augmentation* and also works well for text and other types of model. We will be discussing it in detail in this chapter.\n",
"\n",
"Another point to consider is that although your problem might not look like a computer vision problem, it might be possible with a little imagination to turn it into one. For instance, if what you are trying to classify is sounds, you might try converting the sounds into images of their acoustic waveforms and then training a model on those images."
]
@ -135,11 +141,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like in computer vision, computers are very good at categorising both short and long documents based on categories such as spam, sentiment, author, source website, and so forth. We are not aware of any rigourous work done in this area to compare to human performance, but anecdotally it seems to us that deep learning performance is similar to human performance here. Deep learning is also very good at generating context-appropriate text, such as generating replies to social media posts, and imitating a particular author's style. It is also good at making this content compelling to humans, and has been shown to be even more compelling than human-generated text. However, deep learning is currently not good at generating *correct* responses! We don't currently have a reliable way to, for instance, combine a knowledge base of medical information, along with a deep learning model for generating medically correct natural language responses. This is very dangerous, because it is so easy to create content which appears to a layman to be compelling, but actually is entirely incorrect.\n",
"Just like in computer vision, computers are very good at categorising both short and long documents based on categories such as spam, sentiment (e.g. is the review positive or negative), author, source website, and so forth. We are not aware of any rigourous work done in this area to compare to human performance, but anecdotally it seems to us that deep learning performance is similar to human performance here. Deep learning is also very good at generating context-appropriate text, such as generating replies to social media posts, and imitating a particular author's style. It is also good at making this content compelling to humans, and has been shown to be even more compelling than human-generated text. However, deep learning is currently not good at generating *correct* responses! We don't currently have a reliable way to, for instance, combine a knowledge base of medical information, along with a deep learning model for generating medically correct natural language responses. This is very dangerous, because it is so easy to create content which appears to a layman to be compelling, but actually is entirely incorrect.\n",
"\n",
"Another concern is that context-appropriate, highly compelling responses on social media can be used at massive scale — thousands of times greater than any troll farm previously seen — to spread disinformation, create unrest, and encourage conflict. As a rule of thumb, text generation will always be technologically a bit ahead of the ability of models to recognize automatically generated text. For instance, it is possible to use a model that can recognize artificially generated content to actually improve the generator that creates that content, until the classification model is no longer able to complete its task.\n",
"Another concern is that context-appropriate, highly compelling responses on social media can be used at massive scale — thousands of times greater than any troll farm previously seen — to spread disinformation, create unrest, and encourage conflict. As a rule of thumb, text generation will always be technologically a bit ahead of the ability of models to recognize automatically generated text. In fact, it is possible to use a model that can recognize artificially generated content to actually improve the generator that creates that content, until the classification model is no longer able to complete its task.\n",
"\n",
"Despite these issues, deep learning can be used to translate text from one language to another, summarize long documents into something which can be digested more quickly, find all mentions of a concept of interest, and so forth. Unfortunately, the translation or summary could well include completely incorrect information! However, it is already good enough that many people are using the systems — for instance Google's online translation system (and every other online service we are aware of) is based on deep learning."
"Despite these issues, deep learning can be used to translate text from one language to another, summarize long documents into something which can be digested more quickly, find all mentions of a concept of interest, and many more. Unfortunately, the translation or summary could well include completely incorrect information! However, it is already good enough that many people are using the systems — for instance Google's online translation system (and every other online service we are aware of) is based on deep learning."
]
},
{
@ -155,7 +161,7 @@
"source": [
"The ability of deep learning to combine text and images into a single model is, generally, far better than most people intuitively expect. For example, a deep learning model can be trained on input images, and output captions written in English, and can learn to generate surprisingly appropriate captions automatically for new images! But again, we have the same warning that we discussed in the previous section: there is no guarantee that these captions will actually be correct.\n",
"\n",
"Because of this serious issue we generally recommend that deep learning be used not as a entirely automated process, but as part of a process in which the model and a human user interact closely. This can potentially make humans orders of magnitude more productive than they would be with entirely manual methods, and actually result in more accurate processes than using a human alone. For instance, an automatic system can be used to identify potential strokes directly from CT scans, send a high priority alert to have potential/scans looked at quickly. There is only a three-hour window to treat strokes, so this fast feedback loop could save lives. At the same time, however, all scans could continue to be sent to radiologists in the usual way, so there would be no reduction in human input. Other deep learning models could automatically measure items seen on the scan, and insert those measurements into report, warn the radiologist about findings that they may have missed, and tell the radiologist about other cases which might be relevant."
"Because of this serious issue we generally recommend that deep learning be used not as an entirely automated process, but as part of a process in which the model and a human user interact closely. This can potentially make humans orders of magnitude more productive than they would be with entirely manual methods, and actually result in more accurate processes than using a human alone. For instance, an automatic system can be used to identify potential strokes directly from CT scans, send a high priority alert to have potential/scans looked at quickly. There is only a three-hour window to treat strokes, so this fast feedback loop could save lives. At the same time, however, all scans could continue to be sent to radiologists in the usual way, so there would be no reduction in human input. Other deep learning models could automatically measure items seen on the scan, and insert those measurements into reports, warning the radiologist about findings that they may have missed, and tell the radiologist about other cases which might be relevant."
]
},
{
@ -169,7 +175,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For analysing timeseries and tabular data, deep learning has recently been making great strides. However, deep learning is generally used as part of a ensemble of multiple types of model. If you already have a system that is using random forests or gradient boosting machines (popular tabular modelling tools that we will learn about soon) then switching to, or adding, deep learning may not result in any dramatic improvement. Deep learning does greatly increase the variety of columns that you can include, for example columns containing natural language (e.g. book titles, reviews, etc), and *high cardinality categorical* columns (i.e. something that contains a large number of discrete choices, such as zip code or product id). On the downside, deep learning models generally take longer to train than random forests or gradient boosting machines, although this is changing thanks to libraries such as [RAPIDS](https://rapids.ai/), which provides GPU acceleration for the whole modeling pipeline. We cover the pros and cons of all these methods in detail in <<chapter_tabular>> in this book."
"For analysing timeseries and tabular data, deep learning has recently been making great strides. However, deep learning is generally used as part of an ensemble of multiple types of model. If you already have a system that is using random forests or gradient boosting machines (popular tabular modelling tools that we will learn about soon) then switching to, or adding, deep learning may not result in any dramatic improvement. Deep learning does greatly increase the variety of columns that you can include, for example columns containing natural language (e.g. book titles, reviews, etc), and *high cardinality categorical* columns (i.e. something that contains a large number of discrete choices, such as zip code or product id). On the downside, deep learning models generally take longer to train than random forests or gradient boosting machines, although this is changing thanks to libraries such as [RAPIDS](https://rapids.ai/), which provides GPU acceleration for the whole modeling pipeline. We cover the pros and cons of all these methods in detail in <<chapter_tabular>> in this book."
]
},
{
@ -256,7 +262,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For many types of projects, you may be able to find all the data you need online. The project we'll be completing in this chapter is a *bear detector*. It will discriminate between three types of bear: grizzly, black, and teddy bear. There are many images on the internet of each type of bear we can use. We just need a way to find them and download them. We've provided a tool you can use for this purpose, so you can follow along with this chapter, creating your own image recognition application for whatever kinds of object you're interested in. In the fast.ai course, thousands of students have presented their work on the course forums, displaying everything from Trinidad hummingbird varieties, to Panama bus types, and even an application that helped one student let his fiancee recognize his sixteen cousins during Christmas vacation!"
"For many types of projects, you may be able to find all the data you need online. The project we'll be completing in this chapter is a *bear detector*. It will discriminate between three types of bear: grizzly, black, and teddy bear. There are many images on the Internet of each type of bear we can use. We just need a way to find them and download them. We've provided a tool you can use for this purpose, so you can follow along with this chapter, creating your own image recognition application for whatever kinds of object you're interested in. In the fast.ai course, thousands of students have presented their work on the course forums, displaying everything from Trinidad hummingbird varieties, to Panama bus types, and even an application that helped one student let his fiancee recognize his sixteen cousins during Christmas vacation!"
]
},
{
@ -428,7 +434,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Often when we download files from the internet, there's a few that are corrupt. Let's check:"
"Often when we download files from the Internet, there are a few that are corrupt. Let's check:"
]
},
{
@ -466,7 +472,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sidebar: Getting help in jupyter notebooks"
"### Sidebar: Getting help in Jupyter notebooks"
]
},
{
@ -490,7 +496,7 @@
"```\n",
"It tells us what argument the function accepts (`fns`) then shows us the source code and the file it comes from. Looking at that source code, we can see it applies the function `verify_image` in parallel and only keep the ones for which the result of that function is `False`, which is consistent with the doc string: it finds the images in `fns` that can't be opened.\n",
"\n",
"Here are the commands that are very useful in jupyter notebooks:\n",
"Here are the commands that are very useful in Jupyter notebooks:\n",
"\n",
"- at any point, if you don't remember the exact spelling of a function or argument name, you can press \"tab\" to get suggestions of auto-completion.\n",
"- when inside the parenthesis of a function, pressing \"shift\" and \"tab\" simultaneously will display a window with the signature of the function and a short documentation. Pressing it twice will expand the documentation and pressing it three times will open a full window with the same information at the bottom of your screen.\n",
@ -562,7 +568,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have downloaded and verified of the data that we want to use, we need to turn it into a `DataLoaders` object. `DataLoaders` is a thin class which just stores whatever `DataLoader` objects you pass to it, and makes them available as `train` and `valid` . Although it's a very simple class, it's very important in fastai: it provides the data for your model. The key functionality in `DataLoaders` is provided with just these 4 lines of code (it has some other minor functionality we'll skip over for now):\n",
"Now that we have downloaded and verified the data that we want to use, we need to turn it into a `DataLoaders` object. `DataLoaders` is a thin class which just stores whatever `DataLoader` objects you pass to it, and makes them available as `train` and `valid` . Although it's a very simple class, it's very important in fastai: it provides the data for your model. The key functionality in `DataLoaders` is provided with just these 4 lines of code (it has some other minor functionality we'll skip over for now):\n",
"\n",
"```python\n",
"class DataLoaders(GetAttr):\n",
@ -646,7 +652,7 @@
"item_tfms=Resize(128)\n",
"```\n",
"\n",
"Our images are all different sizes, and this is a problem for deep learning: we don't feed the model one image at a time but several (what we call a *mini-batch*) of them. To group them in a big array (usually called *tensor*) that is going to go through our model, they all need to be of the same size. So we need to add a transform twhich will resize these images to the same size. *item transforms* are pieces of code which run on each individual item, whether it be an image, category, or so forth. fastai includes many predefined transforms; we will use the `Resize` transform here.\n",
"Our images are all different sizes, and this is a problem for deep learning: we don't feed the model one image at a time but several (what we call a *mini-batch*) of them. To group them in a big array (usually called *tensor*) that is going to go through our model, they all need to be of the same size. So we need to add a transform which will resize these images to the same size. *item transforms* are pieces of code which run on each individual item, whether it be an image, category, or so forth. fastai includes many predefined transforms; we will use the `Resize` transform here.\n",
"\n",
"This command has given us a `DataBlock` object. This is like a *template* for creating a `DataLoaders`. We still need to tell fastai the actual source of our data — in this case, the path where the images can be found."
]
@ -784,7 +790,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> note: The second line in this code is a little bit magic, and you absolutely don't have to understand it at this point. So feel free to ignore the entirety of this paragraph! This is for just if you're curious… Showing different randomly varied versions of the same image is not something we normally have to do in deep learning, so it's not something that fastai provides directly. Therefore to draw the picture of data augmentation on the same image, we had to take advantage of fastai's sophisticated customisation features. DataLoader has a method called `get_idx`, which is called to decide which items should be selected next. Normally when we are training, this returns a random permutation of all of the indexes in the dataset. But pretty much everything in fastai can be changed, including how the `get_idx` method is defined, which means we can change how we sample data. So in this case, we are replacing it with a version which always returns the number one. That way, our DataLoader shows the same image again and again! This is a great example of the flexibility that fastai provides. "
"> note: The …get_idxs assignment in this code is a little bit magic, and you absolutely don't have to understand it at this point. So feel free to ignore the entirety of this paragraph! This is just if you're curious… Showing different randomly varied versions of the same image is not something we normally have to do in deep learning, so it's not something that fastai provides directly. Therefore to draw the picture of data augmentation on the same image, we had to take advantage of fastai's sophisticated customisation features. DataLoader has a method called `get_idx`, which is called to decide which items should be selected next. Normally when we are training, this returns a random permutation of all of the indexes in the dataset. But pretty much everything in fastai can be changed, including how the `get_idx` method is defined, which means we can change how we sample data. So in this case, we are replacing it with a version which always returns the number one. That way, our DataLoader shows the same image again and again! This is a great example of the flexibility that fastai provides. "
]
},
{
@ -807,7 +813,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Data augmentation refers to creating random variations of our input data, such that they appear a different, but are not expected to change the meaning of the data. Examples of common data augmentation for images are rotation, flipping, perspective warping, brightness changes, contrast changes, and much more. For natural photo images such as the ones we are using here, there is a standard set of augmentations which we have found work pretty well, and are provided with the get transforms function. Because the images are now all the same size, we can apply these augmentations to an entire batch of them using the GPU, which will save a lot of time. To tell fastai we want to use these transforms to a batch, we use the `batch_tfms` parameter. (Note that's we're not using `RandomResizedCrop` in this example, so you can see the differences more clearly; we're also using double the amount of augmentation compared to the default, for the same reason)."
"Data augmentation refers to creating random variations of our input data, such that they appear different, but are not expected to change the meaning of the data. Examples of common (when appropriate) data augmentation for images are rotation, flipping, perspective warping, brightness changes, contrast changes, and much more. For natural photo images such as the ones we are using here, there is a standard set of augmentations which we have found work pretty well, and are provided with the get transforms function. Because the images are now all the same size, we can apply these augmentations to an entire batch of them using the GPU, which will save a lot of time. To tell fastai we want to use these transforms to a batch, we use the `batch_tfms` parameter. (Note that's we're not using `RandomResizedCrop` in this example, so you can see the differences more clearly; we're also using double the amount of augmentation compared to the default, for the same reason)."
]
},
{
@ -855,7 +861,7 @@
"source": [
"Time to use the same lined of codes as in <<chapter_intro>> to train our bear classifier.\n",
"\n",
"We don't have a lot of data for our pblem (150 pictures of each sort of bear at most), so to train our model, we'll use `RandomResizedCrop` and default `aug_transforms` for our model, on an image size of 224px, which is fairly standard for image classification."
"We don't have a lot of data for our problem (150 pictures of each sort of bear at most), so to train our model, we'll use `RandomResizedCrop` and default `aug_transforms` for our model, on an image size of 224px, which is fairly standard for image classification."
]
},
{
@ -1681,14 +1687,14 @@
"\n",
"There is quite a few upsides to this approach. The initial installation is easier, because you only have to deploy a small GUI application, which connects to the server to do all the heavy lifting. More importantly perhaps, upgrades of that core logic can happen on your server, rather than needing to be distributed to all of your users. Your server can have a lot more memory and processing capacity than most edge devices, and it is far easier to scale those resources if your model becomes more demanding. The hardware that you will have on a server is going to be more standard and more easily supported by fastai and PyTorch, so you don't have to compile your model into a different form.\n",
"\n",
"There are downsides too, of course. Your application will require a network connection, and there will be some latency each time the model is called. It takes a while for a neural network model to run anyway, so this additional network latency may not make a big difference to your users in practice. In fact, since you can use better hardware on the server, the overall latency may even be less! If your application uses sensitive data then your users may be concerned about an approach which sends that data to a remote server, so sometimes privacy considerations will mean that you need to run the model on the edge device. Sometimes this can be avoided by having a *on premise* server, such as inside a company's firewall. Managing the complexity and scaling the server can create additional overhead, whereas if your model runs on the edge devices then each user is bringing their own compute resources, which leads to easier scaling with an increasing number of users (also known as _horizontal scaling_)."
"There are downsides too, of course. Your application will require a network connection, and there will be some latency each time the model is called. It takes a while for a neural network model to run anyway, so this additional network latency may not make a big difference to your users in practice. In fact, since you can use better hardware on the server, the overall latency may even be less! If your application uses sensitive data then your users may be concerned about an approach which sends that data to a remote server, so sometimes privacy considerations will mean that you need to run the model on the edge device. Sometimes this can be avoided by having an *on premise* server, such as inside a company's firewall. Managing the complexity and scaling the server can create additional overhead, whereas if your model runs on the edge devices then each user is bringing their own compute resources, which leads to easier scaling with an increasing number of users (also known as _horizontal scaling_)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> A: I've had a chance to see up close how the mobile ML landscape is changing in my work. We offer an iPhone app that depends on computer vision and for years we ran our own computer vision models in the cloud. This was the only way to do it then since those models needed significant memory and compute resources and took minutes to process. This approach required building not only the models (fun!) but infrastructure to ensure a certain number of \"compute worker machines\" was absolutely always running (scary), that more machines would automatically come online if traffic increased, that there was stable storage for large inputs and outputs, that the iOS app could know and tell the user how their job was doing, etc... Nowadays, Apple provides APIs for converting models to run efficiently on device and most iOS devices have dedicated ML hardware, so we run our new models on device. So, in a few years that strategy has gone from impossible to possible but it's still not easy. In our case it's worth it, for a faster user experiene and to worry less about servers. What works for you will depend, realistically, on the user experience you're trying to create and what you personally find it easy to do. If you really know how to run servers, do it. If you really know how to build native mobile apps, do that. There are many roads up the hill.\n",
"> A: I've had a chance to see up close how the mobile ML landscape is changing in my work. We offer an iPhone app that depends on computer vision and for years we ran our own computer vision models in the cloud. This was the only way to do it then since those models needed significant memory and compute resources and took minutes to process. This approach required building not only the models (fun!) but infrastructure to ensure a certain number of \"compute worker machines\" was absolutely always running (scary), that more machines would automatically come online if traffic increased, that there was stable storage for large inputs and outputs, that the iOS app could know and tell the user how their job was doing, etc... Nowadays, Apple provides APIs for converting models to run efficiently on device and most iOS devices have dedicated ML hardware, so we run our new models on device. So, in a few years that strategy has gone from impossible to possible but it's still not easy. In our case it's worth it, for a faster user experience and to worry less about servers. What works for you will depend, realistically, on the user experience you're trying to create and what you personally find it easy to do. If you really know how to run servers, do it. If you really know how to build native mobile apps, do that. There are many roads up the hill.\n",
"\n",
"Overall, we'd recommend using a simple CPU-based server approach where possible, for as long as you can get away with it. If you're lucky enough to have a very successful application, then you'll be able to justify the investment in more complex deployment approaches at that time.\n",
"\n",
@ -1777,7 +1783,7 @@
"source": [
"One of the biggest challenges in rolling out a model is that your model may change the behaviour of the system it is a part of. For instance, consider YouTube's recommendation system. A couple of years ago Google talked about how they had introduced reinforcement learning (closely related to deep learning, but where your loss function represents a result which could be a long time after an action occurs) to improve their recommendation system. They described how they used an algorithm which made recommendations such that watch time would be optimised.\n",
"\n",
"However, human beings tend to be drawn towards controversial content. This meant that videos about wings like conspiracy theories started to get recommended more and more by the recommendation system. Furthermore, it turns out that the kinds of people that are interested in conspiracy theories are also people that watch a lot of online videos! So, they started to get drawn more and more towards YouTube. The increasing number of conspiracy theorists watching YouTube resulted in the algorithm recommending more and more conspiracy theories and other extremist content, which resulted in more extremists watching videos on YouTube, and more people watching YouTube developing extremist views, which led to the algorithm recommending more extremist content... The system became so out of control that in February 2019 it led the New York Times to run the headline \"YouTube Unleashed a Conspiracy Theory Boom. Can It Be Contained?\"\n",
"However, human beings tend to be drawn towards controversial content. This meant that videos about things like conspiracy theories started to get recommended more and more by the recommendation system. Furthermore, it turns out that the kinds of people that are interested in conspiracy theories are also people that watch a lot of online videos! So, they started to get drawn more and more towards YouTube. The increasing number of conspiracy theorists watching YouTube resulted in the algorithm recommending more and more conspiracy theories and other extremist content, which resulted in more extremists watching videos on YouTube, and more people watching YouTube developing extremist views, which led to the algorithm recommending more extremist content... The system became so out of control that in February 2019 it led the New York Times to run the headline \"YouTube Unleashed a Conspiracy Theory Boom. Can It Be Contained?\"\n",
"\n",
"A helpful exercise prior to rolling out a significant machine learning system is to consider this question: \"what would happen if it went really, really well?\" In other words, what if the predictive power was extremely high, and its ability to influence behaviour was extremely significant? In that case, who would be most impacted? What would the most extreme results potentially look like? How would you know what was really going on?\n",
"\n",
@ -1907,7 +1913,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
"version": "3.7.6"
},
"toc": {
"base_numbering": 1,
@ -1924,5 +1930,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}