Using Azure Machine Learning and SharePoint to predict coffee consumption

Machine Learning has been a hot topic for a little while now. Increased general awareness, improvements in machine learning services and lower costs of entry have made it a more popular topic than ever to discuss and speculate about. Plenty of machine learning implementations already exist out there, too, but for many it is still a bit of a grey area: “I’ve heard that machine learning is great, and that it can do a lot, but how does it really work? How can I really get benefits from it?” To try and shed some light on that, I present to you Dark Roast Ltd., a fictional Helsinki-based software company that used Azure Machine Learning to improve employee satisfaction and productivity and to save money.

Coffee must flow

Dark Roast Ltd. has a problem: Their software developers really love coffee; in fact, they outright need large amounts of it to get their work done. However, their coffee consumption has also fluctuated a lot, leading to a conundrum: Either they can brew coffee for their developers whenever they want it – meaning that no coffee goes to waste, but the developers end up waiting for their caffeine boosts and getting frustrated in the process – or they can brew lots of coffee in advance. That way their developers do get their fix quickly, but Dark Roast’s expenses go through the roof as they end up brewing more of their expensive artisan coffee than actually gets consumed. Neither of these choices is a practical approach for them, so for the longest time Dark Roast Ltd.’s baristas have been brewing their coffee based on pure guesses. And while they’ve been doing a reasonably good job estimating how much coffee gets drunk each day, they suspect that things could be done better.

To solve this problem, Dark Roast Ltd. turned to Azure Machine Learning. For a while now they have tracked both their coffee consumption and the number of people at their premises each day in a SharePoint list on their intranet, and now they are hoping to see if this information could be used to predict coffee consumption, so that they could brew an optimal amount for their developers each day. In their SharePoint list they track six values: five parameters that they expect to affect how much coffee gets drunk, plus the amount of coffee itself (there’s a sketch of the schema right after the list):

  • Time of year, which is either winter, spring, summer, or autumn. The people responsible for stocking coffee beans have noticed that coffee is being drunk more during the dark winter months than in the summer.
  • Time of day, which is either morning or evening. It’s also been noticed that coffee gets drunk more during the mornings.
  • Number of Finnish people at the office. Since Dark Roast Ltd. is a Helsinki-based company, Finns make up the majority of its workforce, and as we Finns are notorious coffee drinkers, they decided to track the number of Finns at the office separately.
  • Number of British people. Dark Roast Ltd. has an office in London and as a result they frequently get visitors from there in their Helsinki offices as well. It’s also estimated that the Brits drink less coffee than other people do, as their beverage of choice is often tea. Because of this preference Brits are tracked separately as well.
  • Number of other nationalities present. Dark Roast Ltd. has a multinational workforce, and they expect that these people drink less coffee than the Finns do – but more than Brits would.
  • Total cups of coffee consumed, which is the value the machine learning model is going to try and predict.
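
To make the schema a bit more concrete, here’s roughly what one row of the list could look like when expressed as a TypeScript interface. The parameter names match the ones used in the prediction request later in this post, but the name of the target column and the sample values are just my own illustration:

// One row of Dark Roast Ltd.'s tracking list. The first five fields are
// the parameters, the last one is the value the model learns to predict.
interface ICoffeeConsumptionRow {
  "Season": "Winter" | "Spring" | "Summer" | "Autumn";
  "Time of day": "Morning" | "Evening";
  "Finnish people": number;
  "British people": number;
  "Other nationalities": number;
  "Cups of coffee": number; // illustrative name for the target column
}

// A couple of made-up sample rows, reflecting the observations above:
// more coffee in the winter, more in the mornings.
const sampleRows: ICoffeeConsumptionRow[] = [
  { "Season": "Winter", "Time of day": "Morning", "Finnish people": 20,
    "British people": 2, "Other nationalities": 5, "Cups of coffee": 42 },
  { "Season": "Summer", "Time of day": "Evening", "Finnish people": 12,
    "British people": 0, "Other nationalities": 3, "Cups of coffee": 11 }
];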

Solving this problem with machine learning involves three general steps:

  1. Training a machine learning model capable of making predictions based on historical data
  2. Deploying the machine learning model somewhere where it can be used
  3. Developing a solution that makes use of the machine learning model

Let’s take a closer look at how and why Dark Roast Ltd. tackled these things the way they did.

1. Training a machine learning model

If you are new to the world of machine learning there is a lot of jargon to get your head around, but by far the single most important word to understand is the model: In layman’s terms, a machine learning model is where all the magic happens. It’s the artifact produced as a result of processing historical data with machine learning algorithms. A model is the golden goose, if you will, the piece of software that produces valuable nuggets of information (or predictions).

There are many ways to go about training machine learning models, but at Dark Roast Ltd. they realized two things: First, they already have lots of in-house Azure expertise, so using Azure Machine Learning sounded like a great starting point for them. Second, while they have lots of skilled developers at hand, they don’t have any full-fledged data scientists with the time to spare for an in-house development project. So, they decided to give Azure ML’s Automated Machine Learning a try. AutoML is a service within Azure Machine Learning that uses machine learning to train machine learning models. ML-ception much?

Using AutoML turns implementing a machine learning solution into more of a data engineering and software development task, whereas traditional model training requires more data science expertise. Of course, whether your machine learning models are being trained by an AI or a human, the underlying processes and algorithms remain the same. As such, the decision on which to choose comes down to one thing: whether a skilled human being, with knowledge of both data science and the business domain at hand, is able to produce better optimized models than an AI would – and whether those models’ improved accuracy is worth the extra investment of hiring such a skilled and sought-after person to do the job.

As mentioned before, Dark Roast Ltd. went with the AutoML route, since tying up data science specialists to write training scripts for an internal software project wasn’t worth it for them. To use AutoML, all you need to get started with is data, and that Dark Roast Ltd. did have. I’ll skip the details of setting up an Azure Machine Learning workspace and how to use AutoML for now, but I’ve attached a copy of Dark Roast Ltd.’s coffee consumption data below as an Excel file so you can play around with it yourself too!

As a process, machine learning works by applying various statistical algorithms to the historical data. The algorithms try to make sense of how much the different values of each parameter affect the end result we are trying to predict. To give you a simple example: if the number of people present at Dark Roast Ltd. remains the same, how much would a change of season affect their morning consumption of coffee? Or how much would a change in the number of Finnish people affect the predictions if everything else stays the same? And so on, and so on.
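
You can get a feel for this kind of sensitivity by asking a trained model what-if questions yourself. Below is a minimal sketch of such an experiment – the endpoint URL is a placeholder, and the payload and response handling mirror the web-part code shown later in this post:

// A what-if experiment against the deployed model: request two predictions
// that differ only in the season, keeping the number of people constant.
const baseInput = {
  "Time of day": "Morning",
  "Finnish people": 20,
  "British people": 2,
  "Other nationalities": 5
};

async function predictCups(season: string): Promise<number> {
  const response = await fetch("https://<your-function-app>.azurewebsites.net/api/azureml-service", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ data: [{ "Season": season, ...baseInput }] })
  });
  // The model's output arrives as a JSON-encoded string, hence the extra parse.
  const body = await response.json();
  return Number(JSON.parse(body)["result"]);
}

(async () => {
  const winterCups = await predictCups("Winter");
  const summerCups = await predictCups("Summer");
  // With everything else held constant, this difference is the model's
  // estimate of how much the season alone matters.
  console.log(`Winter vs. summer difference: ${winterCups - summerCups} cups`);
})();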

AutoML automatically generates something called explanations for the best machine learning model it trains: Explanations give some insight into how the model reaches the conclusions it makes by sorting the various parameters into an order of importance. The more important a parameter is, the more changes in its values affect the predictions the model makes. It’s important to keep in mind that this does not mean that the same order of importance occurs in the real world as well – it’s only the model’s interpretation of the situation! For example, here are the explanations that AutoML generated for Dark Roast Ltd.’s coffee model, with the most important factor on the left and the least important on the right:

As you can see, the time of year is the single most important factor in predicting how much coffee gets consumed at Dark Roast Ltd. – at least if you ask the machine learning model!

2. Deploying the machine learning model somewhere where it can be used

Training a model is definitely the most crucial part of any machine learning project, since that’s where the machine learning itself happens. However, having a model by itself is of no value if it cannot be used. That’s where deployment comes in, making the model available for software to use. There are a lot of different deployment options, although in general they can be bundled into one of two categories: Deploying the model as a part of a bigger software solution or deploying the model as an API for other solutions to use.

Dark Roast Ltd., being specialists in Azure, decided to use the cost-effective option of Azure Functions to host their machine learning model as an API. With Azure Functions, they could deploy the model with running costs of about 10 euros a month. Check out my previous post, How to: Easily deploying Azure Machine Learning models to Azure Functions, to see how they did it! And since the model is separated from the other software, it can be updated on its own without requiring any changes to the solutions that use it.

3. Developing a solution that makes use of the machine learning model

At this point Dark Roast Ltd. has trained a machine learning model and deployed it to Azure, where it sits waiting to be used. The only thing that’s left for them is to provide their coffee specialists with a way to get predictions out of the model. Dark Roast Ltd. is using SharePoint Online for their intranet, so it makes perfect sense for them to develop a SharePoint-based solution that their employees can use for predicting coffee consumption.

In order to do this, they created a SharePoint Framework web-part using the React framework. I’ll leave implementing your own web-part as a personal exercise, since there’s a wealth of great resources on the Internet already, but I’ll provide you with the function that Dark Roast Ltd. used to call their machine learning model. It performs a basic HTTP POST request, but the formatting of the JSON in the request body, and the way the response body is parsed, are specific to Azure Machine Learning’s models.

private onPredictCoffeeClicked(): void {
  // Note: the number fields are compared against undefined on purpose – a
  // plain truthiness check would reject a perfectly valid value of 0 people.
  if (this.state
      && this.state.timeOfDay
      && this.state.timeOfYear
      && this.state.numberOfBritishPeople !== undefined
      && this.state.numberOfFinnishPeople !== undefined
      && this.state.numberOfOtherNationalities !== undefined) {
    // The payload format expected by Azure Machine Learning models:
    // an object with a "data" array of one or more rows to predict on.
    const data = {
      data: [
        {
          "Season": this.state.timeOfYear,
          "Time of day": this.state.timeOfDay,
          "Finnish people": this.state.numberOfFinnishPeople,
          "British people": this.state.numberOfBritishPeople,
          "Other nationalities": this.state.numberOfOtherNationalities
        }
      ]
    };

    // IHttpClientOptions, HttpClient and HttpClientResponse come from
    // the @microsoft/sp-http package.
    const options: IHttpClientOptions = {
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(data)
    };

    this.props.httpClient
      .post('https://...azurewebsites.net/api/azureml-service?code=...',
            HttpClient.configurations.v1, options)
      .then((res: HttpClientResponse): void => {
        res.json().then((result): void => {
          // The model returns its output as a JSON-encoded string, so the
          // response body needs to be parsed a second time here.
          const prediction = Number(JSON.parse(result)["result"]);
          this.setState({
            predictionResult: Math.floor(prediction)
          });
        });
      });
  }
}
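
For reference, the double parsing in the response handler is there because the model returns its predictions as a JSON-encoded string – something along the lines of "{\"result\": [41.6]}", with the value here being purely illustrative – so res.json() first yields that string, and JSON.parse then turns it into an object the numeric prediction can be read from.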

Note! Since Dark Roast Ltd. is using the machine learning model from a browser-based client-side solution, they had to take Cross-Origin Resource Sharing, or CORS, into account, as it would otherwise block requests from the SharePoint web-part to the Azure-based model. To solve this issue, they had to add the domain name of their SharePoint tenant to the CORS settings of the Azure Function app that the machine learning model was deployed to:
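
If you prefer scripting your configuration over clicking through the portal, the same setting can also be applied with the Azure CLI. A minimal sketch – the function app name, resource group and tenant URL below are placeholders:

az functionapp cors add \
  --name coffee-predictor \
  --resource-group dark-roast-rg \
  --allowed-origins https://darkroast.sharepoint.com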

In addition to simply calling the machine learning model for predictions, at Dark Roast Ltd. they also retrieved the model’s mean absolute error (MAE) metric, which gives a rough estimate of how far off a prediction can be. In other words, if the model’s prediction is 100 and the mean absolute error is 10, then the actual value is likely to fall between 90 and 110. Dark Roast Ltd. used this value as a configuration parameter in their web-part to recommend that their coffee specialists brew a few cups extra just in case – because for them it’s better to make a little more coffee than they actually need than to make less than the caffeine-thirsty developers would want to drink. The resulting web-part they made ended up looking like this:
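
As for the “few cups extra” recommendation itself, it boils down to simple arithmetic. Here’s a minimal sketch of one way to implement it – the exact formula is my own reading of the approach, and meanAbsoluteError stands in for the web-part’s configuration property:

// Round the prediction up and add the configured mean absolute error on
// top, so the recommendation errs on the side of "slightly too much
// coffee" rather than "not enough".
function cupsToBrew(prediction: number, meanAbsoluteError: number): number {
  return Math.ceil(prediction) + meanAbsoluteError;
}

// For example: a prediction of 99.4 cups with an MAE of 10
// produces a recommendation to brew 110 cups.
console.log(cupsToBrew(99.4, 10)); // 110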

What’s next?

After they launched the Coffee Pot Predictor app, employee satisfaction and productivity shot through the roof at Dark Roast Ltd., as there was always enough coffee available. No more lines of grumpy developers waiting for their caffeine fixes, and all of that without having to ridiculously overestimate their coffee consumption either! They were still making slightly more coffee than needed, though, so an idea arose to implement automatic re-training of the machine learning model once a week using their new coffee consumption data. This way they could get even more accurate predictions as time went on, without having anyone manually perform the tedious job of exporting the SharePoint list data, converting it to a csv file, re-training the model and finally deploying it to Azure.

But for Dark Roast Ltd., that’s a story for another time. Until next time, see ya! 🙂

