Recipe for Machine Learning Applications: Steps for Continuous Improvement

January 25, 2017 | By George Gilbert

Analysis, Big Data

Premise: SaaS-delivered machine learning (ML) applications can obviate the need for hard-to-find data scientists working on esoteric, custom applications. Three tasks bootstrap the process of getting started with an ML application. Mastering three additional tasks ensures continuous ML improvements.

**Figure 1: Machine learning applications build on an underlying Digital Business Platform. The gray callouts are the additional machine learning-related capabilities.**

Machine learning applications typically affect top- and bottom-line results by driving better decisions. By contrast, systems of record drive operational efficiency by keeping track of transactions. ML applications are also differentiated by the data they accumulate (see Figure 1). Data feedback loops improve application effectiveness by refining performance based on outcomes generated during previous execution periods. In other words, predictions and prescriptions get better with time because the feedback loops continually retrain the machine learning models. As a result, early adopters in a particular ML application category can accumulate more feedback, which leads to better models. Better models confer advantages similar to those of a platform owner with a first-mover advantage: the first mover has the opportunity to marginalize late-arriving competitors.

The ML apps don’t have to be custom-built. Off-the-shelf SaaS applications can serve just as effectively in managing data feedback loops. Once the apps are up and running, enterprises have to ensure they follow a recipe, a cycle of three steps, to achieve and maintain their first-mover advantage or to attack another firm’s first-mover advantage (see Figure 2):

Measure the models’ output to make sure they get better by learning from experience. Harnessing ML model data feedback loops is critical to providing sustainable differentiation relative to competitors deploying similar applications.
Continually add “expertise” to the models to capture more of the nuances of an industry, department, or function over time. Vendors and their enterprise customers have to pay extremely close attention to just how much of this higher-level expertise is shared beyond the experience curve embedded in each enterprise’s data feedback loops.
Continually tighten ML apps’ integration with systems of record. Many decisions are time critical. The tighter the integration, the closer operational decisions can become to autonomous and real-time.

**Figure 2: The steps required to continually measure and retrain ML models’ output, add additional “expertise” to the models, and further integrate the models with systems of record.**

Measure the models’ output to make sure they get better by learning from experience.

Data feedback loops make the machine learning models more accurate once the application is in production. This capability is called model retraining. Retraining adjusts the “knobs,” or weightings, given to each input in the predictive model so that its accuracy continues to improve as it encounters new data. For example, an application predicting cable subscription churn will get more accurate as it learns from its false positives and false negatives.
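One way to check that a churn model is actually learning from experience is to count its false positives and false negatives in each execution period and compare accuracy across periods. The sketch below is illustrative only: the `ConfusionCounts` and `score_period` names and the sample outcomes are hypothetical, not part of any particular ML application.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Tallies of correct and incorrect churn predictions for one period."""
    true_pos: int = 0
    false_pos: int = 0
    true_neg: int = 0
    false_neg: int = 0

    @property
    def accuracy(self) -> float:
        total = self.true_pos + self.false_pos + self.true_neg + self.false_neg
        return (self.true_pos + self.true_neg) / total if total else 0.0


def score_period(predicted_churn, actual_churn) -> ConfusionCounts:
    """Compare predictions with what customers actually did."""
    counts = ConfusionCounts()
    for predicted, actual in zip(predicted_churn, actual_churn):
        if predicted and actual:
            counts.true_pos += 1
        elif predicted and not actual:
            counts.false_pos += 1   # predicted churn, customer stayed
        elif not predicted and actual:
            counts.false_neg += 1   # predicted loyalty, customer left
        else:
            counts.true_neg += 1
    return counts


# Hypothetical outcomes for the same four customers, before and after retraining.
actual = [True, False, True, False]
before = score_period([True, True, False, False], actual)
after = score_period([True, False, True, True], actual)

improved = after.accuracy > before.accuracy
```

If `improved` stays false period after period, the feedback loop is not delivering the benefit the recipe depends on, and the retraining process deserves a closer look.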

Model retraining is typically far more compute-intensive than the process of making the predictions, also called scoring. At its simplest, prediction is based on an equation that produces an answer, such as the likelihood a customer will drop their subscription to a cable service provider. The equation has a number of variables that might include the number of technical support incidents and hours viewed in the last month. In addition, the equation has weightings for each variable that indicate how important each one is in determining whether the customer is going to churn. Running predictions requires less overhead than training because predictions only need values for the number of technical support incidents and hours viewed to get an answer. Training, by contrast, is a much more tedious and compute-intensive task. Training has to take a batch of records and figure out the weightings to assign to tech support incidents and hours viewed. In other words, training involves iteratively adjusting the weightings over and over until the model’s answers line up closely with the real answers in the training data. As a result of this extra overhead, decisions about how often to retrain need to balance time and cost versus improvements in the predictions.
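The difference in cost between scoring and training can be seen in a minimal sketch. It assumes a logistic regression model trained by plain gradient descent, which is one common choice rather than the method any specific vendor uses. The function names, the toy records, and the scaling of hours viewed are all hypothetical.

```python
import math


def predict_churn(weights, bias, features):
    """Scoring: one pass through the equation for a single customer."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(records, labels, learning_rate=0.1, epochs=2000):
    """Training: repeatedly adjust the weightings over the whole batch."""
    n_features = len(records[0])
    n_records = len(records)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(records, labels):
            # How far the model's answer is from the real answer.
            error = predict_churn(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Nudge every weighting in the direction that reduces the error.
        weights = [w - learning_rate * g / n_records
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_records
    return weights, bias


# Hypothetical training batch:
# [support incidents last month, hours viewed last month / 10]
records = [[5, 0.5], [4, 1.0], [0, 6.0], [1, 5.0], [6, 0.2], [0, 8.0]]
labels = [1, 1, 0, 0, 1, 0]  # 1 = customer churned

weights, bias = train(records, labels)
risky = predict_churn(weights, bias, [5, 0.5])
loyal = predict_churn(weights, bias, [0, 7.0])
```

Each call to `predict_churn` evaluates the equation once, while `train` evaluates it for every record in every one of its 2,000 passes. That gap is why retraining frequency is a cost decision.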

Continually add “expertise” to the models to capture more of the nuances of an industry, department, or function over time.

As the application vendor continues to deepen its domain expertise, it needs to continually refine the model not just with new weightings but also with additional factors that improve predictions. For example, an application vendor providing a subscription churn model for cable companies might learn to take into account not just seasonality but also the start and end dates of highly popular television series at one of its cable company customers.
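Adding a factor of this kind amounts to widening the set of inputs the model sees. The sketch below is a hypothetical illustration: the field names, the 30-day window, and the sample dates are assumptions chosen for the example, not details from any vendor’s model.

```python
from datetime import date


def base_features(customer):
    """The original inputs: support incidents and hours viewed."""
    return [customer["support_incidents"], customer["hours_viewed"]]


def features_with_series_calendar(customer, as_of, series_end_dates,
                                  window_days=30):
    """Extend the inputs with a count of popular series that ended recently."""
    recently_ended = sum(
        1 for end in series_end_dates
        if 0 <= (as_of - end).days <= window_days
    )
    return base_features(customer) + [recently_ended]


# Hypothetical customer and series calendar.
customer = {"support_incidents": 2, "hours_viewed": 40}
series_end_dates = [date(2017, 1, 10), date(2016, 10, 1)]

features = features_with_series_calendar(
    customer, as_of=date(2017, 1, 25), series_end_dates=series_end_dates
)
```

Because the new input needs its own weighting, a model extended this way has to be retrained before it can use the extra factor.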

Enterprise customers must have explicit agreements with their application vendors to segregate their data so there is no sharing of customer experience curves. But sometimes it’s hard to segregate deeper industry knowledge from a data feedback loop. For example, accounting for the popularity of television series in churn models might have originated with one cable company customer. But the application vendor might generalize those learnings in the models that serve other cable companies. If a customer doesn’t want its models with richer nuances to be shared with other customers, the customer needs to make sure this is part of its contractual agreement with the vendor. Vendors, for their part, need to ensure they can operate multi-tenant applications where individual customer models start to diverge. This separation can be tricky, but the vendor might be able to have one model that’s common to all customers and then customer-specific ones that run in conjunction with the core one.
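One possible shape for the “common model plus customer-specific models” arrangement is a shared set of weightings with per-tenant adjustments layered on top. The sketch below is an assumption about how that could look, not a description of any vendor’s architecture. The tenant names, weight values, and feature names are hypothetical.

```python
import math

# Weightings shared by every customer of the vendor (hypothetical values).
SHARED_WEIGHTS = {"support_incidents": 0.6, "hours_viewed": -0.05}
SHARED_BIAS = -1.0

# Customer-specific weightings, kept separate per tenant so that one
# customer's learnings are not applied to another customer's predictions.
TENANT_ADJUSTMENTS = {
    "cable_co_a": {"series_recently_ended": 0.8},
    "cable_co_b": {},
}


def churn_probability(tenant_id, features):
    """Combine the shared core model with the tenant's own adjustments."""
    z = SHARED_BIAS
    for name, weight in SHARED_WEIGHTS.items():
        z += weight * features.get(name, 0.0)
    for name, weight in TENANT_ADJUSTMENTS.get(tenant_id, {}).items():
        z += weight * features.get(name, 0.0)
    return 1.0 / (1.0 + math.exp(-z))


features = {"support_incidents": 3, "hours_viewed": 10,
            "series_recently_ended": 1}
score_a = churn_probability("cable_co_a", features)
score_b = churn_probability("cable_co_b", features)
```

Here only `cable_co_a` benefits from the series-calendar factor it originated. Whether the vendor may promote that factor into `SHARED_WEIGHTS` is exactly the question the contract should answer.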

Continually tighten ML apps’ integration with systems of record.

Integrating the ML application with the system of record turns the predictions or prescriptions into operational transactions. Ideally the output of the ML apps should drive operational transactions in systems of record automatically. But if the integration isn’t automatic, there should be a minimum number of steps to achieve that integration, such as a straight file export from the ML app and an import into the system of record.

Decisions that drive operations are becoming more time-sensitive. Credit card authorizations have always been this way. But more traditional applications that integrate systems of record with machine learning are getting more time-sensitive. Ecommerce sites increasingly have to serve a next best offer based on real-time customer behavior, not based on an overnight update of behavior. The Netflix recommender has to update the choices presented to consumers while they are browsing live, not when they return for their next session. As long as humans have to be in the loop in order to bridge an integration gap, applications won’t be able to work together in near real time. As more applications require real-time decisions, ML apps and systems of record will require automated integration. The burden of achieving this type of operation will fall on both types of application vendors. And enterprises will have to engage in more involved proofs of concept and pilots as the systems integration gets more complex.
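Automated integration means a prediction can become a transaction with no human step in between. The sketch below uses an in-memory stand-in for a system of record. The `SystemOfRecord` and `RetentionOffer` classes, the offer code, and the 0.7 threshold are all hypothetical placeholders for whatever interface a real system exposes.

```python
from dataclasses import dataclass, field


@dataclass
class RetentionOffer:
    """An operational transaction triggered by a churn prediction."""
    customer_id: str
    offer_code: str


@dataclass
class SystemOfRecord:
    """In-memory stand-in for the transactional system."""
    transactions: list = field(default_factory=list)

    def post(self, transaction) -> None:
        self.transactions.append(transaction)


def act_on_prediction(customer_id, churn_probability, system, threshold=0.7):
    """Post a retention offer automatically when churn risk is high."""
    if churn_probability >= threshold:
        system.post(RetentionOffer(customer_id, "DISCOUNT_10"))
        return True
    return False


system = SystemOfRecord()
acted_high = act_on_prediction("cust-001", 0.85, system)
acted_low = act_on_prediction("cust-002", 0.20, system)
```

In the file export and import approach, the call to `system.post` would be replaced by a batch hand-off, which is what keeps that style of integration from operating in near real time.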

Action Item: Customers have to change their mental models about how to get value out of the emerging class of machine learning applications. Machine learning applications have the unique property of getting continually more effective as they learn from experience. Enterprise customers must ensure that the combination of their data feedback loops and their machine learning algorithms forms a living model that gets smarter faster than their competitors’ models. Customers must also make ever tighter integration between their systems of record and ML applications a priority because ever smarter decisions will need to be ever faster.

George Gilbert

George Gilbert, lead data & analytics analyst for theCUBE Research. Former Gartner analyst, former lead enterprise software analyst for Credit Suisse First Boston, one of the top investment banks serving the technology sector. Big Data analyst for Gigaom Research. Co-founded Techalphapartners, a consultancy that advised vendors and institutional investors on market development and product strategy. George has led conference panels with prominent thought leaders in cloud infrastructure and big data. He has been profiled on the front page of the Wall Street Journal and published as a guest author in a major overview of the evolution of cloud computing in The Economist. Prior to being an analyst, George was a product manager on Notes at Lotus Development. George received his BA in economics from Harvard University.
