Premise

Complexity is inherent early in the lifecycle of all products and platforms.  Early products include a rich mix of services because custom work is necessary to take the core of a product and make it a full solution.  So it is with big data analytic applications today.  In this research report we provide guidelines for how to budget for the key attributes of Systems of Intelligence, such as accuracy and speed.  These guidelines give business sponsors, IT developers, and operations leaders a common language for agreeing on requirements.

What’s New Since Original Version of the Manifesto

Figure 1: Definition of Systems of Record
Source: © Wikibon

The Biggest Change in Enterprise Applications in 50 Years

Wikibon has argued that Systems of Intelligence represent the biggest change in enterprise applications in five decades.  Conventional wisdom is that Systems of Record only really appeared with the advent of packaged applications in the ’80s.

Rather, we believe that ever since enterprise applications began to be built in the early ’60s, they have focused on improving business process efficiency, just like modern Systems of Record.  Whether these applications ran on time-shared mainframes, Windows clients and Unix servers, or in the cloud via SaaS, they all did essentially the same thing.  Their analytics amounted to historical performance business intelligence and reporting.  It was like steering a ship by looking backwards at its wake.

By contrast, we are very early in the journey to Systems of Intelligence.

Figure 2: The Analytic Data Pipeline is actually two pipelines. One uses data that is typically historical to make better predictions. The other uses a prediction based on the most recent data to drive the best transaction.
Source: © Wikibon

Systems of Intelligence Build on Systems of Record With Near Real-Time Analytics

Systems of Intelligence optimize loyalty by anticipating and influencing the consumer “dialog” without requiring a human customer service or sales representative.  They know what’s going on around customers, both socially and geographically.

Most important, Systems of Intelligence are forward looking.  They are predictive, prescriptive, and proactive in near real-time in order to inform the customer interaction or any other part of the process.  A person may not be involved at all, as with fraud prevention.  There, the decision whether to decline a credit card has to be made within a few hundred milliseconds.

That forward looking intelligence has to be able to act in near real-time across all channels and touch points in order for the application to be effective.

Figure 3: Analytic Data Pipeline in Systems of Record: Traditional ETL pipelines extracted only the data that was needed for a data warehouse and then transformed it for analytic use. This process made the data accessible to business analysts but the pipeline was difficult to change. As a result, the agility of improving the analytics was very low.
Source: © Wikibon

Learning from Pipelines in Systems of Record

Systems of Intelligence build on Systems of Record in most cases.  Only rarely will they replace them.  While the old pipelines will continue to feed traditional reporting, customers need to develop a new and complementary set of pipelines.

In Systems of Record the pipeline was called ETL (Extract, Transform, Load) and its role was to move data from operational systems to analytic systems, primarily for reporting and business intelligence.  In production it ran in batch mode and fed hard-wired reports and OLAP cubes in the data warehouse.  Batch mode meant that the latency of getting the analysis was days, weeks, or months – way too late to drive operational decisions in the transaction.

The pipeline was very difficult to change when business users wanted new information in their reports.  Each change had to go through another development cycle, so the pipeline's agility was severely limited.

Figure 4: The new Analytic Data Pipelines are distinguished from their analog in Systems of Record primarily in speed. Analytics at runtime are applied in near real-time. And updating the predictive model is a continual process, not a periodic development effort.
Source: © Wikibon

Where Customers Should be Investing: Pipelines in Systems of Intelligence are all About Speed and Agility

In Systems of Intelligence the analytic data pipeline has to be faster on both counts: latency and agility.  The latency is much shorter, to the point where the pipeline has to return analytic answers in as little as tens of milliseconds.

On agility, rather than going through a development cycle that could last months, in its most advanced form the pipeline will have to improve its predictions or prescriptions via continuous machine learning.  In other words, it will keep improving, potentially as fast as new transactions come in.
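To make "continuous machine learning" concrete, here is a minimal sketch of a model that takes one small learning step per labeled transaction instead of waiting for a periodic retraining cycle.  It is an illustration under stated assumptions, not a method described in this research: the class name, the choice of logistic regression trained by stochastic gradient descent, and the learning rate are all hypothetical.

```python
import math


class OnlineLogisticModel:
    """Hypothetical example: logistic regression updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, tune per use case

    def predict_proba(self, features):
        """Return the predicted probability that the label is 1."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to keep math.exp from overflowing
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """Take one gradient step from a single observed outcome (0 or 1)."""
        error = self.predict_proba(features) - label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


# Each new transaction both gets scored and, once its outcome is known,
# feeds back into the model.
model = OnlineLogisticModel(n_features=1)
for _ in range(50):
    model.update([1.0], 1)   # outcome observed: positive
    model.update([-1.0], 0)  # outcome observed: negative
```

The design point is that `update` costs about as much as a single prediction, so the model can in principle improve as fast as feedback arrives.  A production system would also need to guard against bad or delayed labels.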

Figure 5: Wikibon has identified 6 key trade-offs that represent “budget items” for building Systems of Intelligence. They don’t operate independently. Tuning one changes others. But this methodology is an effective way for business sponsors and IT operations teams to agree on the SLAs required to support business value.
Source: © Wikibon

What Trade-Offs Customers Should Consider When Deploying Systems of Intelligence

The most important take-away is that building these systems involves a set of trade-offs.  You can consider the collection of trade-offs a type of budget for your project.  Customers can tune different “knobs” that make their choices appropriate for their objectives and their own capabilities.  The knobs or trade-offs aren’t completely independent. Tuning one affects others.  For example, agility, or the speed of improving predictions, affects the revenue or profit that comes from the accuracy of the predictions.

The choices that a Netflix makes are not likely to be the same as those of a mainstream Fortune 1000 company.  Choices involving required skills, development and operational overhead, and existing technology and processes will be very different as Systems of Intelligence move along the adoption curve and reach greater maturity.

Wikibon has identified 6 knobs or trade-offs.

  1. Accuracy of predictions: this corresponds to how much incremental revenue or profit you hope to achieve.  For example, how many additional fraudulent credit card transactions do you hope to catch without creating so many false positives that customers stop using their cards?
  2. Speed of predictions: when serving an ad, for example, deciding how much a bid should be worth has to happen within a window of 10-20ms in order to leave time for the rest of the process.
  3. Speed of improving predictions: as new data comes in that serves as a feedback loop for the effectiveness of the predictions, machine learning can continuously improve the predictive model.

The last 3 trade-offs are the most actionable for mainstream enterprises.

  4. TCO/operational complexity: what a Netflix can manage in terms of the number of moving parts is not something mainstream enterprises are likely to be able to copy.
  5. Development complexity: it wasn’t that long ago that doing analytics in Hadoop meant using MapReduce, something precious few mainstream corporate developers could fathom.  Now there is a plethora of SQL on Hadoop databases to simplify the process. But building an analytic pipeline that involves streaming ingest, data prep and enrichment, filtering, and machine learning still takes a combination of tools that overtaxes mainstream skills.
  6. Existing infrastructure – technology and skills: a LinkedIn that has the technical chops to develop and open source a messaging system like Kafka will be able to build Systems of Intelligence with greater self-sufficiency and with less mature tools than mainstream enterprises.
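One way business sponsors and IT leaders might put the six trade-offs into a shared "budget" is to score each knob twice: once for what the project demands and once for what the organization can deliver or absorb.  The sketch below is purely illustrative.  The knob identifiers, the 1-to-5 scale, and the `find_gaps` helper are assumptions made for this example, not part of Wikibon's methodology.

```python
# Hypothetical identifiers for the six trade-offs listed above.
KNOBS = (
    "accuracy_of_predictions",
    "speed_of_predictions",
    "speed_of_improving_predictions",
    "operational_complexity",
    "development_complexity",
    "existing_infrastructure",
)


def find_gaps(demand, capacity):
    """Return {knob: shortfall} for every knob where demand exceeds capacity.

    demand:   what the proposed system requires on each knob (assumed 1-5 scale)
    capacity: what the organization can deliver or manage on the same scale
    """
    missing = [k for k in KNOBS if k not in demand or k not in capacity]
    if missing:
        raise ValueError(f"every knob needs a score; missing: {missing}")
    return {k: demand[k] - capacity[k] for k in KNOBS if demand[k] > capacity[k]}


# Example: the sponsor asks for very fast predictions, but the team's current
# tooling and skills sit at a moderate level on every knob.
demand = dict.fromkeys(KNOBS, 3)
demand["speed_of_predictions"] = 5
capacity = dict.fromkeys(KNOBS, 3)

gaps = find_gaps(demand, capacity)
```

Any non-empty result is an item the two sides need to negotiate, either by relaxing the requirement or by investing in capacity.  Because the knobs interact, closing one gap may open another, so the check is worth rerunning after every change.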

Having reviewed these trade-offs, let’s look at the customer journey.

Figure 6: Representative applications that reflect the customer journey over time. The applications get more sophisticated as the technology matures and the skills required to build and operate these systems mature.
Source: © Wikibon

Planning Your Customer Journey: Skills and Platform Progress

Once an enterprise adds up the trade-offs it made in the last step, it can evaluate where it can start in the customer journey.  This depends primarily on two factors:

  • skills within the enterprise and…
  • maturity in the underlying technology platform.

Over time, as the underlying technology platform matures, the trade-offs of today will be progressively less constraining and more advanced applications will be possible.  What’s accessible only to leading-edge Fortune 100 enterprises, such as fraud prevention, will be accessible to a far wider community as the technology matures and the relevant skills diffuse.

Figure 7: Airplanes fly on autopilot. Someday our sophisticated applications will be able to manage themselves as well.
Source: © Wikibon

On the technology maturity side, for example, intelligent (autonomic) systems management today is not much more than a research project.  But at some point, it will be accessible to mainstream enterprises.  Machine learning and infrastructure for the Internet of Things have to improve greatly.

For systems management to work as near as possible to “lights out”, the machine learning process has to be able to observe all the end-to-end services and their infrastructure in operation.  At the same time, it also needs to figure out how everything is related.  It’s not enough to light up hundreds of alarms when something goes wrong.  Rather, over time it has to build up a predictive model of how the entire system should operate normally as a baseline.

This predictive model is the crucial piece that can take a blizzard of alerts and alarms and quickly pinpoint which one represents the cause of failures that are cascading downstream, and even back upstream, and which ones are merely symptoms.  Then it has to be able to recommend a remediation or execute it itself.
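The cause-versus-symptom distinction can be illustrated with a deliberately simplified sketch.  It swaps the learned predictive model for a hand-written dependency map, which is an assumption made only for illustration: an alerting service whose own dependencies are all healthy is treated as a likely cause, and any alerting service that sits on top of another alerting service is treated as a symptom.  The function name and the example service names are hypothetical.

```python
def likely_root_causes(depends_on, alerting):
    """Return alerting services with no alerting dependency, sorted by name.

    depends_on: dict mapping each service to the services it calls
    alerting:   iterable of services currently raising alarms
    """
    alerting = set(alerting)
    return sorted(
        service
        for service in alerting
        if not any(dep in alerting for dep in depends_on.get(service, ()))
    )


# Hypothetical three-tier system: web calls api, api calls db.
depends_on = {"web": ["api"], "api": ["db"], "db": []}

# All three tiers are alarming, but only the database has no failing dependency.
causes = likely_root_causes(depends_on, {"web", "api", "db"})
```

This rule only follows failures that cascade in one direction and returns nothing when alerting services depend on each other in a cycle.  Handling failures that propagate back upstream, and learning the dependency map rather than writing it by hand, is where a real predictive model would be needed.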

Internet of Things technology also needs to mature and contribute a crucial building block.  A large scale system may have thousands of services and might be operating in many data centers or even controlling industrial equipment around the world.  The key point is that some parts of the predictive model must operate out near the edges.  They could be in remote data centers or even in locations that capture and filter events going on inside a few devices out at the edge of the network.

The intelligence at the edges has to identify anomalies so that not every event consumes bandwidth and time traveling back to the main data center.  This intelligence gets trickier, however.  Just what constitutes an anomaly changes over time.  The edge intelligence has to learn this in addition to the core predictive model in the main data center.
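As a rough sketch of edge intelligence that keeps relearning what "normal" means, the filter below tracks an exponentially weighted mean and variance of a sensor reading and forwards only readings that fall far outside that moving baseline.  The class name, the smoothing factor, the three-standard-deviation threshold, and the warm-up length are all assumptions chosen for illustration.

```python
class EdgeAnomalyFilter:
    """Hypothetical edge-side filter with a baseline that drifts with the data."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=10):
        self.alpha = alpha          # weight given to the newest reading
        self.threshold = threshold  # standard deviations that count as anomalous
        self.warmup = warmup        # readings to observe before flagging anything
        self.mean = 0.0
        self.variance = 0.0
        self.count = 0

    def observe(self, value):
        """Update the baseline and return True if the value should be forwarded."""
        self.count += 1
        if self.count == 1:
            self.mean = value  # first reading seeds the baseline
            return False

        deviation = value - self.mean
        std = self.variance ** 0.5
        is_anomaly = (
            self.count > self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )

        # Exponentially weighted updates: old behavior fades, so the
        # definition of "normal" follows the data over time.
        self.mean += self.alpha * deviation
        self.variance = (1 - self.alpha) * (
            self.variance + self.alpha * deviation ** 2
        )
        return is_anomaly


# Only readings flagged True would be sent on to the main data center.
edge = EdgeAnomalyFilter()
forwarded = [r for r in [10.0, 10.2] * 10 + [50.0] if edge.observe(r)]
```

Because anomalous readings also update the baseline, a sustained shift eventually stops being flagged.  That matches the idea that the meaning of an anomaly changes over time, but it also means a slow-moving failure could be absorbed as normal, which is one reason the core model in the main data center is still needed.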

Trying to build this type of Intelligent Systems Management service is not realistic today.  But by explaining how it might work and some of the challenges, we can see how much technology has to mature.

It also sheds light on some of the limitations of today’s technology and why each organization has to evaluate its capabilities in order to make the trade-offs that are appropriate for its circumstances.

Action Item

Like any engineering product, Systems of Intelligence are all about trade-offs.  Our latest research seeks to highlight some key trade-offs for which organizations have to budget explicitly.  (See Figure 5 above).

The key take-away is that the business sponsors and the technical leaders responsible for implementing the new systems have to call out each of the trade-offs explicitly and make sure they are in alignment on the required value and necessary investment.  Time, money, and potentially success itself hinge on this alignment.

Further, each constituent must understand that tuning one trade-off can affect many others.  Some customers insist they need real-time predictions when 400 milliseconds would suffice.  That might not sound like much, but the whole technology stack might change, affecting other trade-offs such as operational and development complexity and the ability to leverage more of their existing infrastructure.  In fact, relaxing that speed might make the predictions more accurate, directly improving revenue or profit drivers.