Premise. 

Like most maturing markets, the big data and analytics market is reorganizing around customer problems and moving away from a structure based on technology types. Wikibon does not believe that the big data market is a “winner take all” market. Rather, it will be characterized by the insertion of big data solutions into myriad business processes, where industry and workload expertise will be the key partnership decision factor.

with Ralph Finos, George Gilbert, and Jim Kobielus

The big data and analytics market didn’t see any dramatic changes in leadership in 2017. Some vendors showed growth in the mid-double digits, with others coming in at half that. Wikibon believes these relative growth rates represent a gradual changing of the guard. Early vendors that represented the mix-and-match open source approach continue to have modest success taking their leading-edge customers into wide-scale production. But vendors of increasingly integrated platforms, as well as semi-custom applications, are growing faster and moving up the ranks. We expect to see bigger changes in leadership in next year’s 2019 report.

One of the core trends driving the big data and AI/ML markets is that what worked for sophisticated tech-centric firms is running into challenges with mainstream enterprises (see Wikibon’s Big Data Analytics Trends and Forecast report). We’re seeing that in the numbers (see Table 1 and Figure 1). The fastest growing vendors are ones that have 1) applications or semi-custom applications or 2) platforms that are growing more integrated end-to-end. Note that Table 1 includes hardware, software, and professional services. So the AWS number, for example, includes the EC2 and S3 charges for the software running on that infrastructure. And Accenture’s semi-custom solutions are mostly professional services.

Table 1: 2016 & 2017 Big Data Revenue, Growth Rate, and Market Share by Leading Vendors

Vendors of applications or semi-custom applications include IBM’s industry solutions and Watson applications group, Splunk’s security and IT service management applications, and semi-custom applications from Accenture and Palantir. The platform vendors building more end-to-end integration include Oracle, whose integration enables its DBMS to query native and Hadoop or NoSQL data seamlessly from within the database, and AWS, with the rapid growth in the number and integration of its big data services, including stream processing to ingest or analyze data in motion as well as integrating that data directly into Redshift Spectrum or Athena for SQL access to an S3 data lake-like repository. SAP has only a handful of big data and ML applications, but it has integrated its own version of Hadoop with the data management and advanced analytics in its Leonardo ML tools and HANA SQL DBMS. Microsoft’s growth as a provider of cloud services mirrors that of AWS, although it is still a good deal behind. Cloudera is an interesting case. It started out as a Hadoop vendor but is trying to refashion its products as an integrated platform for machine learning. Its recent success, however, appears to come from early, leading-edge customers who are taking Hadoop into production and building out their deployments. Dell is high on the list not just because of its hardware but because its figures also include VMware and Pivotal.

Figure 1: 2017 Leading Vendors Total Big Data and Analytics Share of Revenue, $34,900 million.

Big Data Market Dynamics

In our companion report, 2018 Wikibon Big Data and Analytics Worldwide Forecast Report 2017-2027, Wikibon discusses in detail the dynamics underlying how enterprises view big data in the context of what they are trying to achieve, and the competitive landscape facing vendors of tools and solutions. Briefly, customers are deploying big data to achieve the following:

  • Achieving strategic outcomes. Customers will assure strategic business outcomes by converging on strategic big data analytics (BDA) solution providers who deliver pre-built applications that incorporate best practices and are rapidly customizable to their unique requirements.
  • Reducing latencies. Customers will increasingly eliminate delays and bottlenecks throughout data, application, and business infrastructures by converging on internal pipelines that incorporate fewer discrete BDA products.
  • Streamlining processes. Customers will reduce complexities by converging on fewer, simpler, more consistent, and more automated BDA development and operations processes that span disparate pipelines, platforms, tasks, and roles.
  • Tightening controls and safeguards. Customers will tighten oversight and compliance by converging on unified governance tools that enforce security, policy, and other guardrails up and down the BDA stack and across disparate pipelines, platforms, and tasks.

Big Data Market Structure and Provider DNA

The challenge for big data providers is to help businesses get to these goals by eliminating the excessive complexity, cumbersome overhead, protracted pipelines, and tendency to gravitate to bespoke applications and solutions.

Determining how best to reach these ends is a significant challenge for users, since solution providers have a wide variety of offerings and strategies in their own DNA. The big data market is made up of several classes of company, each with its own approach to the market.

  • Traditional database and solution software providers such as Oracle and SAP. These suppliers are racing to cross-sell all classes of big data software to existing customers – comprehensive solutions offerings, new big data-oriented offerings, and their vertical expertise.
  • Professional services providers such as IBM and Accenture. When it comes to solving complex digital business problems, enterprises often turn to experienced providers who have the IT-related skills and business know-how to solve critical IT and business-related problems.
  • Analytics and tools software providers such as Informatica, IBM, and SAS. Analytics and tools software providers are migrating their expertise to adapt their current offerings and developing specialized tools to accommodate big data requirements.
  • Traditional infrastructure hardware providers such as Dell and HPE. Traditional hardware suppliers bring storage and compute management skills – a key requirement for workloads characterized by high data volume, velocity, variety, and veracity challenges.
  • Public cloud providers such as AWS and Microsoft Azure. Public cloud providers bring the merits of tool integration, scalability, low-cost sandboxes, and proximity to cloud-based data to the market.
  • Rapidly growing big data software pure plays like Splunk, Cloudera, and Hortonworks. Smaller big data software pure plays are collectively growing faster than the market overall as they address the myriad specialized requirements of big data handling that traditional tools are not meeting.
  • Hardware suppliers like Dell EMC and HPE. These vendors have – by their nature – a narrower view of big data problem solving. However, hardware still matters, and the right hardware solution could contribute to the successful deployment of an effective big data workload.

Software Top 10

In terms of who to pay attention to in the big data market, there are hundreds of players attacking the big data problem in different ways. While considering a large software company with market-leading offerings, a track record, and significant revenue is generally prudent, such a company may not provide the best solution for an enterprise’s unique big data requirements and strategy. However, if size and presence are a consideration, the largest software vendors bear a look (see Table 2 and Figure 2).

Table 2: 2016 & 2017 Big Data Software Revenue, Growth Rate, and Market Share by Leading Vendors

Figure 2: 2017 Leading Software Vendor Share of Software Big Data and Analytics Revenue, $10,800 million.

Among the top 5, Splunk (11% share) continues to be the largest pure play big data software company by a wide margin, shipping horizontal, broadly applicable applications, including security and IT service management. Oracle remains high on the list (9% share) because so much mission-critical data is in its flagship DBMS. As a result, it’s a natural extension for customers to be able to query big data in their Hadoop or NoSQL data stores. IBM’s position (6% share) mostly comes from its success with semi-custom applications from its industry solutions and professional services groups. Watson applications, especially those based on IoT, are among the most strategic solutions the company has taken to market in decades. SAP (5% share) has maintained steady growth by virtue of steady improvements to its platform, including integrating its HANA DBMS with its version of Hadoop and introducing its Leonardo tools for machine learning and advanced analytics. Palantir (4% share) continues to be one of the largest big data application vendors, focusing on custom or semi-custom development work in the Federal Government.

Cloudera (3% share and growing rapidly) has been successful taking its early and most sophisticated customers into production and wider-scale deployment. AWS (3% share and growing) continues its triple-digit growth as its big data and machine learning tools and services grow even faster than the overall platform. SAS (2% share) established the advanced analytics category for both tools and applications and has grown steadily in big data. Microsoft’s Azure (2% share) includes a growing range of services, branching out from HDInsight and the Azure Data Lake Store. Microsoft has been rapidly expanding its machine learning tools and cognitive AI services. Informatica (2% share) continues its steady transition from legacy ETL tools to more modern services for big data integration, quality, and governance. Hortonworks (2% share) is having success taking its most sophisticated customers into production and broader deployments. It is investing heavily in developing streaming data capabilities distinct from the original Hadoop core batch analytics.

Hardware Top 10

When it comes to big data hardware, there are fewer choices – but again the leaders often attack the big data problem in different ways – public cloud, appliances, general purpose converged systems, and/or just plain compute, storage, and network offerings. If the enterprise has the skills to build its own offering – e.g., hosters and cloud providers – the ODM (original design manufacturer) path may be suitable as well. Hardware that SaaS and social media companies consume for big data analytics is included in the ODM figures. The hardware leaders in big data are indicated in Table 3 and Figure 3.

Table 3: 2016 & 2017 Big Data Hardware Revenue, Growth Rate, and Market Share by Leading Vendors

Figure 3: 2017 Leading Vendor Share of Big Data and Analytics Hardware Revenue, $10,500 million.

Services Top 10

Enterprises spend more of their big data budget on professional services than anything else. When it comes to professional services, there are thousands of providers that differ in industry focus, region, business model, delivery approach, skill set, and experience. Enterprises should look for as good a match as possible to their tactical big data challenges as well as their long term strategies. While this market is much more diffuse than software and hardware, the vendors in Table 4 and Figure 4 are the leaders in volume of business executed.

Table 4: 2016 & 2017 Big Data Services Revenue, Growth Rate, and Market Share by Leading Vendors

Figure 4: 2017 Leading Vendor Share of Big Data and Analytics Services Revenue, $13,700 million.
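
The claim that services account for the largest share of spend can be checked against the segment totals quoted in the captions of Figures 2 through 4. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative only and not part of the report. Note that the three segment totals sum to $35,000 million, slightly above the $34,900 million in the Figure 1 caption. The sketch assumes the gap is a rounding difference, which the report does not state.

```python
# Segment totals in millions of USD, taken from the captions of Figures 2-4.
# The dictionary layout and names are illustrative, not from the report.
segment_revenue = {
    "software": 10_800,
    "hardware": 10_500,
    "services": 13_700,
}


def segment_shares(revenue):
    """Return each segment's share of the summed revenue, as a percentage."""
    total = sum(revenue.values())
    return {name: round(100 * value / total, 1) for name, value in revenue.items()}


shares = segment_shares(segment_revenue)
largest = max(shares, key=shares.get)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"largest segment: {largest}")
```

On these caption figures, services come out at roughly 39% of the combined total, with software and hardware at about 31% and 30% respectively.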

Action Item. 

Many different vendors provide the multitude of functions, products, and services that make up the big data solution pipeline. Wikibon does not believe that the big data market is a “winner take all” market. Rather, it will be characterized by the insertion of big data solutions into myriad business processes, where industry and workload expertise will be the key partnership decision factor. Therefore, the best fit for your enterprise’s big data entry point and/or long term strategy may not be a Top 10 provider. So, choose carefully and consider:

  • Product offering quality (i.e., reliability, support, stability, breadth and depth of offerings).

  • Track record for applying their tools and solutions to the specific business requirements for your workload and your industry.

  • Related business process compatibility (i.e., extensibility to related or adjacent applications and processes).

  • Incumbent relationships – especially where you have experienced success in related solutions in the past.

  • An individual enterprise’s risk:reward tolerance or culture for learning (i.e., what’s your appetite for taking a risk, making a mistake, and then recovering or starting over?).

Appendix A: Vendors with Big Data Revenue >$50M

Appendix B: Selected Vendors with Big Data Revenue <$50M