
Wikibon's 2018 Data Scientist and Edge Analytics Predictions

Premise

Explosive growth in AI-enhanced capabilities will shift data science work away from data scientists wrangling data and hand-coding experiments and toward shepherding the output of intelligent software, pushing advanced analytical work down the corporate food chain in the process. At the same time, mega-AI initiatives like IBM Watson are finding themselves obsolete as their best AI tools are rapidly duplicated in the open source community.

The fascination with AI and the edge in their many forms is catalyzing an array of new enterprise expectations and behaviors, some of which are working, many of which are not. 2018 will be a year of modest edge and AI returns and a lot of learning in most enterprises. Nonetheless, a number of success patterns are starting to emerge, including:

  • Embedded AI first; then general-purpose AI (maybe).
  • Data science tooling improves, but not so fast that data scientists stop being crucial.
  • Network costs remain a central variable in distributed edge and AI applications.
  • AI and edge will be a multi-source proposition; IBM Watson, for example, cannot succeed as a single-source provider of AI services.

Wikibon Prediction

The application of advanced analytics, data science, and even machine learning is rapidly becoming commonplace as embedded functionality in both operational and analytical software. Salesforce will be the leader in embedded AI and AI toolkits in 2018, pushing its overall revenue beyond $11B.

Salesforce’s Einstein tools will be used initially for predictive models on data within Salesforce applications. Typically this involves customer data, Chatter, email, and e-commerce activity, but during 2018 we will see expanding use of Einstein through Salesforce’s AppExchange for other purposes, in particular IoT signals.

Wikibon Prediction

The general availability of ML and AI tools will tend to demystify the practice and diminish the lofty position that data scientists have occupied for a short time. However, organizations engaged in true data science and in research & development of fundamental AI algorithms will continue to drive the progression of the discipline. The 80% data scientist (80% wrangling data, 20% doing data science) will see a near reversal in 2018.

The demand for advanced analytics of all kinds, especially on big data, quickly outstripped the supply of data scientists qualified to build and test quantitative models. The market responded with better tools that partially automate wrangling and modeling. It will take time for this changing landscape to take effect, but by 2020 we see the role of the data scientist diminishing.
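As a concrete illustration of what "partially automating wrangling" looks like, consider the minimal sketch below. It is a hypothetical pipeline, not any specific vendor's product: the DataFrame `df`, the target column `churn`, and the helper `build_pipeline` are all assumptions for illustration. Steps that once consumed most of a data scientist's time (imputation, encoding, scaling) are inferred from column types and declared once, using standard scikit-learn components.

```python
# A minimal sketch of automated data wrangling, assuming a pandas
# DataFrame `df` with a binary target column named "churn" (both
# hypothetical). Per-project hand-coded wrangling is replaced by
# transformers inferred from column dtypes.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_pipeline(df: pd.DataFrame, target: str) -> Pipeline:
    features = df.drop(columns=[target])
    numeric = features.select_dtypes(include="number").columns
    categorical = features.select_dtypes(exclude="number").columns

    wrangle = ColumnTransformer([
        # Numeric columns: fill gaps with the median, then standardize.
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric),
        # Categorical columns: fill gaps with the mode, then one-hot encode.
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("encode", OneHotEncoder(handle_unknown="ignore"))]),
         categorical),
    ])
    # Wrangling and modeling are chained, so the "80% wrangling" work
    # becomes a declared, reusable component rather than ad hoc code.
    return Pipeline([("wrangle", wrangle),
                     ("model", LogisticRegression(max_iter=1000))])

# Usage:
#   pipeline = build_pipeline(df, "churn")
#   pipeline.fit(df.drop(columns=["churn"]), df["churn"])
```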

Wikibon Prediction

IoT buzz will give way to the realization that network costs are 3-5X greater than hardware costs; IoT hardware costs will in fact be a secondary consideration in building out an instrumented network. Cisco, IBM, HPE, Dell, and other hardware providers will not be differentiators. In 2018, we see Amazon and Google as leaders in providing communication alternatives to the existing common carriers, with a build-out of the networks by 2020.

Reducing IoT data at the "edge" by normalizing, aggregating, or otherwise eliminating data is the typical response to the enormous cost of transmitting all data back to the big data repository. However, IoT value can't be separated from "edge analytics," which are complex to develop using data science, ML, AI, and vast amounts of data, not samples. The models have to be rendered into compact algorithms expressed as rules, scores, recommendations, and other methods. The tension between minimizing current transmission costs and preserving the option value of derivative uses of the data will be one of the catalysts that focuses attention on, and accelerates efforts toward, establishing conventions and rules for data assets.
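To make the pattern concrete, here is a minimal sketch of a server-trained model rendered as a compact scoring rule on an edge device. The weights, threshold, sensor field names, and the functions `score` and `reduce_batch` are all hypothetical, chosen only to illustrate the "aggregate locally, transmit anomalies" approach described above.

```python
# A minimal sketch of edge data reduction: a trained model has already
# been rendered into compact per-feature weights (the weights, threshold,
# and sensor fields below are hypothetical).
WEIGHTS = {"temp_c": 0.04, "vibration_g": 1.8, "pressure_kpa": -0.01}
BIAS = -2.5
ANOMALY_THRESHOLD = 0.0  # score > 0 flags a reading for transmission

def score(reading: dict) -> float:
    """Linear scoring rule: the on-device rendering of the trained model."""
    return BIAS + sum(WEIGHTS[k] * reading[k] for k in WEIGHTS)

def reduce_batch(readings: list[dict]) -> dict:
    """Aggregate a window of readings; ship summaries plus anomalies only."""
    if not readings:
        return {"count": 0, "means": {}, "anomalies": []}
    anomalies = [r for r in readings if score(r) > ANOMALY_THRESHOLD]
    n = len(readings)
    means = {k: sum(r[k] for r in readings) / n for k in WEIGHTS}
    return {"count": n, "means": means, "anomalies": anomalies}

# Only the output of reduce_batch() crosses the network. If, say, one
# reading per thousand scores as anomalous, transmission volume drops by
# orders of magnitude, while raw data can still be retained locally or
# sampled for model retraining.
```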

Wikibon Prediction

IBM will (internally) reassess its cognitive computing strategy given the dismal performance of Watson; in fact, it may already have. There hasn't been a new release for Watson since 2011. In 2016, Watson had about 500 customers against a forecast of more than 8,000, and 300+ business partners against a projected 4,000+.

IBM made a bet that it could replicate its historical "single-source" relationship in the AI world. Why? Because the quality of AI-based applications is sensitive to data volume, and IBM figured it could accrete data faster than other possible suppliers. However, IBM doesn't have a consumer business like Google and Amazon do, and that has undermined its strategy. As a result, IBM's Watson business has disappointed.

While it is arguable whether Watson was to blame for the very public failure at MD Anderson, other installations are finding its capabilities and systems-integration requirements onerous. Training costs are enormous, and perhaps its greatest flaw is that after an organization spends considerable time and money ($50M+ in the case of MD Anderson) training Watson on one corpus, such as a type of leukemia, Watson is unable to draw inferences across any other corpus, such as coronary artery disease. At the time, IBM engineers developed novel and powerful AI capabilities for Watson, but similar capabilities are now available as open source.

Action Item

Algorithms derived from machine/deep learning can be easily duplicated once deployed. Your competitive advantage lies in building the data architecture that facilitates the rapid creation of AI value, and that requires lots of data. Avoid "managing for scarcity"; do not discard data that appears, on the face of it, to be unnecessary.
