Managing a Virtual Flash World, where Snapshots are King and Knave

Premise

Cataloging and automated policy management are the key enablers of a virtual flash world, where storage snapshots are both King and Knave. Combining cataloging with automated policy management is the only way to enable storage copy reduction in harmony with risk management and compliance. This combination enables and justifies an all-flash data center, makes data available more quickly to the business and other IT functions, and drives greater business and IT productivity and responsiveness.

Introduction

For active data, the future is flash storage. The main advantages of flash are greater IO speed and greater IO density. The main benefits are better application performance, better end-user productivity, and much lower operational costs (OPEX). Figure 1 shows the Wikibon projection for latency (active) storage revenue, split between HDD and flash.

 

Figure 1: Wikibon Latency Storage Revenue Projection by HDD and Flash, 2012-2026
Source: © Wikibon Server SAN & Cloud Research Projects 2015

One of the major ways that flash storage is made more cost effective is by fully exploiting the IO density of flash to share data. The key capability that flash has over HDD is the ease of making logical copies with snapshots. Instead of the full clones required on HDD, which can take hours or days to create, logical, space-efficient, application-consistent or crash-consistent copies can be snapshotted in seconds.
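
To make the distinction concrete, below is a minimal Python sketch of copy-on-write snapshot semantics; it is an illustration of the technique, not any vendor's implementation. A snapshot records only references to existing blocks, so creating it is a metadata operation, while a clone physically duplicates every block.

```python
class Volume:
    """Toy block volume showing why snapshots are cheap and clones are not."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block_id -> data
        self.snapshots = {}          # name -> frozen map of block references

    def snapshot(self, name):
        # Copy-on-write: record references to the current blocks only.
        # Cost is O(metadata), with no data movement -- seconds, not hours.
        self.snapshots[name] = dict(self.blocks)

    def clone(self):
        # Full physical copy: every block is duplicated up front.
        # Cost is O(data) -- on HDD, this is the hours-or-days path.
        return Volume({bid: bytes(data) for bid, data in self.blocks.items()})

    def write(self, block_id, data):
        # New writes land on new data; existing snapshots keep pointing at
        # the old blocks, so only changed blocks consume extra space.
        self.blocks[block_id] = data


vol = Volume({0: b"jan", 1: b"feb"})
vol.snapshot("before-batch")             # instant, space-efficient
vol.write(1, b"mar")                     # production moves on...
print(vol.snapshots["before-batch"][1])  # ...but the snapshot still sees b'feb'
```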

Snapshots are King in a virtual flash world, but snapshots gone wild bring their own risks and challenges. Where are the snapshots? Which one is the latest? Which snapshots can be purged? If a physical copy is deleted, archived, or accidentally destroyed, which logical copies are lost with it? Who owns these logical copies? Has flash storage sharing been optimized?
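
A catalog answers these questions by making every logical copy a queryable record rather than a row in someone's spreadsheet. As a hedged illustration (the field names below are ours, not any product's schema), a minimal snapshot-catalog entry and the queries it supports might look like this:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class SnapshotRecord:
    """Minimal catalog entry for one logical copy (illustrative schema)."""
    snapshot_id: str
    source_volume: str        # where the copy lives
    parent_id: Optional[str]  # provenance: which copy it was taken from
    owner: str                # who is accountable for this logical copy
    purpose: str              # e.g. "recovery", "dev", "data-warehouse"
    created_at: datetime
    retention: timedelta      # drives automated purging

def latest(records, volume):
    """Which one is the latest?"""
    return max((r for r in records if r.source_volume == volume),
               key=lambda r: r.created_at)

def purgeable(records, now):
    """Which snapshots can be purged?"""
    return [r for r in records if now - r.created_at > r.retention]

def dependents(records, physical_id):
    """If a physical copy is destroyed, which logical copies go with it?"""
    return [r for r in records if r.parent_id == physical_id]
```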

The Copy Nightmare

Figure 2: Data Copy and Management Issues
Source: © Wikibon Server SAN & Cloud Research Projects 2015

A traditional data workflow might look something like this:

  • A production system with multiple databases runs on a tier 1 storage platform.
  • Snapshots of data are taken to minimize recovery time, and improve RPO SLAs (reduce the amount of data that can be lost to, for example, a software incident).
  • Full backup data copies are made every week.
  • Clones of certain databases are made for downstream applications in the application suite.
  • Clones of certain databases are made for the data warehouse applications at the end of day and end of month.
  • Clones of certain databases, or subsets of them, are made for the application development team over a weekend or at month-end.
  • A full production copy is made for the quality assurance team over a weekend or at month-end.

At the end of this and other very normal processes, about 10-15 copies of the data are made within an average data center. Management of these copies usually relies on spreadsheets and searching. Figure 2 shows the challenges of multiple copies of data, and the requirement for a catalog to orchestrate and automate snapshot and data copy management.

Modern Optimal Solution

An optimal solution data workflow might look something like this:

  • A production system with multiple databases runs on flash storage in an all-flash data center.
  • Snapshots of data are taken and cataloged every 15 minutes to minimize recovery time and improve RPO SLAs (reducing the amount of data that can be lost to, for example, a software incident); a policy loop of this kind is sketched after this list.
  • Snapshot data copies are made every week, and the deltas sent to another copy onsite and offsite.
  • Snapshots of data required for downstream applications in the application suite can be created instantly and made fully available to those applications in seconds. The exact data flow is cataloged and can be accessed programmatically.
  • Snapshots of all the complete databases can be made available to the data warehouse applications at any time or multiple times a day, and the provenance of the data cataloged and made available to data warehouse end-users.
  • Snapshots of the complete, consistent databases are made and the databases scrubbed and cleaned for development; instant full snapshots of the complete application environment are then published to all members of the application development team within minutes to an hour, with a full record available in the catalog.
  • A snapshot of the application environment is made on demand for the quality assurance team at any time, with a full record available in the catalog.
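
The 15-minute snapshot-and-catalog policy referred to above can be sketched as a simple loop. The storage calls (`create_snapshot`, `delete_snapshot`) are placeholders for whatever API the platform actually exposes; the point is that creation, cataloging, and retirement are all driven by the same automated policy, never done by hand.

```python
import time
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(minutes=15)  # drives the RPO
RETENTION = timedelta(hours=24)            # retention is policy, not guesswork

def create_snapshot(volume):
    """Placeholder for the storage platform's snapshot API."""
    return f"{volume}@{datetime.utcnow():%Y%m%dT%H%M%S}"

def delete_snapshot(snapshot_id):
    """Placeholder for the storage platform's delete API."""

def policy_loop(volume, catalog):
    while True:
        now = datetime.utcnow()
        snap_id = create_snapshot(volume)
        # Catalog the snapshot in the same unit of work that creates it,
        # so no snapshot ever exists that the catalog does not know about.
        catalog.append({"id": snap_id, "volume": volume,
                        "created_at": now, "purpose": "recovery"})
        # The same policy that creates copies also retires them.
        for entry in [e for e in catalog if now - e["created_at"] > RETENTION]:
            delete_snapshot(entry["id"])
            catalog.remove(entry)
        time.sleep(SNAPSHOT_INTERVAL.total_seconds())
```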

At the end of this and other very normal processes, about 3 physical copies of the data are held and managed within an average data center. Management of the logical copies derived from those physical copies is fully automated, ensuring provenance and compliance.

Benefits of Optimal Solution

The benefits to the business and IT are immense.

  • The movement of data has been minimized to that required for onsite and offsite business continuity.
  • The freshness of data for downstream applications has gone from hours or days old to instantly available, with a full record in the catalog.
  • Performance of the downstream applications is enhanced by the use of flash storage.
  • The freshness of data in the data warehouse has gone from weeks or a month old to instantly available, with provenance programmatically available from the catalog.
  • Data warehouse reporting performance is enhanced by the use of flash, reducing time to action from weeks to minutes.
  • Every member of the application development team has access to a full, compliant copy of the latest application environment and can fully test updates to the application suites. Studies have shown a 50% improvement in productivity and a 50% reduction in time to improved application value, driven by the improved quality and quantity of test data made possible by flash storage.
  • In one study, the rejection rate for submissions from the application development team to quality assurance went from 40% to 4%, because both teams were using complete and up-to-date versions of the data, and full testing could be done easily further up the value chain.
  • The number of copies of data goes from 10-15 to 3-4.
  • Many more snapshot copies of the data can be made, enabling better utilization of the data and new applications.
  • The use of the catalog enables automation of the workflows and data management.
  • Full provenance of data is available programmatically to all downstream applications and data warehouses (see the lineage sketch after this list).
  • Because of the performance benefits of all-flash storage, operational teams can be flattened and integrated, with better alignment with the application and business teams.
  • The amount of physical storage is reduced significantly (by a factor of 4 to 5), which justifies the use of much higher-performance all-flash storage.
  • Providing provenance and compliance data for audits is an automated process.
  • The risk of data loss is reduced by integrating the catalog into every aspect of the data workflow.
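
As a final illustration of programmatic provenance, the sketch below walks the `parent_id` links from the hypothetical catalog-entry schema sketched earlier to reconstruct the full lineage of any downstream copy; an auditor, a data-warehouse user, and an automated compliance report all get the same answer from the same query.

```python
def lineage(records, snapshot_id):
    """Walk parent links back to the production source.

    Assumes the illustrative SnapshotRecord schema sketched earlier,
    with snapshot_id and parent_id fields.
    """
    by_id = {r.snapshot_id: r for r in records}
    chain, current = [], by_id.get(snapshot_id)
    while current is not None:
        chain.append(current)
        current = by_id.get(current.parent_id)
    return chain  # e.g. [dev copy, scrubbed copy, production snapshot]

def audit_report(records, snapshot_id):
    """A compliance report becomes a query, not a spreadsheet hunt."""
    return [(r.snapshot_id, r.owner, r.created_at)
            for r in lineage(records, snapshot_id)]
```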

Implementation Challenges

The major implementation challenge is that such a workflow, together with sharing data access across multiple production databases, goes against every instinct of current HDD storage and operational practice. Because of the performance limitations of HDD-based storage, current practice is to clone/copy the data. Current practice holds that production databases need tier 1 storage, and that downstream applications should use separate copies on tier 2 storage. Copies are “good”, and ensure performance. The mindset is that flash should be used as a cache in front of each copy of data, rather than holding all active data in flash.

Cataloging software is not yet fully functional and available across all storage platforms and snapshot implementations. The leading contenders are Catalogic (fully integrated with VMware, containers, NetApp, and IBM storage) and the IBM Spectrum Suite (aka TSM).

Action Item

CIOs and senior management should create a small team of the best and brightest, stand up an optimized all-flash virtual environment with a programmatically integrated catalog in a subset of the data center, and demonstrate the practicality and benefits of this environment to the business and IT.

 
