
Data Explosion - What’s Next in Test Data Management?

Over the years, test data set-up has transformed from a small side effort into a key Test Data Management (TDM) supply-chain function. Many factors have played a critical role in this advancement, namely:

  • Data privacy concerns around personally identifiable (PII) and protected health (PHI) information
  • Heterogeneous application interfaces
  • New servers and appliances
  • Reduced storage costs
  • High data synchronization costs
  • Agile processes
  • Business complexity
  • New databases
  • Emerging technologies
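The first driver above, data privacy, typically means PII and PHI must be masked or synthesized before records reach a test environment. A minimal Python sketch of deterministic masking, assuming an illustrative patient record with `name` and `ssn` fields (the field names and masking rules are my own examples, not from a specific TDM product):

```python
import hashlib
import random

def mask_record(record: dict) -> dict:
    """Return a copy of the record with PII fields masked."""
    masked = dict(record)
    # Deterministic pseudonym: the same input always maps to the same
    # token, so joins across participating applications stay consistent.
    masked["name"] = "TEST-" + hashlib.sha256(record["name"].encode()).hexdigest()[:8]
    # Format-preserving shuffle of the SSN digits (illustrative only;
    # real tools use format-preserving encryption or synthetic values).
    digits = [d for d in record["ssn"] if d.isdigit()]
    random.Random(record["ssn"]).shuffle(digits)
    masked["ssn"] = "{}{}{}-{}{}-{}{}{}{}".format(*digits)
    return masked

original = {"name": "Jane Doe", "ssn": "123-45-6789"}
print(mask_record(original))
```

The deterministic seed matters for the synchronization problem mentioned later: if two applications mask the same source record independently, they still produce matching test values.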

With the rise of social media platforms, IoT devices, autonomous cars and cybersecurity, the volume and variety of data generated every day amounts to a massive data explosion.

Recommended reading: How Much Data is Generated Each Day?

In today’s culture of immediacy, enterprises must be ready to meet new-age testing demands with the right set of people and technologies. With all that said, one must fully understand the cause before looking for a solution.

What is driving the demand for test data in an enterprise?

Fundamentally, there are three vectors that drive the demand for, and variety of, test data as an enterprise progresses. They are commonly known as the “3 P’s” of an enterprise: platforms, programs and projects, as shown in Fig. 1 below.

Innovative solutions are the backbone of today’s enterprises, helping grow revenue and market share in a competitive environment. New programs are launched every day to address the needs of customers, business partners, regulators and others.

Each of these programs is managed through projects, which give birth to new or enhanced platforms, applications and products. An increase in any one of these vectors requires a new class of test data in order to build a quality product.

Successful product development depends on good quality control and assurance processes. The availability of proper test data is a key component of testing, where 15% to 25% of the project cost may be spent creating test data that can be synchronized across participating applications. New product development practices will need test data at such a scale that even self-service and on-demand fulfillment will not be able to keep up with the volume and speed required.

In 3 to 5 years, I envision a transformed system where test data will be ubiquitous. An automated operation will fulfill data needs with a “pull” model powered by robust machine learning algorithms — rather than today’s “push” model.

This will be enabled by predictive models, artificial intelligence, natural language processing (NLP) and recursive automation. In short, the way forward will be to combine the brain of Netflix, to identify what is needed, with the speed of Amazon Prime, to fulfill it.
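One way to picture the shift from “push” to “pull”: instead of testers filing tickets and waiting for a provisioning team to push data out, test suites declare the data shape they need and a service fulfills it on demand. A toy Python sketch of that pull contract, where the class and field names are my own illustration rather than any real product API:

```python
from dataclasses import dataclass, field

@dataclass
class DataRequest:
    """What a test suite declares it needs, rather than what it is given."""
    entity: str                      # e.g. "customer"
    count: int                       # how many records the test needs
    constraints: dict = field(default_factory=dict)  # e.g. {"country": "DE"}

class TestDataService:
    """Toy pull-model fulfillment: tests ask, the service supplies."""
    def __init__(self, catalog: dict):
        self.catalog = catalog       # pre-masked records keyed by entity

    def fulfill(self, req: DataRequest) -> list:
        pool = self.catalog.get(req.entity, [])
        matches = [r for r in pool
                   if all(r.get(k) == v for k, v in req.constraints.items())]
        # A predictive layer could pre-stage likely requests here;
        # this sketch simply filters an existing pool.
        return matches[:req.count]

catalog = {"customer": [
    {"id": 1, "country": "DE", "status": "active"},
    {"id": 2, "country": "US", "status": "active"},
    {"id": 3, "country": "DE", "status": "closed"},
]}
svc = TestDataService(catalog)
print(svc.fulfill(DataRequest("customer", 1, {"country": "DE", "status": "active"})))
```

The machine-learning piece envisioned above would sit in front of `fulfill`, predicting which requests are likely and staging data before any test asks for it.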

In our next installment, we will look at some of these techniques.