Data is critical to making strategic decisions, running operations, and performing business functions.
- Healthcare companies derive analytics from clinical and claims data to meet quality measures, improve care, and better manage high-cost and high-risk populations.
- Manufacturing companies rely on performance data to improve efficiency, increase yields, and lower costs.
- Retailers rely on data to predict trends, forecast demand, and optimize pricing.
- Financial services organizations perform advanced data analytics to drive revenue and margins through operational efficiency, risk management, and improved customer intimacy.
All of these scenarios require vast amounts of data. Regardless of industry or company size, nearly every business relies on gathering and leveraging data. Being a data-driven organization is a necessity for gaining a competitive advantage.
IT is uniquely positioned because it has access to the comprehensive set of data that is stored on or passes through the company’s infrastructure. IT therefore carries a responsibility to give end users access to this data and to play a vital role in its effective use.
How often do you hear about business users creating their own spreadsheets and spending countless hours consolidating data from various sources, fixing errors, deduplicating, replacing values, and repeating other manual efforts that do not scale? And what about those reports that show almost the full picture, using almost all of the available data? IT has an opportunity to deliver leadership and guidance around data management within the organization.
Is Your Data Drinkable?
Key challenges to becoming a data-driven organization and achieving actionable insights are related to the types of data, sources of data, and most importantly, quality of data.
Incomplete, missing, outdated, duplicate, and inconsistent data (including typos and spelling mistakes) all create data quality issues. The number one cause of inaccurate data is human error, followed by the lack of internal resources, and an inadequate data strategy.
When thinking of the steps required to access and use data, consider the parallels with a water treatment system:
The EPA sets legal limits on over 90 contaminants in drinking water. The legal limit for a contaminant reflects the level that protects human health and that water treatment systems can achieve using available technology. Raw and untreated water is obtained from an underground aquifer (usually through wells) or from a surface water source, such as a lake or river. It is pumped to a treatment facility where debris and disease-causing microorganisms are removed. When the treatment is complete, water flows out into the community through the distribution system.
In a similar fashion, organizations obtain data from multiple sources and process it by filtering and cleaning until a specific quality threshold is reached. The cleansed data is then filtered for reports or other outputs, at which point it is fit for consumption in a specific setting or scenario.
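To make the analogy concrete, here is a minimal Python sketch of such a “data treatment” step. Everything in it is an assumption made for illustration: the sample records, the field names (`id`, `email`, `updated`), the cutoff date, and the 80% threshold are hypothetical, and the share of raw records that survive treatment is only one rough signal of quality, not a standard measure.

```python
from datetime import date

# Hypothetical records gathered from several sources (illustrative data only).
RAW = [
    {"id": 1, "email": "Ann@Example.com ", "updated": date(2024, 5, 1)},
    {"id": 1, "email": "ann@example.com", "updated": date(2024, 5, 1)},  # duplicate
    {"id": 2, "email": None, "updated": date(2024, 4, 12)},              # missing value
    {"id": 3, "email": "bob@example.com", "updated": date(2019, 1, 3)},  # outdated
    {"id": 4, "email": "cy@example.com", "updated": date(2024, 6, 9)},
]


def treat(records, cutoff):
    """Normalize emails, then drop incomplete, outdated, and duplicate records."""
    seen, clean = set(), []
    for rec in records:
        # Fix inconsistent formatting (stray whitespace, mixed case).
        email = (rec["email"] or "").strip().lower()
        if not email:
            continue  # incomplete: required field is missing
        if rec["updated"] < cutoff:
            continue  # outdated: last update is older than the cutoff
        key = (rec["id"], email)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        clean.append({**rec, "email": email})
    return clean


def surviving_share(clean, raw):
    """Share of raw records that passed treatment (0.0 for empty input)."""
    return len(clean) / len(raw) if raw else 0.0


THRESHOLD = 0.8  # assumed acceptance level, analogous to a legal limit
CLEAN = treat(RAW, cutoff=date(2023, 1, 1))
SCORE = surviving_share(CLEAN, RAW)
FIT_FOR_CONSUMPTION = SCORE >= THRESHOLD

if __name__ == "__main__":
    print(f"{len(CLEAN)} of {len(RAW)} records kept, share={SCORE:.0%}")
    print("fit for consumption" if FIT_FOR_CONSUMPTION else "source needs attention")
```

In this sample only two of the five records survive, so the source falls below the assumed threshold. That is a signal to look upstream at how the data is captured, not only to clean harder.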
For this “data treatment” system to work, organizations need to determine which data they need and how to make it usable.
The effort to improve data quality should never start with just “cleaning” the data; instead, a comprehensive approach is required for organizations that are serious about leveraging data on a continuous basis.
- Becoming a Data-Driven Organization: A 5-Part Framework for Sustainable Data Quality
- Improving Data Quality: The 3 Core Functions of a Data Cleanse