This article is primarily for data practitioners who want to improve the quality of data in their databases and data warehouses. It is also for those whose work impacts data quality (looking at you, software engineers and salespeople) and those whose jobs are affected by poor data quality.
The goal is to give you a framework for thinking about data quality metrics, a shortlist of metrics as well as a longlist, and a process for identifying which metrics your team should use. By the end, you should have a sense of which metrics to track to improve the quality of your data.
If a dataset lands in a warehouse and no one uses it, does it even matter? Data exists to be used, whether it is sales data operationalized in a sales tool, product data used to train a machine learning model, or financial data that informs decisions through business intelligence (BI) dashboards.
The first requirement for data to be used is to, well, have data. The second requirement is to have the literacy to work with data. The third requirement is to have trust in data. If your stakeholders do not trust your data, they will not only refrain from using it now but may also be turned off from data in perpetuity.