The business world has a big problem: bad data is wreaking havoc on operations, and as AI becomes more deeply ingrained, the risks are mounting. Bad data is already taking its toll on the bottom line in more ways than one, according to new research from data observability firm Revefi.
The firm’s survey of more than 300 IT directors, data and analytics managers, and other IT professionals found that 40 percent encounter 11 to 100 data incidents per month and 65 percent said bad data delays processes. More than 75 percent said that it is somewhat, very, or extremely difficult to manage data warehouse spend, which is especially problematic now that companies are working to do more with less.
“Data quality and management issues are on the rise, and that’s a costly problem for businesses,” said Sanjay Agrawal, CEO and co-founder of Revefi, in a news release. “It leaves them spending too much time manually identifying root causes of data issues, and it creates delays, wastes money, leads to poor decision-making, and reduces customer trust and the accuracy of AI models.”
“Every business wants to be data-driven, yet there’s a lack of trust in data,” added Shashank Gupta, CTO and co-founder of Revefi, in the release. “Data teams typically lack the tools that they need to understand if they have a data problem and identify root causes quickly to fix issues and move their businesses forward.”
Organizations across sectors spend far too much time finding and resolving data incidents
More than a quarter of those surveyed said detecting most data incidents takes up to eight hours. A tenth of the group said identification can take days or even more than a week. In addition to the time it requires, manually identifying problems is also enormously resource-intensive, as finding root causes often requires the involvement of multiple people across teams.
That’s just uncovering the source of the problem
Then you need to fix it—yet 43 percent of survey respondents said it takes more than 48 hours to resolve a data incident after discovering it.
Half of the respondents from the manufacturing sector and 60 percent of IT professionals in education admitted that it takes them more than 48 hours to resolve data incidents. A full 100 percent of respondents in energy, oil & gas said they typically must dedicate more than 48 hours to resolve a situation stemming from bad data.
Bad data and other data challenges like cost management can have adverse consequences
Fifty-eight percent of IT professionals surveyed said that data quality and cleanliness are the most significant challenges that they face when working with data. Nearly as many (57 percent) revealed that they have encountered inaccurate data.
The same share (57 percent) said bad data has led to poor decision-making. Nearly as many (56 percent) said they believe that bad data reduces the accuracy of AI models, which is particularly concerning given how heavily AI models have been used in recent months.
Half of respondents said that managing their data warehouse spend is difficult. Bad data can also erode data users’ trust and work against company efforts to build a data culture. Indeed, 40 percent of respondents said that they believe bad data reduces customer trust.
Data quality is critical to AI model training and to ensuring ethical AI development
As the adoption of AI grows and more organizations rely on AI models to automate more decisions and processes, the need for high-quality data takes on even greater importance.
Against that backdrop, it’s troubling that 43 percent said they have experienced negative consequences due to poor data quality in AI projects. It’s also concerning that more than half (52 percent) of IT pros only somewhat trust the data sources that are being used to train AI models.
But there’s also some good news here—a significant majority (70 percent) of IT professionals believe that addressing data quality issues is important from an ethical standpoint in AI development.
For more detail, read Revefi’s report, 8 Data Issues and How to Solve Them.