Exclusive: 88% of Organisations Use AI, But Most Are Building on a Broken Data Foundation


by Michael Cade, Global Field CTO, Veeam Software

So far, AI adoption has outpaced regulatory frameworks, leaving organisations largely to make up their own rules. This lack of clarity hasn’t slowed adoption: McKinsey’s latest survey found that 88% of organisations already report using AI in at least one business function. Yet innovation has stalled, and it’s become clear that organisations have overlooked a key enabler of safe and secure AI: data sovereignty.

Simultaneously, regulation has begun to catch up, and much of it points to the same principles of data sovereignty and AI visibility. Take the EU AI Act, for example, which sets strict, risk-based rules on both AI development and deployment within the EU to improve AI visibility.

Rather than blindly charging ahead, organisations need to pause to develop transparent, traceable, and sovereign-by-design data architectures. Otherwise, they won’t just be unable to unlock the true potential of AI for their businesses; they’ll also fall behind on regulatory compliance.

Not all data is good data

As you might expect, both digital sovereignty and AI innovation boil down to data. It’s already well documented that AI needs a lot of data, and we’ve got plenty, with IDC estimating that the global datasphere reached around 181 zettabytes in 2025. But despite this abundance, Generative AI (genAI) pilots continue to fail widely. Some research suggests that as many as 95% of enterprise genAI pilots fail to reach production, or even to demonstrate measurable ROI. The reason? Long-standing data hygiene issues.

Thanks in no small part to AI, data growth has become exponential, but organisations have largely failed to keep up. This influx has far outpaced storage processes, and organisations have somewhat taken their eye off the ball, with ‘junk’ data being stored alongside the ‘useful’ data required for AI usage. Ultimately, AI systems inherit not just the bias but also the quality and structure of the data they are trained on. So, if training sets are poorly structured and include ‘junk’ data, outputs and usability suffer.

There’s also a significant knock-on effect with compliance and regulation. While regulatory bodies are yet to agree on a unified approach to AI regulation, it’s already becoming clear that visibility will be central to future requirements. In Europe alone, the EU AI Act and the NIS2 Directive are already signalling a broader push for stronger governance, transparency, and control over operational and training data. And without strong sovereignty, organisations will remain unable to map and understand their data landscape to adhere to existing and future requirements.

Sorting the wheat from the chaff

After the last few years of data growth, the sheer scale of the workloads most businesses now hold can seem daunting. Before organisations can improve their data hygiene, they first need to understand and classify their data: not just for what it contains, but also according to how sensitive it is. A piece of data may be useful for a genAI pilot, but if it’s too sensitive, it cannot be used. This level of understanding not only avoids mistakenly giving genAI programmes sensitive data, but could also be key to creating genAI that delivers on its potential. Instead of training it on a pile of ‘useful’ data peppered with ‘junk’ data, organisations will be able to feed AI only the information it actually needs.
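As a rough illustration of that classification step, the sketch below tags records with a sensitivity tier and keeps only non-sensitive, non-empty records for a pilot’s training set. The tier names, keyword markers, and helper functions are all hypothetical: a real deployment would rely on dedicated classification or DLP tooling, not keyword matching.

```python
from dataclasses import dataclass

# Hypothetical sensitivity tiers (illustrative only).
PUBLIC, INTERNAL, RESTRICTED = "public", "internal", "restricted"

@dataclass
class Record:
    content: str
    sensitivity: str = INTERNAL

def classify(record: Record) -> Record:
    # Toy rule: anything that looks like personal data is restricted.
    markers = ("ssn", "passport", "date of birth")
    if any(m in record.content.lower() for m in markers):
        record.sensitivity = RESTRICTED
    return record

def training_set(records: list[Record]) -> list[Record]:
    # Only non-restricted, non-empty ('junk-free') records reach the pilot.
    return [r for r in map(classify, records)
            if r.sensitivity != RESTRICTED and r.content.strip()]
```

The point of the sketch is the ordering: classify first, then filter, so sensitive and junk records never enter the training pipeline at all.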

Once this is in place and organisations know what they’re working with, they can begin to define the sovereignty requirements for each data bucket, covering both regulatory and locality rules. For some, the knee-jerk reaction is to restrict usage to meet the strongest data localisation laws. Yet the EU’s GDPR, for example, doesn’t mandate localisation within a specific EU country, only within the European Economic Area (EEA), although it does place strict restrictions on the transfer of personal data outside the EEA, creating a ‘soft localisation’ effect in practice.

There’s a lot of nuance here, which is why many organisations are adopting hybrid or multi-cloud architectures to maintain flexibility over where workloads are processed and stored. With these, organisations can restrict data where needed to meet localisation requirements while still maintaining data portability, which will be essential as regulations continue to change. This flexibility and transparency allow organisations to monitor not just where their data resides but also who can access it: essential knowledge for compliance and for security alike.
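One way to encode such per-bucket locality rules is a simple residency policy that a platform checks before any transfer. The sketch below is illustrative only: the bucket names, region codes, and the abbreviated EEA list are assumptions, not a real policy.

```python
# Hypothetical residency policy: each data bucket lists the regions where
# it may be stored or processed. Region codes are illustrative.
EEA = {"de", "fr", "ie", "nl", "se"}  # abbreviated, not the full EEA

RESIDENCY_POLICY = {
    "customer-pii": EEA,              # 'soft localisation' under GDPR
    "telemetry": EEA | {"uk", "us"},  # broader, but still bounded
    "public-docs": None,              # None = no restriction
}

def transfer_allowed(bucket: str, target_region: str) -> bool:
    # Unknown buckets are denied by default: classify before you move data.
    allowed = RESIDENCY_POLICY.get(bucket, set())
    if allowed is None:
        return True
    return target_region in allowed
```

Deny-by-default for unclassified buckets mirrors the article’s ordering: classification has to come before sovereignty decisions, so data with no assigned bucket simply can’t be moved.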

Not just a tickbox

Until now, data sovereignty has been relegated to the bottom of the priority list, seen mostly as a compliance exercise. Organisations have ticked it off as one item on a longer list of regulatory requirements rather than treating it as a vital part of their data strategy. But when fully understood and aligned with the wider business strategy, it can do much more.

Not only can it feed into the data governance frameworks that underpin operations, but it can also help inform and establish AI governance. With clean, structured, and classified data, organisations can finally unlock the true potential of their genAI pilots.

So far, data sovereignty has been underestimated, but with genAI innovation stalling and regulation catching up, organisations can’t afford to do so any longer.
