The 65% gap: where your data security risks actually live

January 21, 2026
4 min read
Sun Lee
Chief Marketing Officer

Your DSPM tool just finished scanning.

It found 50,000 sensitive data assets. Classification accuracy: 95%. Scan time: 4 hours. Great numbers. But here's what the report doesn't tell you. While your tool was scanning, 65% of your actual data security risks were happening somewhere else.

Your scanner was looking at parked cars. The real action was on the highway.

Where data actually lives

Security teams think of data as stuff in databases. That's where scanners look. That's what they find.

But in modern environments, data at rest is only part of the picture. Data also lives in API calls between microservices. In CI/CD pipelines. In ETL jobs moving records between systems. In AI training pipelines and model outputs. None of this shows up in a typical scan.

A 24/7 Data Defense Engineer sees all of it. Not just where data sits, but where it goes. Every hour of every day.

Breaking down the 65%

Here's where sensitive data actually flows in cloud-native organizations:

Cloud storage and databases: ~35%. This is what scanners see.
API calls and microservices: ~25%. Scanners see none of this. Data passes between services constantly. A single customer record might touch 50 services before lunch. Your Data Defense Engineer tracks every hop.
Data pipelines and ETL: ~15%. Scanners might see source and destination. They miss everything in between, including temporary tables with unmasked PII that exist for two hours and disappear. Your engineer watches the entire journey.
AI and ML systems: ~15%. Training data, prompts, embeddings, model outputs. Completely invisible to traditional tools. Your engineer follows data into and out of every model.
Third-party integrations: ~10%. Data flowing to vendors and partners. Mostly unknown to scanners. Your engineer tracks what leaves your environment and where it goes.

Add it up. 65% of sensitive data touchpoints are invisible to traditional DSPM. But not to a 24/7 Data Defense Engineer.

Real examples

The API leak. A developer builds an internal API that returns customer records. It works great. It's also accessible without authentication, thanks to a misconfiguration. The scanner sees the database. It flags the PII. It never sees the wide-open API. Your Data Defense Engineer sees the API. It sees unauthenticated requests hitting the endpoint. It flags the exposure and can restrict access automatically.
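To make that concrete, here is a minimal sketch of the detection idea in Python, assuming JSON-lines access logs with path, status, and headers fields. The file name, endpoint set, and field names are illustrative assumptions, not any product's interface:

```python
# Minimal sketch: flag successful, unauthenticated requests to endpoints
# known to return sensitive data. Assumes JSON-lines access logs with
# "path", "status", and "headers" fields; the log file name and the
# SENSITIVE_PATHS set are illustrative, not a real product interface.
import json

SENSITIVE_PATHS = {"/internal/customers"}  # endpoints that return PII

def find_unauthenticated_hits(log_file: str) -> list[dict]:
    exposures = []
    with open(log_file) as f:
        for line in f:
            event = json.loads(line)
            header_keys = {k.lower() for k in event.get("headers", {})}
            if (
                event.get("path") in SENSITIVE_PATHS
                and event.get("status") == 200
                and "authorization" not in header_keys
            ):
                exposures.append(event)
    return exposures

if __name__ == "__main__":
    for hit in find_unauthenticated_hits("access.log.jsonl"):
        print(f"exposed: {hit['path']} from {hit.get('client_ip', 'unknown')}")
```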

The AI training data. Your ML team builds a model. They pull data from your warehouse, which the scanner knows about. They also pull enrichment data from a third-party service. The scanner never sees the external data. The model learns from it anyway. Your Data Defense Engineer tracks data flowing into the training pipeline from every source. It maps the complete lineage. It flags when sensitive data enters without proper governance.
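As a toy illustration of what lineage tracking buys you, a minimal sketch in Python. The run ID, source names, and approval set are all hypothetical:

```python
# Minimal sketch: record every source feeding a training run and flag
# any source without a governance approval. Run ID, source names, and
# the approved set are hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class TrainingLineage:
    run_id: str
    approved_sources: set[str]
    sources: list[str] = field(default_factory=list)

    def register(self, source: str) -> None:
        """Record a source in the run's lineage; warn if it is ungoverned."""
        self.sources.append(source)
        if source not in self.approved_sources:
            print(f"[{self.run_id}] ungoverned source entered training: {source}")

lineage = TrainingLineage(run_id="churn-model-v3",
                          approved_sources={"warehouse.customers"})
lineage.register("warehouse.customers")    # known to the scanner
lineage.register("vendor-enrichment-api")  # invisible to it -> gets flagged
```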

The logging accident. A developer adds logging to debug a production issue. The logs capture request payloads with customer PII. They ship to your logging platform. They're searchable for 90 days. The scanner sees the production database. Not the logs. Your Data Defense Engineer watches data flow into the logging system. It detects PII in log entries. It can mask sensitive fields or alert your team before the data spreads.
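The masking idea is simple enough to sketch with the Python standard library. A minimal example, assuming two illustrative PII patterns; real detection covers far more than emails and US SSNs:

```python
# Minimal sketch: a logging filter that masks common PII patterns before
# log records are emitted. Two illustrative regexes only; real PII
# detection is much broader than emails and US SSNs.
import logging
import re

PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<email>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<ssn>"),
]

class PIIRedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()  # fully formatted message
        for pattern, placeholder in PII_PATTERNS:
            msg = pattern.sub(placeholder, msg)
        record.msg, record.args = msg, None  # swap in the redacted text
        return True  # keep the record, just redacted

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("payments")
log.addFilter(PIIRedactingFilter())
log.info("request from jane@example.com, ssn 123-45-6789")
# -> INFO:payments:request from <email>, ssn <ssn>
```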

Why scanners can't fix this

This isn't a feature gap. It's architectural. Scanners crawl data stores. They answer: "What sensitive data do we have and where is it stored?"

Wrong question.

The right question: "How does sensitive data move through our systems, 24 hours a day, 7 days a week?" You can't scan your way to that answer. You have to track data flows continuously. You need Data Journeys mapped in real time by an engineer that never stops watching.

Closing the gap

First, acknowledge the limit. If you rely on periodic scans, you're seeing roughly 35% of your risk.

Second, map what you're missing. Follow a customer record from creation to deletion. Count how many systems it touches. Compare that to what your scanner reports.
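Even a crude version of this exercise is revealing. A minimal sketch, assuming one log file per service and a known record identifier (both are hypothetical conventions):

```python
# Minimal sketch: count how many systems one customer record touches by
# searching for its ID across per-service log files. The record ID and
# the one-log-file-per-service layout are hypothetical conventions.
import glob

RECORD_ID = "cust_8f3a2c"  # hypothetical customer record identifier

touched = set()
for path in glob.glob("logs/*.log"):  # one file per service, by convention
    with open(path) as f:
        if any(RECORD_ID in line for line in f):
            touched.add(path)

print(f"{RECORD_ID} appears in {len(touched)} systems:")
for system in sorted(touched):
    print("  -", system)
```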

Third, deploy a 24/7 Data Defense Engineer. One that tracks data in motion across APIs, pipelines, AI systems, and third parties. One that works while you sleep.

The 65% won't secure itself. But your Data Defense Engineer will.
