Skip to main content

Static vs Dynamic Data Masking: When to Use Which

Tianzhou · May 25, 2026

Before getting into static vs dynamic, the thing both kinds of masking share: they hide sensitive columns so the wrong people don't see real PII. Where they differ is the moment they act. Static masking rewrites the data once, into a sanitized copy. Dynamic masking rewrites the query result at read time and never touches the source. That single difference decides almost everything else, and picking the wrong one is how you either leak real PII into a test environment or fail to control who reads production.

Static data masking

Static data masking (SDM) produces a sanitized copy of a dataset. A job reads production, transforms the sensitive columns (substitute, shuffle, hash, nullify), and writes the result somewhere new. The masking is baked into the bytes. There is no path back to the original values.

The whole point of SDM is to get data out of the production boundary safely. Dev, test, staging, demos, training, CI fixtures: anywhere the people using the data have no business seeing real values. Because the copy holds no real PII, you can hand it around freely.

Two things follow from masking the data itself. First, the protection rides along with the data. A backup or a replica of a masked dataset is also masked, so you can copy it anywhere and it stays safe. Second, you pay the cost once, at copy time. Queries against the copy run at full speed with no per-query overhead.

The downsides are structural too. The copy is stale the moment it's made, and refreshing means re-running the job. And the masked output still has to preserve format and referential integrity, or the test data is useless. A masked SSN still has to look like an SSN, and foreign keys have to line up across the masked tables.

Dynamic data masking

Dynamic data masking (DDM) masks at read time. The stored data is left alone. A policy on a column decides what each principal sees, and the engine (or a layer in front of it) rewrites the result for anyone who lacks the right role.

DDM is built for production. Same live rows, a different view per role: an analyst sees a partial mask, a DBA sees cleartext, a contractor sees a full mask. Grant the right role and the same query hands back the real value. The data is always sitting there; access decides what surfaces.

The same two properties invert. The transform happens at query time, not at rest, so a backup or a replica holds the real values. Masking does not travel with the data, which means every read path has to enforce the policy or that path leaks cleartext. And you pay the cost per query, because masking runs on every SELECT. A heavy policy adds latency to each one.

The main risk falls right out of this: a privileged role, or a direct connection that goes around the masking layer, sees cleartext. Your coverage is only as good as the read paths you actually enforce.

Side by side

Static data maskingDynamic data masking
When appliedOnce, at copy timeEvery query, at read time
Data at restPermanently altered (a copy)Unchanged
ReversibleNoYes, by role or grant
EnvironmentNon-production (dev, test, staging)Production
Backups & replicasMasked, protection travels with the dataCleartext, the transform is query-time
Performance costPaid once, at copy timePaid on every query
Primary useSafe test dataRole-based access control
Main riskStale data; broken referential integrityBypass via privileged role or direct connection

The row teams miss most often is backups. Static masking protects the copy; dynamic masking does not. So if the requirement is "no real PII in this dataset," dynamic masking alone does not clear that bar. The data at rest is still real.

Tools and solutions

Static masking runs as a batch job, so the tooling lives in the test-data and data-pipeline world: Perforce Delphix, Informatica Persistent Data Masking, IBM InfoSphere Optim, Oracle Data Masking and Subsetting Pack, and Tonic.ai.

Dynamic masking runs at read time, so it lives in the engine or a layer in front of it:

  • Native engine DDM: SQL Server, Oracle, Snowflake, and BigQuery ship it; MySQL and PostgreSQL do not.
  • In front of the engine: Bytebase on the query path, and access-governance platforms such as Immuta and Satori.

When to use which

Use static when data leaves production. Refresh a lower environment, hand a dataset to an offshore team, build CI fixtures, record a demo. The consumers should never be holding real values in the first place.

Use dynamic when data stays in production and different users need different views of the same rows. That's role-based access to live data.

And honestly, the common end state is both. Refresh staging from production with static masking, then run dynamic masking on the production queries that remain. They solve different problems, so you don't get to pick one and call it done.

The canonical static workflow is the non-prod refresh: rebuild dev, test, or staging from production on a schedule, masking sensitive columns during the copy so no real PII lands downstream. Preserve formats and relationships so the data still exercises the code.

In a regulated estate the two run side by side off the same production database. Dynamic masking protects the live production reads. A scheduled static-masking job feeds safe data to every non-production environment.

Loading diagram…

Where Bytebase fits

_

Bytebase does dynamic masking, not static. Bytebase Dynamic Data Masking governs the human query path: queries route through the SQL Editor, and results are masked before they leave it, by policy. One policy model covers every engine in the fleet, including MySQL and PostgreSQL, which ship no native DDM of their own.

Worth being precise here, because it's the honest boundary: Bytebase masks the reads that flow through it, the human and agent ad-hoc query path. An application that connects straight to the database with its own connection string bypasses Bytebase and sees cleartext. Native engine DDM enforces inside the engine, so it covers every connection including the app's, but only that one engine. Pick the control that matches who you're actually guarding against.

Native DDM, where it exists, varies by engine, and MySQL and PostgreSQL don't have it at all. Bytebase applies the same policy across all of them.

After all, masking is a policy question, not a storage one. The only thing static vs dynamic decides is when you enforce it.


Try Bytebase Dynamic Data Masking with this tutorial.

Further Readings

Back to blog

Explore the standard for database development