4. PII scan

Syntho provides a built-in PII scanner to help you identify columns that may contain personally identifiable information (PII) in your datasets. This step is essential to ensure data privacy compliance and to prevent accidental exposure of sensitive fields.

You can run a shallow scan (faster, based on metadata) or a deep scan (more accurate, inspects data contents using NLP).




Scan modes explained

Scan Type | Description | Speed | Accuracy
Shallow scan | Uses column names and regex rules to infer PII | Fast | Medium
Deep scan | Analyzes actual data content using NLP models (for string/text columns) | Slower | Higher (but possibly more false positives)
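To make the difference between the two modes concrete, here is a minimal Python sketch. It is an illustration only and not Syntho's implementation: the rule sets `NAME_RULES` and `VALUE_RULES`, the functions `shallow_scan` and `deep_scan`, and the `threshold` parameter are all hypothetical, and simple value regexes stand in for the NLP models that a real deep scan uses.

```python
import re

# Hypothetical name-based rules; the scanner's real rule set is not shown here.
NAME_RULES = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile|tel", re.IGNORECASE),
    "name": re.compile(r"(first|last|full)[-_]?name", re.IGNORECASE),
}

# Value patterns are a stand-in for the NLP models used by a deep scan.
VALUE_RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def shallow_scan(columns):
    """Flag columns by name only; no data values are read."""
    flagged = {}
    for col in columns:
        for label, pattern in NAME_RULES.items():
            if pattern.search(col):
                flagged[col] = label
                break
    return flagged


def deep_scan(rows, threshold=0.5):
    """Flag string columns where at least `threshold` of the values match."""
    flagged = {}
    if not rows:
        return flagged
    for col in rows[0]:
        # Only string values are inspected, mirroring "string/text columns".
        values = [r[col] for r in rows if isinstance(r.get(col), str)]
        if not values:
            continue
        for label, pattern in VALUE_RULES.items():
            hits = sum(1 for v in values if pattern.search(v))
            if hits / len(values) >= threshold:
                flagged[col] = label
                break
    return flagged


rows = [
    {"customer_email": "a@example.com", "notes": "reach me at bob@example.org", "age": 31},
    {"customer_email": "c@example.com", "notes": "contact: eve@example.net", "age": 45},
]
shallow = shallow_scan(rows[0].keys())  # flags only customer_email
deep = deep_scan(rows)                  # also flags the free-text notes column
```

In this toy data the shallow pass misses the `notes` column because its name gives no hint, while the content-based pass catches it. The same content-based approach is also what makes false positives more likely on real data.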

What to watch for

  • Red PII column headers: These columns are flagged but not yet handled (e.g., no Mocker or Mask).

  • Exclamation mark (!) next to table name: Indicates at-risk PII columns are still in Duplicate mode.
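The logic behind these two warnings can be sketched in Python as follows. This is an assumption-laden illustration, not a Syntho API: the `tables` dictionary layout, the `pii` and `generator` keys, and the `unresolved_pii` function are hypothetical. Only the generator names Duplicate, Mocker, and Mask come from this page.

```python
def unresolved_pii(tables):
    """Return {table: [columns]} for PII-flagged columns still set to Duplicate."""
    at_risk = {}
    for table, columns in tables.items():
        risky = [
            name
            for name, info in columns.items()
            if info["pii"] and info["generator"] == "Duplicate"
        ]
        if risky:
            at_risk[table] = risky
    return at_risk


# Hypothetical workspace state: one flagged column is still unhandled.
tables = {
    "customers": {
        "email": {"pii": True, "generator": "Duplicate"},
        "phone": {"pii": True, "generator": "Mocker"},
        "country": {"pii": False, "generator": "Duplicate"},
    },
    "orders": {
        "order_id": {"pii": False, "generator": "Duplicate"},
    },
}
at_risk = unresolved_pii(tables)
```

Here `customers` would carry the exclamation mark because `email` is flagged and still duplicated, while `orders` would not.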

To resolve
