Use Cases & Configuration

Start with your goal, then pick the first use case to implement. Each use case optimizes for a different outcome (for example, testing, realism, privacy, consistency, or speed) and has different configuration requirements.

Getting started

Before you dive into a use case, make sure the basics are covered:

Generation approaches

  • AI-generated synthesis: best when you need statistical utility with strong privacy, or extra rows.

  • Rule-based generation: best when values must follow explicit business logic.

  • Masking / de-identification: best when you need format-preserving replacements and stable keys/relationships.

  • Hybrid: best when one approach alone does not meet your requirements.

Key configuration decisions

These decisions account for most of the success of a project (and most of the rework when they are made late).

1) Pick the workspace mode that matches your starting point

  • De-identify: you already have a production-like dataset and mainly need to replace identifiers.

  • Mock or mask all: you need “production-like” formats but you don’t want to keep original values.

  • Mock all: you have little/no source data and want to generate everything from scratch.

  • Synthesize all: you have enough rows and want maximum statistical utility with strong privacy.

2) Decide if you should reshape to a single entity table

AI synthesis works best on a single table. If your data is spread across several tables, it is often worth creating a SQL view first that flattens them into one row per entity (especially for ML, analytics, and data sharing).
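
As an illustration of the reshaping step, here is a minimal sketch that builds a single entity table as a SQL view. It uses an in-memory SQLite database from Python's standard library. The `customers` and `orders` tables, their columns, and the `customer_entity` view name are all hypothetical; adapt the join and aggregates to your own schema and database.

```python
import sqlite3

# Hypothetical schema: one row per customer, many orders per customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    age INTEGER,
    country TEXT
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL
);
INSERT INTO customers VALUES (1, 34, 'NL'), (2, 51, 'DE'), (3, 27, 'NL');
INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 35.5), (12, 2, 99.0);

-- One row per customer: child rows are collapsed into aggregate columns.
CREATE VIEW customer_entity AS
SELECT c.customer_id,
       c.age,
       c.country,
       COUNT(o.order_id)            AS order_count,
       COALESCE(SUM(o.amount), 0.0) AS total_amount
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.age, c.country;
""")

rows = conn.execute(
    "SELECT * FROM customer_entity ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

The `LEFT JOIN` keeps customers without orders in the view, so the synthesized table still reflects them instead of silently dropping them.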

3) Choose between masking, rule-based generation, and AI synthesis

  • Use masking when downstream systems validate formats (emails, IBANs, UUIDs).

  • Use rule-based / calculated columns when the business logic must always hold (for example, profit = revenue - costs).

  • Use AI synthesis when you need privacy + statistical utility for indirect identifiers (age, gender, weight).
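
To make the first two choices concrete, here is a minimal sketch in plain Python. It is not the product's masking or calculated-column feature. `mask_email`, `with_profit`, and the `demo-secret` value are hypothetical names chosen for illustration. The masking function only guarantees a valid email shape and a stable output for the same input; the calculated column is derived from other columns rather than generated independently.

```python
import hashlib


def mask_email(email: str, secret: str = "demo-secret") -> str:
    """Replace an email with a stable stand-in that still looks like an email."""
    digest = hashlib.sha256((secret + email.lower()).encode("utf-8")).hexdigest()
    return f"user_{digest[:12]}@example.com"


def with_profit(row: dict) -> dict:
    """Calculated column: profit is always revenue minus costs."""
    return {**row, "profit": row["revenue"] - row["costs"]}


row = {"email": "Jane.Doe@corp.test", "revenue": 1200, "costs": 450}
masked = with_profit({**row, "email": mask_email(row["email"])})
print(masked)
```

Because profit is computed after the other values are set, the rule holds for every row, whatever method produced revenue and costs.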

Governance, compliance, and automation (reference)

Use cases

Baseline workflow (applies to every use case)

Use this checklist to go from “use case” to a repeatable job.

1) Prerequisites
   Confirm access, schema alignment, and environment readiness.
   See: Prerequisites

2) Create a workspace
   Pick the source + destination, then choose a workspace mode that matches your starting point.
   See: Create a workspace, Workspace modes

3) Configure generators
   Start from the simplest approach that meets the goal.
   See: Introduction to data generators, Generators

4) Handle keys and relationships (relational schemas)
   Make FK behavior explicit before your first big run.
   See: Referential integrity & foreign keys, Manage foreign keys, Key generators

5) Validate and sync
   Validate early, then resync whenever the schema drifts.
   See: Validate and synchronize workspace

6) Tune generation settings
   Optimize performance and reduce write errors before scaling up.
   See: View and adjust generation settings, Large workloads
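
For the keys and relationships step, here is a minimal sketch of one way to check that foreign keys stay consistent after keys are replaced. It is a generic illustration, not the product's key generator. `map_key`, the secret, and the sample `customers` and `orders` rows are hypothetical. The idea is that the same deterministic mapping is applied to the primary key and to every foreign key that references it, and an orphan check runs before any large job.

```python
import hashlib
import hmac


def map_key(value: str, secret: bytes) -> str:
    """Deterministically map a key so equal inputs give equal outputs."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


secret = b"hypothetical-secret"  # assumption: kept outside the dataset

customers = [{"customer_id": "C1"}, {"customer_id": "C2"}]
orders = [
    {"order_id": "O1", "customer_id": "C1"},
    {"order_id": "O2", "customer_id": "C1"},
    {"order_id": "O3", "customer_id": "C2"},
]

# Apply the same mapping to the parent key and to the child foreign key.
masked_customers = [
    {"customer_id": map_key(c["customer_id"], secret)} for c in customers
]
masked_orders = [
    {**o, "customer_id": map_key(o["customer_id"], secret)} for o in orders
]

# Orphan check: every child row must still point at an existing parent.
parent_keys = {c["customer_id"] for c in masked_customers}
orphans = [o for o in masked_orders if o["customer_id"] not in parent_keys]
print(f"orphaned order rows: {len(orphans)}")
```

Running a check like this on a small sample first makes FK problems visible before they turn into write errors at scale.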
