Use Case 3: Demo data
Realistic demo data that contains no real identifiers and can be shared safely.
Use this use case when you need realistic demo data without sensitive information. The focus is stable, shareable demo scenarios.
What problem this use case solves
Sales and pre-sales teams need high-quality datasets to demonstrate workflows. Data must be safe to share in demo environments.
Classic anonymization can degrade the demo experience. It also requires careful handling of indirect identifiers.
When to choose this use case
Pick this when you need shareable demo data that still “feels real”.
If you’re unsure, start with Mock or mask all and only enable Consistent mapping for a small set of key entities.
You demo workflows and UI journeys with realistic values.
You need zero real identifiers in the demo environment.
You need stable “hero customers” across tables and refreshes.
You need realistic formats for validators and UI flows.
Use Mock for narrative fields.
Use Mask for format-critical fields.
When to avoid this use case
Skip this when demos are not the goal.
You need external sharing with documented privacy evidence and low linkability. Use Use Case 9: Data Sharing & Monetization.
You need DTAP-style test data for apps and APIs (not demo narratives). Use Use Case 1: Application & API Testing.
You need upsampling or load generation at scale. Use Use Case 2: Load & Stress.
Recommended Syntho configuration
This setup is optimized for stable demo narratives. You want repeatable “storytelling” values. You want zero real identifiers, even indirectly.
Prerequisites
Use the Prerequisites checklist.
Checklist
Treat demo data as shareable by default. Avoid any real identifiers.
Source & destination management
Create one workspace per demo dataset or audience. Examples: sales-basic, partner-demo, industry-demo.
Duplicate a working workspace before big changes. This gives you a rollback point.
Use simple versioned names like v1, v2, baseline, or pilot-partner-x.
Baseline rules
Keep the source stable. Prefer snapshots or back-ups.
Avoid a live production source for iterative work.
Keep the destination isolated. Never write into production.
Keep schemas aligned between source, workspace and destination.
Use views when you need only a subset of the original database.
Lifecycle rule of thumb
Keep the source connection when you expect schema changes.
Remove the source connection when you don’t expect another run for a long time.
Revalidate after schema changes. Use Validate and synchronize workspace.
Nuances for this use case
Use a stable seed when you can. It makes storytelling predictable.
Treat demos as shareable by default.
Keep “hero entities” stable intentionally. Use consistent mapping sparingly.
Don’t generate into a schema that already contains old demo data. You may get inconsistent data across sessions.
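To illustrate why a stable seed makes storytelling predictable, here is a minimal Python sketch. It is not Syntho's generator: the function name, the name pool, and the seed value are hypothetical. It only shows that the same seed produces the same mock values on every rebuild.

```python
import random

# Hypothetical pool of mock company names (illustration only).
COMPANY_POOL = ["Acme", "Globex", "Initech", "Umbrella", "Hooli", "Stark"]

def mock_companies(seed: int, count: int) -> list[str]:
    """Return `count` mock company names drawn with a private, seeded RNG."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    return [rng.choice(COMPANY_POOL) for _ in range(count)]

first_run = mock_companies(seed=42, count=5)
second_run = mock_companies(seed=42, count=5)
print(first_run == second_run)  # True: same seed, same values
```

Changing the seed is then a deliberate way to refresh the story, rather than something that happens by accident on each reset.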
Configure generators
Workspace initialization mode
Choose a workspace mode. It applies baseline generator suggestions during workspace creation.
Recommended modes for this use case:
Mock or mask all when you want “looks real” data with zero identifiers.
Mock all when you have little to no source data and you’re building a demo dataset from scratch.
De-identify when you start from a production-like schema and want to keep non-PII columns mostly intact.
AI-generated synthesis
Use this when your demo must feel statistically realistic (distributions, correlations). It’s useful when you have a single entity table or demo view.
Example (realistic churn patterns): create a demo_customer_features_view (tenure, orders, segment), then use AI synthesize so churn-related features look realistic while still being unlinkable.
Rule-based generation
Use this when a demo needs guaranteed scenarios. Use Calculated columns to enforce “must exist” states.
Example (always-have VIP customers): create a tier that drives demo flows (pricing, permissions, entitlements).
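The idea of a rule that guarantees a "must exist" state can be sketched in Python. This is an illustration only and does not use Syntho's calculated column syntax. The rule itself (every tenth customer is VIP), the tier names other than VIP, and the column names are assumptions made for the example.

```python
from collections import Counter

def derive_tier(customer_id: int) -> str:
    """Hypothetical rule: deterministic tiers, so VIPs always exist."""
    if customer_id % 10 == 0:
        return "VIP"
    if customer_id % 3 == 0:
        return "Standard"
    return "Basic"

# Fifty demo customers with sequential IDs (assumed for the sketch).
customers = [{"customer_id": i} for i in range(1, 51)]
for row in customers:
    row["tier"] = derive_tier(row["customer_id"])

counts = Counter(row["tier"] for row in customers)
# Fail fast if the state the demo depends on is missing.
assert counts["VIP"] >= 1, "demo flow needs at least one VIP customer"
```

Because the tier depends only on the ID, a rebuild yields the same VIP customers, so pricing or entitlement flows keep working after every reset.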
Masking
Use this when UI flows require valid formats (email, phone, IBAN) and you want stable data across tables.
Example (stable hero customer): mask email and phone_number to valid formats, then enable Consistent mapping only for customer_name and company_name so the same “ACME” appears across orders and invoices.
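Conceptually, consistent mapping means the same source value always becomes the same replacement value, in every table. The Python sketch below shows one way to get that property with a keyed hash. It is not Syntho's implementation, and the key, the mock name pool, and the row layout are hypothetical.

```python
import hashlib
import hmac

# Hypothetical values for illustration only.
SECRET_KEY = b"demo-only-key"
MOCK_COMPANIES = ["ACME", "Globex", "Initech", "Umbrella"]

def consistent_mock(value: str, pool: list[str], key: bytes = SECRET_KEY) -> str:
    """Map a source value to a mock value; equal inputs give equal outputs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

order_row = {"order_id": 1, "company_name": "Example Source Co"}
invoice_row = {"invoice_id": 9, "company_name": "Example Source Co"}

order_row["company_name"] = consistent_mock(order_row["company_name"], MOCK_COMPANIES)
invoice_row["company_name"] = consistent_mock(invoice_row["company_name"], MOCK_COMPANIES)
print(order_row["company_name"] == invoice_row["company_name"])  # True
```

The same determinism that keeps the hero customer stable is what raises linkability, which is why the mapping is limited to a few columns. With a small pool, different source values can also land on the same mock value.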
Hybrid
Use this when you need both narrative control and realistic variation.
It maps well to the “deterministic relations” and “absolute calculations” patterns in Example data generation scenarios.
Example (order totals and margin always consistent): if your demo shows “margin” or “savings”, make the math hold.
AI synthesize or mock order_total and fulfillment_cost.
Use a calculated column for gross_margin so it’s always correct.
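A minimal Python sketch of this hybrid pattern: two base columns get varied values, and the third is always derived from them. The value ranges, the cost ratio, and the seed are assumptions for the example. In Syntho the derived column would be a calculated column, whose formula syntax is not shown here.

```python
import random
from decimal import Decimal

def generate_order(rng: random.Random) -> dict:
    """Mock two base columns, then derive gross_margin from them."""
    order_total = Decimal(rng.randint(5_000, 500_000)) / 100   # 50.00 to 5000.00
    cost_ratio = Decimal(rng.randint(40, 90)) / 100            # 0.40 to 0.90
    fulfillment_cost = (order_total * cost_ratio).quantize(Decimal("0.01"))
    return {
        "order_total": order_total,
        "fulfillment_cost": fulfillment_cost,
        # Derived, never generated independently, so the math always holds.
        "gross_margin": order_total - fulfillment_cost,
    }

rng = random.Random(7)  # hypothetical seed
orders = [generate_order(rng) for _ in range(100)]
assert all(
    o["gross_margin"] == o["order_total"] - o["fulfillment_cost"] for o in orders
)
```

Using Decimal rather than float avoids rounding drift, so the margin shown in the demo matches the other two columns to the cent.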
Minimal configuration steps
Run a PII scan.
Use Mock for narrative fields. Use Mask for validator formats.
Enable consistent mapping only for the direct identifiers.
Use consistent mapping sparingly. It helps storytelling, but increases linkability.
Concrete example: a “customer journey” demo narrative
Goal: a dataset that always has a few customers in each lifecycle stage so demo flows never get stuck.
Example workspace name: sales-basic.
Example “story columns” you can enforce with calculated columns:
Tip: only apply consistent mapping to the “hero entities” you showcase (top accounts, key products). This keeps the story stable while limiting linkability.
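One way to confirm that the goal above holds after each generation run is a small coverage check. The Python sketch below is an assumption-laden illustration: the stage names, the lifecycle_stage column, and the minimum of two customers per stage are all hypothetical and not taken from Syntho.

```python
from collections import Counter

# Hypothetical lifecycle stages for the demo narrative.
REQUIRED_STAGES = ["lead", "trial", "active", "churn_risk"]

def thin_stages(rows: list[dict], stages=REQUIRED_STAGES, min_per_stage: int = 2) -> dict:
    """Return {stage: count} for every stage with fewer rows than required."""
    counts = Counter(row["lifecycle_stage"] for row in rows)
    return {s: counts[s] for s in stages if counts[s] < min_per_stage}

customers = [
    {"customer_id": 1, "lifecycle_stage": "lead"},
    {"customer_id": 2, "lifecycle_stage": "lead"},
    {"customer_id": 3, "lifecycle_stage": "trial"},
    {"customer_id": 4, "lifecycle_stage": "active"},
    {"customer_id": 5, "lifecycle_stage": "active"},
]
print(thin_stages(customers))  # {'trial': 1, 'churn_risk': 0}
```

An empty result means every stage has enough customers for the demo flow. Anything else points at the stage whose rule needs adjusting.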
Handle keys and relationships (relational schemas)
If your demo is single-table (or you share one flattened table), you can skip this step.
Demo flows break on missing relationships. Validate FKs before polishing the data.
Use Manage foreign keys and add virtual foreign keys if the source schema is incomplete.
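Outside of Syntho, a quick orphan check on the generated data can confirm that relationships hold before a demo. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names are hypothetical, and the helper interpolates identifiers into SQL, so it should only be used with trusted names.

```python
import sqlite3

def find_orphans(conn, child, fk_column, parent, pk_column):
    """Return child FK values that have no matching parent row."""
    query = (
        f"SELECT c.{fk_column} FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk_column} = p.{pk_column} "
        f"WHERE c.{fk_column} IS NOT NULL AND p.{pk_column} IS NULL"
    )
    return [row[0] for row in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2), (12, 3)])

print(find_orphans(conn, "orders", "customer_id", "customers", "customer_id"))  # [3]
```

An order that points at a missing customer is exactly the kind of row that makes a demo screen fail to load, so an empty list is the target.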
Validate and sync
Run the demo journeys against the generated dataset. Fix the exact tables and columns that break flows.
Resync whenever the product schema changes. Use Validate and synchronize workspace. Demos go stale fast without validation.
Tune generation settings
Prioritize fast reset times. Demo environments get rebuilt often.
Use View and adjust generation settings once the storytelling rules are stable.
Common pitfalls & misconfigurations
Use case-specific pitfalls
Treating demo datasets as “internal-only” and skipping privacy review steps.
Applying consistent mapping broadly instead of only on direct identifiers.
General pitfalls
These pitfalls show up in most projects:
Running full-scale jobs before a small validation run.
Skipping workspace validation/sync after schema changes. Use Validate and synchronize workspace.
Breaking relational integrity (missing PK/FK setup, missing foreign keys, missing virtual foreign keys). Start with Manage foreign keys and virtual foreign keys.
Overusing Consistent mapping (it slows down data generation and increases linkability).
Governance, compliance, and automation
Use case-specific recommendations
Maintain a list of approved tables/fields. Don’t add columns ad-hoc right before a demo.
Separate internal sales demos from partner demos. Use separate workspaces and destinations to prevent accidental sharing.
Automate a pre-demo reset: regenerate → run 2–3 scripted demo journeys → confirm that no PII columns are still set to Duplicate.
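The reset sequence can be sketched as a small Python orchestration. Everything here is hypothetical: the regenerate callable, the journey callables, and the column-to-generator mapping stand in for whatever your tooling provides. The sketch makes no Syntho API calls.

```python
from typing import Callable

def pii_on_duplicate(generator_by_column: dict[str, str], pii_columns: set[str]) -> list[str]:
    """Return PII columns whose configured generator is still Duplicate."""
    return sorted(
        col for col in pii_columns
        if generator_by_column.get(col, "").lower() == "duplicate"
    )

def pre_demo_reset(
    regenerate: Callable[[], None],
    journeys: dict[str, Callable[[], bool]],
    generator_by_column: dict[str, str],
    pii_columns: set[str],
) -> list[str]:
    """Regenerate, run the scripted journeys, check PII config; return problems."""
    regenerate()
    problems = [f"journey failed: {name}" for name, run in journeys.items() if not run()]
    problems += [
        f"PII column still on Duplicate: {col}"
        for col in pii_on_duplicate(generator_by_column, pii_columns)
    ]
    return problems

# Usage with stand-in callables and a made-up configuration.
issues = pre_demo_reset(
    regenerate=lambda: None,
    journeys={"signup": lambda: True, "checkout": lambda: False},
    generator_by_column={"customers.email": "Mask", "customers.name": "Duplicate"},
    pii_columns={"customers.email", "customers.name"},
)
print(issues)
```

An empty list means the environment is ready. A non-empty list names the journey or column to fix before the demo.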
General recommendations
Use these recommendations for most workspaces.
Ownership and change control
Assign a single workspace owner (data steward / privacy lead / DBA).
Require a ticket or change request for generator changes.
Duplicate a workspace before large edits. Keep the previous version as rollback.
Access control
Default to read-only access for source connections.
Restrict who can view source data in the UI.
Use separate workspaces per environment or audience.
Automation (baseline)
Use the Syntho REST API to standardize scans and runs.
Automate data generation, not workspace configuration.
Keep job logs for failed runs. This reduces back-and-forth during support.

