# Use Case 3: Demo data

Use this use case when you need realistic demo data without sensitive information. The focus is stable, shareable demo scenarios.

### What problem this use case solves

Sales and pre-sales teams need high-quality datasets to demonstrate workflows. Data must be safe to share in demo environments.

Classic anonymization can degrade the demo experience. It also requires careful handling of indirect identifiers.

### When to choose this use case

Pick this when you need shareable demo data that still “feels real”.

* You demo workflows and UI journeys with realistic values.
* You need zero real identifiers in the demo environment.
* You need stable “hero customers” across tables and refreshes.
* You need realistic formats for validators and UI flows.

If you’re unsure, start with **Mock or mask all** and only enable [Consistent mapping](/configure-a-data-generation-job/configure-column-settings/consistent-mapping.md) for a small set of key entities.

* Use [Mock](/configure-a-data-generation-job/configure-column-settings/mockers.md) for narrative fields.
* Use [Mask](/configure-a-data-generation-job/configure-column-settings/mask.md) for format-critical fields.

### When to avoid this use case

Skip this when demos are not the goal.

* You need external sharing with documented privacy evidence and low linkability. Use [Use Case 9: Data Sharing & Monetization](/overview/get-started/use-cases-and-configuration/use-case-9-data-sharing-and-monetization.md).
* You need DTAP-style test data for apps and APIs (not demo narratives). Use [Use Case 1: Application & API Testing](/overview/get-started/use-cases-and-configuration/use-case-1-application-and-api-testing.md).
* You need upsampling or load generation at scale. Use [Use Case 2: Load & Stress](/overview/get-started/use-cases-and-configuration/use-case-2-load-and-stress-testing.md).

### Recommended Syntho configuration

This setup is optimized for **stable demo narratives**. You want repeatable “storytelling” values and zero real identifiers, direct or indirect.

{% stepper %}
{% step %}

#### Prerequisites

* Use the [Prerequisites](/overview/get-started/prerequisites.md) checklist.

**Checklist**

* [ ] Demo journeys listed (happy path + 1–2 alternates).
* [ ] “Hero entities” defined (small, intentional set).
* [ ] Destination is isolated and easy to reset.

{% hint style="warning" %}
Treat demo data as shareable by default. Avoid any real identifiers.
{% endhint %}
{% endstep %}

{% step %}

#### Source & destination management

Create one workspace per demo dataset or audience. Examples: `sales-basic`, `partner-demo`, `industry-demo`.

* Duplicate a working workspace before big changes. This gives you a rollback point.
* Use simple versioned names like `v1`, `v2`, `baseline`, or `pilot-partner-x`.

#### Baseline rules

* Keep the **source stable**. Prefer snapshots or backups.
* Avoid a **live production** source for iterative work.
* Keep the **destination isolated**. Never write into production.
* Keep **schemas aligned** between source, workspace and destination.
* Use **views** when you need only a subset of the original database.

#### Lifecycle rule of thumb

* Keep the source connection when you expect schema changes.
* Remove the source connection when you don’t expect another run for a long time.
* Revalidate after schema changes. Use [Validate and synchronize workspace](/configure-a-data-generation-job/generation-and-validation/validate-and-synchronize-workspace.md).

**Nuances for this use case**

* Use a stable seed when you can. It makes storytelling predictable.
* Treat demos as shareable by default.
* Keep “hero entities” stable intentionally. Use consistent mapping sparingly.
* Don’t generate into a schema that already contains old demo data. You may get inconsistent data across sessions.
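
To illustrate why a stable seed makes storytelling predictable, here is a minimal Python sketch using a seeded generator from the standard library. It is only an analogy, not how Syntho implements seeding; the `pick_hero_accounts` helper and the account names are hypothetical.

```python
import random

def pick_hero_accounts(accounts, k, seed):
    """Pick k accounts reproducibly: the same seed always yields the same picks."""
    rng = random.Random(seed)  # private generator, independent of global random state
    return rng.sample(accounts, k)

# Hypothetical demo accounts, used only for illustration.
accounts = ["ACME", "Globex", "Initech", "Umbrella", "Hooli"]

first_run = pick_hero_accounts(accounts, 2, seed=42)
second_run = pick_hero_accounts(accounts, 2, seed=42)
same_story = first_run == second_run  # True: both runs select the same hero accounts
```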
  {% endstep %}

{% step %}

#### Configure generators

**Workspace initialization mode**

Choose a [workspace mode](/setup-workspaces/create-a-workspace/workspace-modes.md). It applies baseline generator suggestions during workspace creation.

Recommended modes for this use case:

* **Mock or mask all** when you want “looks real” data with zero identifiers.
* **Mock all** when you have little to no source data and you’re building a demo dataset from scratch.
* **De-identify** when you start from a production-like schema and want to keep non-PII columns mostly intact.

**AI-generated synthesis**

Use this when your demo must feel **statistically realistic** (distributions, correlations). It’s useful when you have a single entity table or demo view.

**Example (realistic churn patterns):** create a `demo_customer_features_view` (tenure, orders, segment), then use **AI synthesize** so churn-related features look realistic while still being unlinkable.

**Rule-based generation**

Use this when a demo needs **guaranteed scenarios**. Use [Calculated columns](/configure-a-data-generation-job/configure-column-settings/calculated-columns.md) to enforce “must exist” states.

**Example (always-have VIP customers):** create a tier that drives demo flows (pricing, permissions, entitlements).

```excel-formula
// New column: demo_tier (simple, stable ratios)
SWITCH(TRUE,
  RAND() < 0.05, "VIP",
  RAND() < 0.25, "PREMIUM",
  "STANDARD"
)
```
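
The formula above calls `RAND()` twice. Assuming each call draws an independent uniform value (as in Excel; this page does not confirm it for calculated columns), the second threshold only applies to rows that missed the first one, so the effective shares differ from the thresholds. The Python sketch below works through that arithmetic; both function names are hypothetical.

```python
def shares_independent_draws(p_vip, p_premium):
    """Tier shares when every branch draws its own random number."""
    vip = p_vip
    premium = (1 - p_vip) * p_premium  # only rows that were not VIP reach this test
    standard = 1 - vip - premium
    return {"VIP": vip, "PREMIUM": premium, "STANDARD": standard}

def shares_single_draw(vip_cutoff, premium_cutoff):
    """Tier shares when one random number is compared to cumulative cut-offs."""
    return {
        "VIP": vip_cutoff,
        "PREMIUM": premium_cutoff - vip_cutoff,
        "STANDARD": 1 - premium_cutoff,
    }

independent = shares_independent_draws(0.05, 0.25)
single = shares_single_draw(0.05, 0.25)
```

Under that assumption, the formula yields roughly 5% VIP, 24% PREMIUM, and 71% STANDARD. If the demo needs an exact 5/20/75 split, a single draw compared against cumulative cut-offs is the safer pattern.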

**Masking**

Use this when UI flows require **valid formats** (email, phone, IBAN) and you want stable data across tables.

**Example (stable hero customer):** mask `email` and `phone_number` to valid formats, then enable **Consistent mapping** only for `customer_name` and `company_name` so the same “ACME” appears across orders and invoices.
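
Conceptually, consistent mapping means the same source value always maps to the same replacement in every table. The Python sketch below illustrates the idea with a keyed hash. It is an illustration under assumptions, not Syntho's implementation; `SECRET_KEY`, `FAKE_COMPANIES`, `consistent_fake`, and the sample source value are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-only-key"  # hypothetical; a real key should not live in source code
FAKE_COMPANIES = ["ACME", "Globex", "Initech", "Umbrella", "Hooli"]

def consistent_fake(value: str, pool: list[str], key: bytes = SECRET_KEY) -> str:
    """Map a source value to a pool entry; equal inputs always give equal outputs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

# The same (made-up) source value, as it might appear in two different tables.
orders_company = consistent_fake("Example Source Co", FAKE_COMPANIES)
invoices_company = consistent_fake("Example Source Co", FAKE_COMPANIES)
stable = orders_company == invoices_company  # True: both tables show the same name
```

The same determinism that keeps the story stable is what makes records linkable across tables, which is why it is best limited to a few columns. With a small pool, different source values can also map to the same replacement.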

**Hybrid**

Use this when you need both **narrative control** and **realistic variation**.

It maps well to the “deterministic relations” and “absolute calculations” patterns in [Example data generation scenarios](/overview/get-started/syntho-bootcamp/example-data-generation-scenarios.md).

**Example (order totals and margin always consistent):** if your demo shows “margin” or “savings”, make the math hold.

1. AI synthesize or mock `order_total` and `fulfillment_cost`.
2. Use a calculated column for `gross_margin` so it’s always correct.

```excel-formula
// New column: gross_margin (absolute calculation)
[order_total] - [fulfillment_cost]
```
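
After a run, it can help to spot-check that the calculation holds on an exported sample. The sketch below is one way to do that in Python, assuming the rows are available as dictionaries with the column names used above; `order_id` and the sample values are hypothetical.

```python
from decimal import Decimal

def margin_mismatches(rows):
    """Return rows where gross_margin differs from order_total - fulfillment_cost."""
    return [
        row for row in rows
        if row["gross_margin"] != row["order_total"] - row["fulfillment_cost"]
    ]

# Hypothetical exported sample; Decimal avoids float rounding on money values.
rows = [
    {"order_id": 1, "order_total": Decimal("120.00"),
     "fulfillment_cost": Decimal("45.50"), "gross_margin": Decimal("74.50")},
    {"order_id": 2, "order_total": Decimal("80.00"),
     "fulfillment_cost": Decimal("30.00"), "gross_margin": Decimal("55.00")},  # deliberately wrong
]

bad_rows = margin_mismatches(rows)
bad_ids = [row["order_id"] for row in bad_rows]
```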

**Minimal configuration steps**

1. Run a [PII scan](/configure-a-data-generation-job/privacy-dashboard/automatic-pii-discovery-with-pii-scanner.md).
2. Use **Mock** for narrative fields. Use **Mask** for validator formats.
3. Enable consistent mapping only for the direct identifiers.

{% hint style="info" %}
Use consistent mapping sparingly. It helps storytelling, but increases linkability.
{% endhint %}

<details>

<summary>Concrete example: a “customer journey” demo narrative</summary>

Goal: a dataset that always has a few customers in each lifecycle stage so demo flows never get stuck.

Example workspace name: `sales-basic`.

Example “story columns” you can enforce with calculated columns:

```excel-formula
// New column: journey_stage
IFS(
  [last_order_date] >= DATEADD(TODAY(), -30, "day"),  "ACTIVE",
  [last_order_date] >= DATEADD(TODAY(), -180, "day"), "AT_RISK",
  TRUE,                                               "CHURNED"
)
```

```excel-formula
// New column: is_premium (TRUE with a stage-dependent probability, so premium customers are likely in every stage)
IFS(
  [journey_stage] = "ACTIVE",  RAND() < 0.20,
  [journey_stage] = "AT_RISK", RAND() < 0.10,
  TRUE,                        RAND() < 0.05
)
```

Tip: only apply consistent mapping to the “hero entities” you showcase (top accounts, key products). This keeps the story stable while limiting linkability.
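
To confirm that every lifecycle stage is actually populated after a run, a small check like the Python sketch below can help. It mirrors the 30-day and 180-day cut-offs of the `journey_stage` rule above; the function names and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

STAGES = ("ACTIVE", "AT_RISK", "CHURNED")

def journey_stage(last_order_date: date, today: date) -> str:
    """Mirror of the journey_stage rule: 30-day and 180-day cut-offs."""
    if last_order_date >= today - timedelta(days=30):
        return "ACTIVE"
    if last_order_date >= today - timedelta(days=180):
        return "AT_RISK"
    return "CHURNED"

def missing_stages(last_order_dates, today, minimum=1):
    """Return the stages that have fewer than `minimum` customers."""
    counts = Counter(journey_stage(d, today) for d in last_order_dates)
    return [stage for stage in STAGES if counts[stage] < minimum]

today = date(2024, 6, 1)
sample = [date(2024, 5, 20), date(2024, 2, 1), date(2023, 1, 15)]
gaps = missing_stages(sample, today)  # empty list: every stage has a customer
```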

</details>
{% endstep %}

{% step %}

#### Handle keys and relationships (relational schemas)

If your demo is **single-table** (or you share one flattened table), you can skip this step.

Demo flows break on missing relationships. Validate foreign keys (FKs) before polishing the data.

Use [Manage foreign keys](/configure-a-data-generation-job/manage-foreign-keys.md) and add [virtual foreign keys](/configure-a-data-generation-job/manage-foreign-keys/add-virtual-foreign-keys/add-virtual-foreign-keys.md) if the source schema is incomplete.
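
A quick way to see whether relationships survived generation is to look for orphaned child rows. The Python sketch below assumes the generated tables are available as lists of dictionaries; the table contents, column names, and `find_orphans` helper are hypothetical.

```python
def find_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key value has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]

customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},     # no such customer: would break a demo flow
    {"order_id": 12, "customer_id": None},  # nullable FK, not treated as an orphan
]

orphans = find_orphans(orders, customers, fk="customer_id", pk="customer_id")
orphan_ids = [row["order_id"] for row in orphans]
```

An empty result means every order points at an existing customer. Any rows returned indicate a missing or misconfigured key relationship to fix before the demo.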

* [Key generators](/configure-a-data-generation-job/configure-column-settings/key-generators.md)
  {% endstep %}

{% step %}

#### Validate and sync

Run the demo journeys against the generated dataset. Fix the exact tables and columns that break flows.

Resync whenever the product schema changes. Use [Validate and synchronize workspace](/configure-a-data-generation-job/generation-and-validation/validate-and-synchronize-workspace.md). Demos go stale fast without validation.

* [Validate and synchronize workspace](/configure-a-data-generation-job/generation-and-validation/validate-and-synchronize-workspace.md)
  {% endstep %}

{% step %}

#### Tune generation settings

Prioritize fast reset times. Demo environments get rebuilt often.

Use [View and adjust generation settings](/configure-a-data-generation-job/generation-and-validation/view-and-adjust-generation-settings.md) once the storytelling rules are stable.

* [View and adjust generation settings](/configure-a-data-generation-job/generation-and-validation/view-and-adjust-generation-settings.md)
  {% endstep %}
  {% endstepper %}

### Common pitfalls & misconfigurations

#### Use case-specific pitfalls

* Treating demo datasets as “internal-only” and skipping privacy review steps.
* Applying consistent mapping broadly instead of only on direct identifiers.

<details>

<summary>General pitfalls</summary>

These pitfalls show up in most projects:

* Running full-scale jobs before a small validation run.
* Skipping workspace validation/sync after schema changes. Use [Validate and synchronize workspace](/configure-a-data-generation-job/generation-and-validation/validate-and-synchronize-workspace.md).
* Breaking relational integrity (missing PK/FK setup, missing foreign keys, missing virtual foreign keys). Start with [Manage foreign keys](/configure-a-data-generation-job/manage-foreign-keys.md) and [virtual foreign keys](/configure-a-data-generation-job/manage-foreign-keys/add-virtual-foreign-keys/add-virtual-foreign-keys.md).
* Leaving sensitive columns on [**Duplicate**](/configure-a-data-generation-job/configure-column-settings/duplicate.md), or trusting the [PII scan](/configure-a-data-generation-job/privacy-dashboard/automatic-pii-discovery-with-pii-scanner.md) without reviewing false positives/negatives.
* Overusing [**Consistent mapping**](/configure-a-data-generation-job/configure-column-settings/consistent-mapping.md) (it slows down data generation and increases linkability).

</details>

### Governance, compliance, and automation

#### Use case-specific recommendations

* Maintain a list of approved tables/fields. Don’t add columns ad-hoc right before a demo.
* Separate internal sales demos from partner demos. Use separate workspaces and destinations to prevent accidental sharing.
* Automate a pre-demo reset: regenerate → run 2–3 scripted demo journeys → confirm no PII columns remain on **Duplicate**.
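
The pre-demo reset can be sketched as a small orchestration function. The Python example below deliberately takes each step as a callable, because this page does not document specific REST endpoints; every function name here is hypothetical, and the stubs stand in for your own API calls and scripted journeys.

```python
from typing import Callable, Iterable

def pre_demo_reset(
    regenerate: Callable[[], None],
    journeys: Iterable[Callable[[], bool]],
    pii_columns_on_duplicate: Callable[[], list[str]],
) -> list[str]:
    """Run the reset steps in order and return the problems found (empty means ready)."""
    problems = []
    regenerate()  # step 1: rebuild the demo dataset
    for journey in journeys:  # step 2: run the scripted demo journeys
        if not journey():
            problems.append(f"journey failed: {journey.__name__}")
    leftover = pii_columns_on_duplicate()  # step 3: confirm no PII column is on Duplicate
    if leftover:
        problems.append("PII columns still on Duplicate: " + ", ".join(leftover))
    return problems

# Stand-in callables so the sketch runs on its own.
def regenerate_stub():
    pass

def login_journey():
    return True

def checkout_journey():
    return True

def no_leftovers():
    return []

problems = pre_demo_reset(regenerate_stub, [login_journey, checkout_journey], no_leftovers)
```

An empty `problems` list means the environment is ready. Anything else can be surfaced in a chat channel or CI job before the demo starts.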

<details>

<summary>General recommendations</summary>

Use these recommendations for most workspaces.

#### Ownership and change control

* Assign a single **workspace owner** (data steward / privacy lead / DBA).
* Require a ticket or change request for generator changes.
* Duplicate a workspace before large edits. Keep the previous version as rollback.

#### Access control

* Default to **read-only** access for source connections.
* Restrict **who can view source data** in the UI.
* Use separate workspaces per environment or audience.

#### Automation (baseline)

* Use the [Syntho REST API](/syntho-api/syntho-rest-api.md) to standardize scans and runs.
* Automate data generation, not workspace configuration.
* Keep job logs for failed runs. This reduces back-and-forth during support.

</details>

