2. Introduction to data generators
Syntho offers a flexible set of data generators that help anonymize sensitive data based on the nature of the dataset, privacy requirements, and use case. Below is a summary of the main generator types and when to use each.
Trains a generative model to create synthetic rows that mimic the original dataset, without any one-to-one relation. Use when: you need statistical fidelity and privacy, e.g. for machine learning or testing large datasets. Avoid when: you need to preserve correlations and data consistency across related tables.
Generates fully random, user-defined values. Use when: format matters, but the relationship to original values is not important. Avoid when: consistency or referential integrity is needed.
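As an illustration of random, user-defined values, the sketch below draws dates from a range chosen by the user and ignores the original column entirely. The function name `mock_date` and its parameters are hypothetical and are not part of Syntho's API.

```python
import random
from datetime import date, timedelta


def mock_date(start: date, end: date, rng: random.Random) -> str:
    """Return a random ISO-formatted date between start and end (inclusive).

    Hypothetical helper for illustration; the original value is never read.
    """
    span_days = (end - start).days
    offset = rng.randint(0, span_days)  # randint includes both endpoints
    return (start + timedelta(days=offset)).isoformat()


rng = random.Random(42)  # seeded only so the sketch is repeatable
range_start, range_end = date(1990, 1, 1), date(1999, 12, 31)
mocked = [mock_date(range_start, range_end, rng) for _ in range(5)]
```

Because each value is drawn independently, the same original value can receive a different mock value every time, which is why this approach does not suit columns that need consistency or referential integrity.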
Maps original values to consistent mock values. Use when: consistent replacement of values is needed across datasets or environments. Avoid when: randomness is more important than consistency.
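One common way to obtain a consistent mapping is to derive the mock value from a keyed hash of the original, so that the same input always yields the same output. The sketch below assumes that approach; it is not a description of how Syntho implements it, and `consistent_mock`, `MOCK_NAMES`, and the secret are hypothetical.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values.
MOCK_NAMES = ["Alex", "Sam", "Robin", "Jamie", "Casey", "Morgan"]


def consistent_mock(value: str, secret: bytes, pool: list) -> str:
    """Map a value to a pool entry deterministically, given the same secret."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


secret = b"example-secret"  # assumption: shared across datasets/environments
first = consistent_mock("Alice Johnson", secret, MOCK_NAMES)
second = consistent_mock("Alice Johnson", secret, MOCK_NAMES)
```

With a small pool, different originals can land on the same mock value; a larger pool reduces such collisions but does not eliminate them.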
Directly modifies original values while preserving format. Use when: the output must remain in a recognizable or valid format. Avoid when: preserving exact values or reversibility is required.
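A minimal sketch of format-preserving modification, assuming a simple character-class rule: digits are replaced by random digits, letters by random letters of the same case, and everything else is kept. The helper `mask_preserving_format` is hypothetical, and the sample value is made up.

```python
import random
import string


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace each ASCII digit/letter with a random one of the same class."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # separators and other characters stay in place
    return "".join(out)


original = "AB12-cd34-EF56"
masked = mask_preserving_format(original, random.Random())
```

The result keeps the length, separators, and character classes of the input, but the original cannot be recovered from it, which matches the note that this approach is unsuitable when reversibility is required.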
Uses business logic to generate values. Use when: you need calculated outputs based on specific conditions. Avoid when: data generation is simple and preserving logic is not required.
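To make "calculated outputs based on specific conditions" concrete, the sketch below derives a column from other columns with a made-up rule. The rule, the column names, and `shipping_fee` are illustrative assumptions only.

```python
def shipping_fee(order_total: float, country: str) -> float:
    """Hypothetical business rule: free shipping from 100, else by country."""
    if order_total >= 100:
        return 0.0
    return 5.0 if country == "NL" else 12.5


rows = [
    {"order_total": 120.0, "country": "NL"},
    {"order_total": 40.0, "country": "NL"},
    {"order_total": 40.0, "country": "US"},
]

# Add the calculated column to every row.
for row in rows:
    row["shipping_fee"] = shipping_fee(row["order_total"], row["country"])
```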
Creates or transforms keys while maintaining or removing relational links. Use when: managing primary and foreign keys across tables. Avoid when: relationships are not needed.
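The idea of transforming keys while maintaining relational links can be sketched as follows: build one old-to-new mapping for the primary key and apply the same mapping to every foreign key that references it. The tables, column names, and `remap_keys` helper are hypothetical, and the sketch assumes every foreign key value exists in the parent table.

```python
def remap_keys(parents, children, pk, fk):
    """Replace primary keys with new sequential keys and update foreign keys."""
    # One mapping drives both tables, so the links stay intact.
    key_map = {row[pk]: new_key for new_key, row in enumerate(parents, start=1)}
    new_parents = [{**row, pk: key_map[row[pk]]} for row in parents]
    new_children = [{**row, fk: key_map[row[fk]]} for row in children]
    return new_parents, new_children


customers = [{"id": 101, "name": "A"}, {"id": 205, "name": "B"}]
orders = [
    {"order_id": 1, "customer_id": 205},
    {"order_id": 2, "customer_id": 101},
    {"order_id": 3, "customer_id": 205},
]

new_customers, new_orders = remap_keys(customers, orders, pk="id", fk="customer_id")
```

Removing the link instead would amount to generating the foreign key column independently of `key_map`, so that child rows no longer point to a specific parent.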

