Table relationships
To get the best possible data utility with the least amount of resources when using Syntho's AI-powered synthetic data generation, it is best practice to prepare your data as a single entity table. If you must generate data for multiple tables, however, Syntho offers three options:
Synthesize individual tables with automatic key matching: To keep hardware requirements within reasonable limits, Syntho by default synthesizes each table separately from the others and afterwards generates new keys for each table. This method does not maintain the inherent relationships between tables (i.e., relationships between key and non-key columns). For example, a Pregnancy diagnosis in the synthetic Diagnosis table could point to a Male patient in the synthetic Patients table. It does, however, uphold technical referential integrity: because the keys are newly generated, each foreign key corresponds to an existing primary key in another table. If you must preserve cross-table relationships, you have three options: combine the relevant information from the Diagnosis and Patients tables into a single entity table and then synthesize it, synthesize using Syntho's sequence model (up to 2 tables), or apply PII de-identification (unlimited tables).
Synthesize using sequence model: If you want to preserve cross-table relationships between 2 related tables, including the relationships between key and non-key columns, you can use Syntho’s synthetic data sequence model. This feature is especially valuable if you want to synthesize sequence data (e.g., time series or trajectories).
PII de-identification: Instead of synthesizing, you can use the Syntho platform to de-identify your PII columns with the Syntho PII scanner and Syntho mockers, leaving all remaining columns intact. This approach preserves cross-table relationships and is most popular for testing and development use cases.
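The difference between technical referential integrity and preserved cross-table relationships can be illustrated with a small check. The following is a minimal sketch in plain Python, not Syntho's API: the sample rows, the column names (`patient_id`, `sex`, `diagnosis`), and both helper functions are hypothetical and only mirror the Pregnancy/Male example above.

```python
# Hypothetical synthetic output of two tables that were synthesized separately
# and then given newly generated keys (illustrative data, not real Syntho output).
patients = [
    {"patient_id": 1, "sex": "Female"},
    {"patient_id": 2, "sex": "Male"},
]
diagnoses = [
    {"diagnosis_id": 10, "patient_id": 1, "diagnosis": "Asthma"},
    {"diagnosis_id": 11, "patient_id": 2, "diagnosis": "Pregnancy"},
]


def orphan_foreign_keys(child_rows, parent_rows, key):
    """Return child rows whose foreign key has no matching primary key."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


def implausible_pregnancies(diagnosis_rows, patient_rows):
    """Return Pregnancy diagnoses that point to a Male patient."""
    sex_by_id = {row["patient_id"]: row["sex"] for row in patient_rows}
    return [
        row
        for row in diagnosis_rows
        if row["diagnosis"] == "Pregnancy"
        and sex_by_id.get(row["patient_id"]) == "Male"
    ]


# Referential integrity holds: every foreign key matches an existing patient.
print(orphan_foreign_keys(diagnoses, patients, "patient_id"))

# The cross-table relationship is still broken: one diagnosis is implausible.
print(implausible_pregnancies(diagnoses, patients))
```

In this sample the first check finds no orphaned keys, while the second flags the Pregnancy diagnosis linked to a Male patient. That is the situation that can arise when tables are synthesized individually with automatic key matching.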
The table below summarizes the approaches Syntho offers for handling table relationships.
Approach | Cross-table relationships | Referential integrity | Upsampling | Preserve sequence information | Table limit |
---|---|---|---|---|---|
Synthesize individual tables with automatic key matching | No | Yes | | | Unlimited (without preserving cross-table relations) |
Synthesize using sequence model | Yes | | | Yes | 2 |
PII de-identification | Yes | | | | Unlimited |

Here are the articles for your reference:
Key generators
Verify foreign keys