Configure table settings
On the Job configuration panel, you can apply several configurations at the table level.
Include: Tables assigned to this mode will be transferred to the destination database.
Exclude: Tables assigned to this mode will not be transferred to the destination database.
Caution: Excluding a table could cause conflicts with foreign key constraints in your destination database.
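The caution above can be checked before running a job. The following is a minimal sketch, assuming the foreign key relationships are available as a plain mapping from each table to the tables it references. The function name find_broken_references and the table names are hypothetical and are not part of Syntho.

```python
def find_broken_references(foreign_keys, excluded):
    """Return (child, parent) pairs where an included table references an excluded one.

    foreign_keys maps each child table to the list of parent tables it references.
    """
    excluded = set(excluded)
    broken = []
    for child, parents in foreign_keys.items():
        if child in excluded:
            continue  # an excluded child is not transferred, so it cannot conflict
        for parent in parents:
            if parent in excluded:
                broken.append((child, parent))
    return sorted(broken)


# Hypothetical schema: orders references customers and products.
foreign_keys = {"orders": ["customers", "products"], "customers": [], "products": []}
print(find_broken_references(foreign_keys, excluded={"customers"}))
```

Any pair returned by this check points at a foreign key that would reference a table missing from the destination database.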
Hint: You can drag multiple tables simultaneously by holding CTRL or SHIFT and then selecting and dragging the tables.
Adjust the number of rows to generate
By default, Syntho generates the same number of rows in the destination table as in your source table.
To change the number of rows to generate for a table:
Go to the Rows to generate field in the Table settings menu on the right side of the Job settings panel.
Update the field value to the desired number of destination rows.
The behaviour when adjusting the destination table row count is the following:
For tables marked as Synthesize, Syntho generates the specified number of rows using its AI and any applied mockers.
For tables marked as Duplicate, Syntho generates the specified number of rows (n) by taking random samples from the original table (n_original rows). If n <= n_original, n of the original rows are copied. If n > n_original, all n_original original rows are copied as-is, and the remaining n - n_original rows are randomly sampled (with replacement) from the original rows.
For tables marked as Exclude, Syntho does not generate any rows (since the table is excluded).
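The Duplicate behaviour described above can be illustrated with a short sketch. This is not Syntho's implementation. The function name duplicate_rows is hypothetical, and the choice to pick rows at random without replacement when n <= n_original is an assumption based on the description.

```python
import random


def duplicate_rows(original, n, seed=None):
    """Return n rows derived from the original rows, mimicking the Duplicate mode."""
    rng = random.Random(seed)
    n_original = len(original)
    if n <= n_original:
        # Assumption: n distinct original rows are picked at random.
        return rng.sample(original, n)
    if n_original == 0:
        raise ValueError("cannot oversample an empty table")
    # Copy every original row as-is, then fill the remainder by
    # sampling with replacement from the original rows.
    extra = rng.choices(original, k=n - n_original)
    return list(original) + extra


rows = ["r1", "r2", "r3", "r4", "r5"]
print(len(duplicate_rows(rows, 3, seed=0)))  # 3 rows, all taken from the original
print(len(duplicate_rows(rows, 8, seed=0)))  # 5 original rows plus 3 resampled rows
```

In the oversampling case some original rows necessarily appear more than once, which is one reason unique or key columns need attention when the row count is increased.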
Considerations for adjusting the number of rows to generate
The Rows to generate field will be disabled if the table doesn't support oversampling, which can be due to the following:
The table uses a key generator method other than Generate.
If the value was previously changed and the table no longer supports oversampling, the value is reverted to the original row count.
Adjusting Rows to generate could cause conflicts with foreign key constraints in your destination database.
Advanced Table settings
Unfold Advanced settings under Table settings to view and adjust settings at the table level. Note that these settings are only relevant for columns that use AI-powered generation.
You can adjust the following advanced table settings:
Maximum rows used for training: The maximum number of rows to be used for training. Using fewer rows can speed up the process, but may come at the cost of lower synthetic data utility.
Take random sample:
On: takes a random sample of rows used for training. Note that choosing this option can cause a data generation job to run significantly longer, depending on the database.
Off (default): takes the top rows as defined in the database.
Choose Table Model: The generative AI model that will be applied to all columns using AI-powered generation. This feature allows users to flexibly manage multiple table models by selecting between the following options:
Single Table Model
Sequence Table Model
Please note that you can create multiple sequence models as long as the foreign key (FK) relationship limit between the tables is present.
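The interaction between Maximum rows used for training and Take random sample can be sketched as follows. This is an illustration on an in-memory list, not how Syntho queries the database. The function name select_training_rows and its parameters are hypothetical.

```python
import random


def select_training_rows(rows, max_rows, take_random_sample=False, seed=None):
    """Pick the rows used for training, given a row limit and a sampling mode."""
    if max_rows >= len(rows):
        return list(rows)  # the limit does not bite, so all rows are used
    if take_random_sample:
        # On: draw max_rows rows at random from the whole table.
        return random.Random(seed).sample(rows, max_rows)
    # Off (default): take the top rows in the order the source returns them.
    return list(rows[:max_rows])


table = list(range(1, 11))  # ten rows, numbered 1 to 10
print(select_training_rows(table, 3))  # [1, 2, 3]
print(len(select_training_rows(table, 3, take_random_sample=True, seed=1)))  # 3
```

Taking the top rows only needs the first few rows, whereas a random sample has to consider the whole table. That is one plausible reason the random option can take longer on large databases.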
ORDER BY
Hive only
In the Table Settings panel, a dropdown field allows users to specify which columns should be used in the "ORDER BY" clause. This feature enables users to define a set of columns that ensure the uniqueness of the returned results for a given table. By selecting the appropriate columns, users can achieve deterministic ordering even in the absence of primary keys or indexes.
Order By Dropdown: Located in the Table Settings panel on the right side of the Table/Job Configuration screen, this dropdown lets users choose the columns for the "ORDER BY" clause.
Steps to Configure:
Open the Table Settings panel in the Job Configuration screen.
Scroll to find the "ORDER BY" dropdown.
Select the desired columns from the dropdown to define the order.
Example Scenario:
If a table does not have a primary key or index, and the first column contains duplicates, the application may not order the data consistently. By using the new "ORDER BY" dropdown, users can select a combination of columns (e.g., ColumnA, ColumnB, ColumnC) that together provide a unique ordering for the table.
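To decide which columns to select, it can help to check whether a candidate combination is unique across the table. The sketch below assumes a small sample of rows loaded as dictionaries. The helper names is_unique_ordering and order_by_clause are hypothetical, and the generated clause does no identifier quoting.

```python
def is_unique_ordering(rows, columns):
    """Return True if the combination of columns is distinct for every row."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False  # two rows tie on these columns, so the order is ambiguous
        seen.add(key)
    return True


def order_by_clause(columns):
    """Build the ORDER BY clause text for the chosen columns."""
    return "ORDER BY " + ", ".join(columns)


rows = [
    {"ColumnA": 1, "ColumnB": "x", "ColumnC": 10},
    {"ColumnA": 1, "ColumnB": "x", "ColumnC": 20},
    {"ColumnA": 1, "ColumnB": "y", "ColumnC": 10},
]
print(is_unique_ordering(rows, ["ColumnA"]))                        # False
print(is_unique_ordering(rows, ["ColumnA", "ColumnB", "ColumnC"]))  # True
print(order_by_clause(["ColumnA", "ColumnB", "ColumnC"]))
```

When no two rows tie on the selected columns, sorting by them yields the same row order on every run, which is the deterministic behaviour the dropdown is meant to provide.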
To improve the user experience when loading application screens and panels, Syntho has efficient data loading mechanisms. These aim to ensure smoother interaction, especially when the source database contains a significant amount of data.