Configure table settings

On the Job configuration panel, you can apply several configurations at the table level.

Table modes

Table modes enable you to control how tables in the source database are processed and transferred to the destination database. The three modes available are: Synthesize, De-identify, and Exclude.

Overview of table modes

There are three table modes that you can use to define how each table from the source database is handled:

  • Synthesize: In this mode, tables from the source database are synthesized using Syntho's AI and then written to the destination database.

When a table is listed under the Synthesize mode, the default applied column mode is AI-powered generation. However, you have the option to change this default setting.

Steps to Change Column Mode in Synthesize:

  1. Select the table listed under "Synthesize".

  2. Access the column settings for the selected table.

  3. By default, the column mode is set to AI-powered generation.

  4. Change the column mode to one of the following options:

    • Mocker: Use this option to fill the columns with mock data.

    • Duplicate: Select this if you want the columns to be an exact copy from the source.

    • Exclude: Choose this option if you don’t want to include specific columns in the synthesized table.

For more information, see Configure column settings.

  • De-identify: In this mode, tables are copied from the source database to the destination database, either unchanged or with the alterations you configure.

When a table is listed under the De-identify mode, the default applied column mode is Duplicate. This means that the columns are copied exactly from the source. However, you can change this setting.

Steps to Change Column Mode in De-identify:

  1. Select the table listed under "De-identify".

  2. Access the column settings for the selected table.

  3. By default, the column mode is set to Duplicate.

  4. Change the column mode to one of the following options:

    • Mocker: Use this option to fill the columns with mock data.

    • Exclude: Choose this option if you don’t want to include specific columns in the duplicated table.

For more information, see Configure column settings.

  • Exclude: Tables assigned to this mode will not be transferred to the destination database.

Caution: Excluding a table could cause conflicts with foreign key constraints in your destination database.
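Before excluding tables, it can help to check which remaining tables reference them. The following is a minimal sketch, not part of Syntho: the function name `find_fk_conflicts` and the example schema are hypothetical, and the foreign keys are assumed to be available as (child table, parent table) pairs.

```python
def find_fk_conflicts(foreign_keys, excluded_tables):
    """Return (child, parent) pairs where a kept table references an excluded table.

    foreign_keys: iterable of (child_table, parent_table) pairs (assumed input format).
    excluded_tables: names of tables assigned to the Exclude mode.
    """
    excluded = set(excluded_tables)
    return [
        (child, parent)
        for child, parent in foreign_keys
        if parent in excluded and child not in excluded
    ]


# Hypothetical schema: orders references customers, order_items references orders.
foreign_keys = [("orders", "customers"), ("order_items", "orders")]

# Excluding only "customers" leaves "orders" pointing at a missing table.
print(find_fk_conflicts(foreign_keys, ["customers"]))
```

An empty result suggests that no remaining table depends on an excluded one; any pair returned is a candidate for a foreign key conflict in the destination database.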

The Table Mode Menu is located on the left panel of the Syntho interface.

  • To assign a table to a specific mode, drag the table from the list and drop it under the desired Table Mode (Synthesize, De-identify, or Exclude) in the Table Mode Menu.

Hint: You can drag multiple tables simultaneously by holding CTRL or SHIFT and then selecting and dragging the tables.

Automatic Table Mode Assignment

After you create your workspace, Syntho automatically assigns each table to one of the table modes based on the number of rows in the source table. Verify that the automatically assigned Table Mode for each table is appropriate for your use case:

  • Review the Table Mode assigned to each table in the Table Mode Menu.

  • If a table should be in a different mode, drag and drop the table to the desired mode as described above.

  • Make sure that all tables are in the correct mode before proceeding.

Adjust the number of rows to generate

By default, Syntho generates the same number of rows in the destination table as in your source table.

To change the number of rows to generate for a table:

  1. Go to the Rows to generate field in the Table settings menu, on the right side of the Job settings panel.

  2. Update the field value to the desired number of destination rows.

The behaviour when adjusting the destination table row count is the following:

  • For tables marked as Synthesize, Syntho generates the specified number of rows using Syntho's AI and any applied mockers.

  • For tables marked as De-identify, Syntho generates the specified number of rows (n) from the rows of the original table (n_original). If n <= n_original, then n original rows are copied. If n > n_original, then all n_original rows are copied as-is, and the remaining n - n_original rows are randomly sampled (with replacement) from the original rows.

  • For tables marked as Exclude, does not generate any rows (since the table is excluded).
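The row-count behaviour for copied tables can be sketched in a few lines. This is an illustration only, not Syntho's implementation: the function name `rows_for_copied_table` is hypothetical, and for n <= n_original the sketch assumes a random sample without replacement, which the description above does not state explicitly.

```python
import random


def rows_for_copied_table(original_rows, n, seed=None):
    """Return n rows derived from original_rows without synthesizing any values."""
    rng = random.Random(seed)
    original_rows = list(original_rows)
    n_original = len(original_rows)

    if n_original == 0 or n <= 0:
        # Nothing to copy or nothing requested.
        return []

    if n <= n_original:
        # Assumption: pick n distinct original rows at random.
        return rng.sample(original_rows, n)

    # n > n_original: keep every original row as-is, then top up by
    # sampling with replacement from the original rows.
    extra = rng.choices(original_rows, k=n - n_original)
    return original_rows + extra


source = ["row1", "row2", "row3"]
print(len(rows_for_copied_table(source, 2, seed=0)))  # 2 rows, all from the source
print(len(rows_for_copied_table(source, 5, seed=0)))  # 5 rows: 3 originals plus 2 resampled
```

Because the extra rows are drawn with replacement, a destination table larger than its source always contains repeated rows, which is worth keeping in mind when the table has unique or primary key columns.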

Considerations for adjusting the number of rows to generate

  • The Rows to generate field will be disabled if the table doesn't support oversampling, which can be due to the following:

    • The table is using another table mode than Synthesize.

    • The table has another method than Generate as the applied key generator method.

  • If the Rows to generate value was previously changed and the table no longer supports oversampling, the value is reverted to the original row count.

  • Adjusting Rows to generate could cause conflicts with foreign key constraints in your destination database.

Pagination

A "Load More" button allows users to load additional data on-demand, preventing delays caused by loading all data at once.


Advanced Table settings

Expand Advanced settings under Table settings to view and adjust settings at the table level. Note that these settings apply only to columns that use AI-powered generation.

You can adjust the following advanced table settings:

  1. Maximum rows used for training: The maximum number of rows to be used for training. Using fewer rows can speed up the process, but may come at the cost of lower synthetic data utility.

  2. Take random sample:

    • On: takes a random sample of rows used for training. Note that choosing this option can cause a data generation job to run significantly longer, depending on the database.

    • Off (default): takes the top rows as defined in the database.

  3. Choose table model: The generative AI model that will be used for all columns with AI-powered generation applied.
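The difference between the two "Take random sample" options can be illustrated with an in-memory sketch. It is a simplification under stated assumptions: the function name `select_training_rows` and its parameters are hypothetical, and Syntho reads rows from the database rather than from a Python list.

```python
import random


def select_training_rows(rows, max_rows, take_random_sample=False, seed=None):
    """Pick at most max_rows rows to train on.

    take_random_sample=False mimics the default (top rows as returned by the database).
    take_random_sample=True mimics the random sample option.
    """
    rows = list(rows)
    if max_rows >= len(rows):
        # The table is small enough to use in full.
        return rows
    if take_random_sample:
        # Random sample without replacement over the whole table.
        return random.Random(seed).sample(rows, max_rows)
    # Default: the first max_rows rows in the order they were returned.
    return rows[:max_rows]


table = list(range(10))
print(select_training_rows(table, 3))  # [0, 1, 2]
print(select_training_rows(table, 3, take_random_sample=True, seed=1))
```

Taking the top rows is cheap but can be unrepresentative if the table is stored in a meaningful order (for example, oldest records first). A random sample avoids that bias but has to consider the whole table, which is consistent with the longer run times noted above.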

ORDER BY

Hive only

In the Table Settings panel, a dropdown field allows users to specify which columns should be used in the "ORDER BY" clause. This feature enables users to define a set of columns that ensure the uniqueness of the returned results for a given table. By selecting the appropriate columns, users can achieve deterministic ordering even in the absence of primary keys or indexes.

  • Order By Dropdown: Located in the Table Settings panel on the right side of the Table/Job Configuration screen, this dropdown lets users choose the columns for the "ORDER BY" clause.

Steps to Configure:

  1. Open the Table Settings panel in the Job Configuration screen.

  2. Scroll to find the "ORDER BY" dropdown.

  3. Select the desired columns from the dropdown to define the order.

Example Scenario:

  • If a table does not have a primary key or index, and the first column contains duplicates, the application may not order the data consistently. By using the new "ORDER BY" dropdown, users can select a combination of columns (e.g., ColumnA, ColumnB, ColumnC) that together provide a unique ordering for the table.
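To judge whether a combination of columns is a good candidate for the "ORDER BY" dropdown, you can check that it is unique across rows. The sketch below works on a small in-memory sample; the function name `is_unique_ordering` and the sample rows are hypothetical, and the column names follow the example above.

```python
def is_unique_ordering(rows, columns):
    """Return True if the chosen columns have a distinct value combination in every row."""
    keys = [tuple(row[col] for col in columns) for row in rows]
    return len(set(keys)) == len(keys)


# Hypothetical sample: ColumnA alone contains duplicates.
rows = [
    {"ColumnA": 1, "ColumnB": "x", "ColumnC": 10},
    {"ColumnA": 1, "ColumnB": "y", "ColumnC": 10},
    {"ColumnA": 2, "ColumnB": "x", "ColumnC": 20},
]

print(is_unique_ordering(rows, ["ColumnA"]))             # False
print(is_unique_ordering(rows, ["ColumnA", "ColumnB"]))  # True

# With a unique combination, sorting has exactly one possible result.
ordered = sorted(rows, key=lambda r: (r["ColumnA"], r["ColumnB"]))
```

When the combination is not unique, rows that tie on the selected columns can come back in any order, which is the inconsistency described in the scenario above.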

To improve the user experience when loading application screens and panels, Syntho has efficient data loading mechanisms. These aim to ensure smoother interaction, especially when the source database contains a significant amount of data.
