Configure subsetting

Coming Soon

Syntho's improved subsetting feature will be re-introduced in the platform soon.

On the Job configuration panel, click Switch to Subsetting to open the Subsetting panel.

View subsetting configuration

On the Subsetting panel, you can view the subsetting configuration in the table overview.

  • The Table name column shows all tables that are part of the source database.

  • The Include column indicates whether the table will be included in the subset.

  • The Row count column shows an estimate of how many rows will be generated for the subset.

You can quickly filter tables by selecting the dropdown icons next to the column headers.

By default, all tables are marked as Exclude, indicating they are excluded from the subsetting configuration.

You can adjust the table mode by selecting a row in the table overview.

Configure target table

Select Subsetting mode > Target to configure the table as the target table.

You can specify a percentage of the target table to include. This percentage is converted into a filter or WHERE clause depending on the database type.
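As a rough illustration of how a percentage might be turned into a row filter, the sketch below uses Python's built-in sqlite3 module and a modulo condition on an integer key. The table name `orders`, the column `id`, the helper `percentage_filter`, and the modulo approach itself are assumptions made for this example. The filter Syntho actually generates depends on the database type and is not shown here.

```python
import sqlite3


def percentage_filter(key_column: str, percentage: int) -> str:
    """Build a hypothetical WHERE condition that keeps roughly `percentage`% of rows."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Assumption: the key is an integer spread evenly across residues modulo 100.
    # Illustration only: the column name is not sanitized.
    return f"{key_column} % 100 < {percentage}"


# Hypothetical source table with 1,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 1001)],
)

where = percentage_filter("id", 10)
subset_count = conn.execute(f"SELECT COUNT(*) FROM orders WHERE {where}").fetchone()[0]
total_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(where)
print(subset_count, "of", total_count, "rows selected")
```

A filter of this kind only approximates the requested percentage when key values are unevenly distributed, which is one reason the Row count column is described as an estimate.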

It is possible that the target table in the subset contains more rows than the specified percentage in order to maintain referential integrity.
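One scenario in which this could happen, assumed here purely for illustration, is a target table with a self-referencing foreign key: rows picked by the percentage filter may reference other rows in the same table, and those rows have to be included as well. The `employees` data and the `close_over_references` helper below are hypothetical and do not describe Syntho's internal logic.

```python
from typing import Dict, Optional, Set

# Hypothetical employees table: id -> manager_id (None for the top-level row).
employees: Dict[int, Optional[int]] = {
    1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3, 8: 4, 9: 4, 10: 5,
}


def close_over_references(selected: Set[int], parent_of: Dict[int, Optional[int]]) -> Set[int]:
    """Add every row that a selected row references, directly or transitively."""
    result = set(selected)
    pending = list(selected)
    while pending:
        row = pending.pop()
        parent = parent_of[row]
        if parent is not None and parent not in result:
            result.add(parent)
            pending.append(parent)
    return result


requested = {8, 10}  # 20% of the 10 rows
subset = close_over_references(requested, employees)
print(sorted(subset))  # more rows than requested, so no manager reference dangles
```

In this example the requested 20% grows to six rows, because each selected employee pulls in the chain of managers above it.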

Note: To preserve referential integrity, at most one target table can be set. When you update the target table, all linked tables are updated accordingly.

Configure linked tables

Once a target table has been selected, all linked tables will automatically be marked as Linked in the subsetting configuration.

Select Subsetting mode > Target | Duplicate | Exclude to change the Subsetting mode for any linked table.

Select Subsetting mode > Linked to reconfigure a table as a Linked table. Note that this is only possible if a target table has been defined that is directly or indirectly related to the table.
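The idea of tables being "directly or indirectly related" to the target can be sketched as a traversal of the foreign key graph. The example below is a minimal sketch, not Syntho's implementation: the table names, the `assign_modes` helper, and the choice to follow foreign keys in both directions are assumptions.

```python
from collections import deque


def assign_modes(tables, foreign_keys, target):
    """Return a table -> mode mapping: Target, Linked (reachable via foreign keys), or Exclude."""
    # Build an undirected adjacency map from (child, parent) foreign key pairs.
    neighbours = {table: set() for table in tables}
    for child, parent in foreign_keys:
        neighbours[child].add(parent)
        neighbours[parent].add(child)

    # Every table starts as Exclude, mirroring the default described above.
    modes = {table: "Exclude" for table in tables}
    modes[target] = "Target"

    # Breadth-first search marks every table reachable from the target as Linked.
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for other in neighbours[current]:
            if modes[other] == "Exclude":
                modes[other] = "Linked"
                queue.append(other)
    return modes


# Hypothetical schema: order_items -> orders -> customers, plus an unrelated table.
tables = ["customers", "orders", "order_items", "audit_log"]
foreign_keys = [("orders", "customers"), ("order_items", "orders")]
modes = assign_modes(tables, foreign_keys, target="customers")
print(modes)
```

Here `order_items` is only indirectly related to `customers` (through `orders`) and is still marked Linked, while `audit_log` has no relationship to the target and stays on Exclude.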

Configure duplicated tables

Select Subsetting mode > Duplicate to copy the full table to the subset.

Configure excluded tables

By default, tables that are not defined as target tables, linked tables, or duplicated tables are automatically set to Exclude.

Launch subsetting job

When you select Generate from the Job configuration panel, Syntho performs several validation steps against the schema of the source database and the destination database.

Finally, select Start generating to launch your subsetting job.

Last updated 6 months ago

You cannot manually exclude a table from the subsetting configuration. If you want to exclude a table from a data generation job, you can do this using the appropriate table mode.