Use foreign key scanner

You can use the foreign key scanner to make it easier to discover virtual foreign keys.

  1. Select the Foreign keys tab on the Job configuration screen.

  2. Press the Scan button to launch a foreign key scan.

  3. Select any filters you want to apply to limit the number of foreign key candidates.

  4. Finally, select Confirm to launch your foreign key scan.

Once the foreign key scan is complete, you can view, confirm, or delete any foreign key candidates resulting from the scan.

Filter foreign key candidates

Before confirming a foreign key scan, you can enable or disable filters to limit the foreign key candidates (the columns that are considered as possible foreign keys) for the scan.

  • Only include exact column name matches: When enabled, only column pairs with the exact same column names are considered foreign key candidates. When disabled, column names are not used to limit the possible foreign key candidates.

  • Only include exact data type matches: When enabled, only column pairs with the exact same data types are considered foreign key candidates. When disabled, two columns must still have a compatible data type, but it does not need to be an exact match.

  • FK candidates must link to an existing primary key: When enabled, foreign key candidates must always link to an existing primary key column. When disabled, foreign key candidates can also be columns that are not defined as primary keys in the database, but can be identified as primary keys based on Syntho logic, considering the cardinality of the columns.

    • Include string values as primary key candidates: When enabled, columns with a data type that matches a String type (e.g. VARCHAR and TEXT) are included as possible candidates. When disabled, String type columns are excluded as possible foreign key candidates.

  • Apply bi-directional data validation: When enabled, a column pair is considered only if the values in column A also appear in column B and the values in column B also appear in column A. When disabled, one-way validation applies: it is enough that the values from column A exist in column B, or vice versa.

Applying filters reduces the list of foreign key candidates, but the suggestions that remain are more likely to be correct.
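
To make the effect of the filters above more concrete, the following Python sketch shows one way such checks could be combined for a single column pair. It is an illustration only and not Syntho's implementation: the `Column` class, the `is_candidate` function, the `COMPATIBLE_TYPES` and `STRING_TYPES` sets, and all parameter names are hypothetical. It also assumes that the string filter applies only when primary keys are inferred from cardinality, which the description above suggests but does not state outright.

```python
from dataclasses import dataclass


@dataclass
class Column:
    table: str
    name: str
    dtype: str
    values: list
    is_primary_key: bool = False


# Hypothetical type groups; the scanner's real compatibility rules are not documented here.
COMPATIBLE_TYPES = [{"INT", "BIGINT", "SMALLINT"}, {"VARCHAR", "TEXT"}]
STRING_TYPES = {"VARCHAR", "TEXT"}


def types_compatible(a: str, b: str) -> bool:
    return a == b or any(a in group and b in group for group in COMPATIBLE_TYPES)


def looks_like_key(col: Column) -> bool:
    """Cardinality heuristic (assumed): non-empty, no nulls, all values unique."""
    vals = col.values
    return bool(vals) and None not in vals and len(set(vals)) == len(vals)


def is_candidate(child: Column, parent: Column, *, exact_names=True, exact_types=True,
                 require_existing_pk=True, include_strings=False,
                 bidirectional=False) -> bool:
    """Return True if child -> parent passes every enabled filter."""
    if child.table == parent.table:
        return False
    if exact_names and child.name != parent.name:
        return False
    if exact_types:
        if child.dtype != parent.dtype:
            return False
    elif not types_compatible(child.dtype, parent.dtype):
        return False
    if require_existing_pk:
        if not parent.is_primary_key:
            return False
    else:
        # Inferred keys: string columns only count when explicitly included.
        if parent.dtype in STRING_TYPES and not include_strings:
            return False
        if not (parent.is_primary_key or looks_like_key(parent)):
            return False
    child_vals = {v for v in child.values if v is not None}
    parent_vals = {v for v in parent.values if v is not None}
    if not child_vals:
        return False
    if bidirectional:
        return child_vals == parent_vals  # values must match in both directions
    return child_vals <= parent_vals      # one-way: child values exist in parent


customers = Column("customers", "customer_id", "INT", [1, 2, 3], is_primary_key=True)
orders = Column("orders", "customer_id", "INT", [1, 1, 3])
orders_alt = Column("orders", "cust_ref", "BIGINT", [1, 3])

print(is_candidate(orders, customers))                      # strict defaults
print(is_candidate(orders, customers, bidirectional=True))  # customer 2 has no order
print(is_candidate(orders_alt, customers, exact_names=False, exact_types=False))
```

In this sketch, `orders.customer_id` passes the default filters, fails once bi-directional validation is required (customer 2 never appears in `orders`), and `orders.cust_ref` is only found after the name and type filters are relaxed. This mirrors the trade-off described above: stricter filters yield fewer candidates.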

Limitations of the foreign key scanner

It is important to consider several characteristics of the foreign key scanner:

  • Performance on Large Databases: Although the foreign key scanner is designed to operate in parallel for efficiency, scanning databases with hundreds of millions of rows will still take a considerable amount of time.

  • No Support for Composite Foreign Keys: The scanner doesn't take into account composite foreign keys; it only considers individual columns.

  • Assumptions on Database Structure: The scanner operates based on certain assumptions about the database, such as descriptive column names and correctly defined data types. If your database doesn't follow standard design best practices, you should disable the strict matching criteria for both column names and data types to allow the identification of more potential foreign key relationships.

  • Indeterminate Foreign Key Direction: Sometimes, Syntho may not be able to determine the direction of a foreign key relationship (i.e., whether column A points to column B or vice versa). In such cases, both options will appear in the foreign key list, and manual review is strongly advised to validate the results.

Understanding these limitations will help you use the foreign key scanner more effectively and be aware of its constraints.
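
The indeterminate-direction case can be illustrated with a small Python sketch. It assumes a simplified rule in which column A may reference column B when A's non-null values all occur in B and B's values are unique. The `references` and `possible_directions` helpers are hypothetical and do not describe Syntho's actual logic.

```python
def references(child_values, parent_values) -> bool:
    """Assumed rule: child values are contained in parent, and parent is unique."""
    child = {v for v in child_values if v is not None}
    parent = [v for v in parent_values if v is not None]
    parent_is_unique = len(set(parent)) == len(parent)
    return bool(child) and parent_is_unique and child <= set(parent)


def possible_directions(a_values, b_values) -> list:
    """List every direction that the data alone cannot rule out."""
    directions = []
    if references(a_values, b_values):
        directions.append("A -> B")
    if references(b_values, a_values):
        directions.append("B -> A")
    return directions


# Both columns are unique and hold the same values: the data fits either direction.
print(possible_directions([1, 2, 3], [1, 2, 3]))  # ['A -> B', 'B -> A']

# Column A repeats values, so only A -> B remains plausible.
print(possible_directions([1, 1, 2], [1, 2, 3]))  # ['A -> B']
```

When both directions survive, as in the first call, the data alone cannot settle the question, which is why manual review of such pairs is advised.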


Last updated 12 months ago