Prerequisites


Last updated 3 months ago


This page provides a checklist of requirements to help you prepare for data generation jobs. Completing these steps ensures that the Syntho application is correctly configured, accessible, and ready for data generation and management. Follow each step to confirm that your setup meets all prerequisites.

1. Deployment and Initial Setup

  • Application Deployment: Ensure the application is deployed successfully and the UI is accessible. Verify that the first admin user can log in. See the Deployment Guide for more information.

  • User Accounts and Access Management:

    • Ensure that admin and user accounts are set up.

    • Ensure that credentials (username/password) are distributed securely, adhering to your internal policies. See the User Management Guide for more information.

2. Database Access Configuration

  • Source Database: Confirm that day-to-day users have read-only access to the source database. They should not need write or other permissions.

  • Destination Database: Ensure users have access to the destination database, including the ability to perform operations such as table truncation, which is necessary for multiple data generation runs.
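
As an illustration of the access split described above, the following sketch builds the GRANT statements a database administrator might run. It assumes PostgreSQL syntax. The role names (`syntho_reader`, `syntho_writer`), the `public` schema, and the exact destination privilege set (SELECT, INSERT, TRUNCATE) are assumptions made for this example, not requirements stated by Syntho. Other database systems use different syntax.

```python
from __future__ import annotations


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(schema: str, role: str, privileges: list[str]) -> list[str]:
    """Build GRANT statements giving `role` the listed table privileges in `schema`."""
    privs = ", ".join(privileges)
    return [
        f"GRANT USAGE ON SCHEMA {quote_ident(schema)} TO {quote_ident(role)};",
        f"GRANT {privs} ON ALL TABLES IN SCHEMA {quote_ident(schema)} TO {quote_ident(role)};",
    ]


# Hypothetical role and schema names; adjust to your environment.
# Source: read-only access.
SOURCE_GRANTS = grant_statements("public", "syntho_reader", ["SELECT"])
# Destination: assumed privilege set that includes truncation for repeated runs.
DEST_GRANTS = grant_statements("public", "syntho_writer", ["SELECT", "INSERT", "TRUNCATE"])

if __name__ == "__main__":
    for stmt in SOURCE_GRANTS + DEST_GRANTS:
        print(stmt)
```

Review the generated statements with your database administrator before running them, since the privileges your deployment needs may differ.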

3. Workspace Configuration

  • Workspace Creation: Create a workspace following the Workspace Setup Guide.

  • Database Connection Test: Verify that the source database is available and accessible within the Syntho application by performing a connection test:

    • In the Syntho application, navigate to Workspace Settings and select Database Connections.

    • Input the connection details for both source and destination databases and select Test Connection.

    • Ensure a successful connection is indicated by a green checkmark, confirming access to both databases.
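
If the connection test in the application fails, it can help to first rule out basic network problems. The sketch below is not Syntho's connection test. It only checks whether a TCP connection to a database host and port can be opened from the machine where it runs. The host names and ports in the usage comment are hypothetical.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage with hypothetical connection details:
#
#   is_reachable("source-db.example.internal", 5432)
#   is_reachable("dest-db.example.internal", 5432)
```

A `True` result only shows that the port is open. Credentials and permissions are still verified by the Test Connection step in the application.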

4. Data Alignment and Preparation

  • Data Types Alignment: Ensure data type consistency between source and destination databases. Data types should be appropriately set:

    • Date columns as Date

    • Integer columns as Integer

    • Decimal columns as Decimal/Float

  • Schema Consistency: The destination database must contain the same tables and columns as the source database. Its tables should be empty, and write access must be enabled.

  • Data Integrity: If the source database changes during a data generation run, referential integrity errors can occur. To prevent this, create a dump of the production database before starting the de-identification process and use that static copy as the source. This ensures consistency and reduces the risk of errors during the run.
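
The data type and schema checks above can be automated before a run. The following is a minimal sketch that uses Python's built-in `sqlite3` module so that it is self-contained. For the database systems Syntho connects to, you would query that system's own catalog (for example `information_schema.columns`) instead. The function names `table_schema` and `check_destination` and the `customers` table are hypothetical.

```python
from __future__ import annotations

import sqlite3


def table_schema(conn: sqlite3.Connection) -> dict[str, dict[str, str]]:
    """Map each user table to {column name: declared type, upper-cased}."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {col[1]: col[2].upper() for col in cols}
    return schema


def check_destination(source: sqlite3.Connection, dest: sqlite3.Connection) -> list[str]:
    """List problems: missing tables or columns, type mismatches, non-empty tables."""
    problems = []
    src_schema, dst_schema = table_schema(source), table_schema(dest)
    for table, src_cols in src_schema.items():
        if table not in dst_schema:
            problems.append(f"missing table: {table}")
            continue
        dst_cols = dst_schema[table]
        for col, col_type in src_cols.items():
            if col not in dst_cols:
                problems.append(f"missing column: {table}.{col}")
            elif dst_cols[col] != col_type:
                problems.append(
                    f"type mismatch: {table}.{col} ({col_type} vs {dst_cols[col]})"
                )
        row_count = dest.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        if row_count:
            problems.append(f"destination table not empty: {table} ({row_count} rows)")
    return problems


if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    dest = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER, signup_date DATE, balance DECIMAL)")
    # The destination declares signup_date as TEXT, which the check reports.
    dest.execute("CREATE TABLE customers (id INTEGER, signup_date TEXT, balance DECIMAL)")
    for problem in check_destination(source, dest):
        print(problem)
```

An empty result means that every source table and column exists in the destination with the same declared type and that the destination tables hold no rows. The sketch does not report extra tables or columns that exist only in the destination.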
