Hive

Syntho Beta feature

Important

This connector can only be used as a source database. The generated data can be written to Azure Data Lake Storage (ADLS) or Amazon Simple Storage Service (S3) as Parquet files.

Before you begin

Before you begin, gather this connection information:

  • Name of the server that hosts the database you want to connect to, and the port number

  • User name and password

  • Whether you are connecting to an SSL-enabled server

Connect and set up the workspace

Launch Syntho and select Connect to a database or Create workspace. Under The connection details, select Hive from the Type drop-down; clicking Type also shows the complete list of supported data connections. Then do the following:

  1. Enter the name of the server that hosts the database and the port number to use.

  2. Optionally, enter the schema name.

  3. Enter user name and password.

    Select the Require SSL check box when connecting to an SSL server.

  4. Select Create Workspace.

    If Syntho can't make the connection, verify that your credentials are correct. If you still can't connect, your computer may be unable to locate the server; contact your network administrator or database administrator. You can also test the connection details outside Syntho, as shown in the sketch below.
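
The following is a minimal, hypothetical sketch (not part of Syntho) that uses the open-source PyHive library to verify the connection details gathered above. The host, port, credentials, and schema name are placeholders.

```python
# Minimal HiveServer2 connectivity check using PyHive (pip install "pyhive[hive]").
# All values are placeholders; replace them with your own connection details.
from pyhive import hive

conn = hive.Connection(
    host="hive.example.com",   # server that hosts the database
    port=10000,                # default HiveServer2 port
    username="syntho_user",
    password="secret",
    database="default",        # optional schema name
    auth="LDAP",               # PyHive requires LDAP/CUSTOM auth when a password is set
)

cursor = conn.cursor()
cursor.execute("SHOW TABLES")  # any lightweight query proves the connection works
print(cursor.fetchall())

cursor.close()
conn.close()
```

If this sketch connects but Syntho does not, the issue is more likely in the workspace configuration than in the network or credentials. Note that the sketch assumes a non-SSL server; SSL-enabled servers need additional transport configuration.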

Considerations: Handling Hive database partitioning

Hive only

In Hive databases, source tables are often partitioned based on three columns that are treated as index columns. These columns are used for ordering in queries, but they do not always form a unique composite key.

To address this, use the partitioning columns together with the additional columns specified through the "ORDER BY" dropdown. Combining the partitioning logic with these user-defined columns ensures a unique and consistent ordering. For more information, see Configure table settings.
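
As a hypothetical illustration, the sketch below (again using PyHive, with placeholder connection details, table name, and column names) checks whether the partitioning columns plus the extra ORDER BY columns identify each row uniquely.

```python
# Sketch: verify that the partitioning columns combined with the additional
# ORDER BY columns form a unique, consistent ordering for a source table.
# Connection details, table name, and column names are placeholders.
from pyhive import hive

conn = hive.Connection(host="hive.example.com", port=10000,
                       username="syntho_user", password="secret",
                       database="default", auth="LDAP")
cursor = conn.cursor()

partition_cols = ["year", "month", "day"]          # partitioning (index) columns
extra_order_cols = ["customer_id", "event_time"]   # columns added via the ORDER BY dropdown
order_cols = ", ".join(partition_cols + extra_order_cols)

# If this query returns no rows, the combined columns identify each row uniquely.
cursor.execute(f"""
    SELECT {order_cols}, COUNT(*)
    FROM events
    GROUP BY {order_cols}
    HAVING COUNT(*) > 1
    LIMIT 10
""")
print(cursor.fetchall())

cursor.close()
conn.close()
```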

Supported data types

| Data Type | AI-powered Generation | Mockers | Mask | Calculated Columns |
| --- | --- | --- | --- | --- |
| TINYINT | True | True | True | True |
| SMALLINT | True | True | True | True |
| INT | True | True | True | True |
| BIGINT | True | True | True | True |
| FLOAT | True | True | True | True |
| DOUBLE | True | True | True | True |
| DECIMAL | True | True | True | True |
| TIMESTAMP | True | True | True | False |
| DATE | True | True | True | False |
| STRING | True | True | True | True |
| VARCHAR | True | True | True | True |
| CHAR | True | True | True | True |
| BOOLEAN | True | False | False | False |
| BINARY | False | False | False | False |
| ARRAY | True | True | True | True |
| MAP | True | True | True | True |
| STRUCT | True | True | True | True |

*Some data types are not actively supported; however, certain generators, such as AI synthesize, mask, mockers, or calculated columns, may still show 'True' for these types. This means the generator can be applied even though the type is not actively supported. Duplication is fully supported for these data types.
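
To see which of these types appear in a given source table, you can describe the table outside Syntho. The sketch below is a hypothetical example using PyHive; the connection details and table name are placeholders.

```python
# Sketch: list column names and Hive data types for a table so they can be
# cross-checked against the supported data types above. Values are placeholders.
from pyhive import hive

conn = hive.Connection(host="hive.example.com", port=10000,
                       username="syntho_user", password="secret",
                       database="default", auth="LDAP")
cursor = conn.cursor()
cursor.execute("DESCRIBE events")

for name, col_type, _comment in cursor.fetchall():
    # Skip the section rows that Hive adds for partitioned tables.
    if name and not name.startswith("#"):
        print(f"{name.strip()}: {col_type}")

cursor.close()
conn.close()
```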

