Numeric (integer)

Last updated 28 days ago

Below is a list of available numeric (integer) mask functions.

Hasher

The Hasher function uses the Hasty Pudding Cipher algorithm to create a one-to-one mapping between input and hashed values, ensuring consistent anonymization. It preserves the sign of numbers through an internal encoding mechanism: negative values always hash to negative outputs and positive values to positive outputs. The transformation is stable, deterministic, and repeatable, so equal inputs always produce equal outputs, which makes it well suited to anonymization that must preserve relationships between values. To ensure accurate ordering, please see ordering and indexing considerations.

Parameters

  • No parameters.

Note: The default fallback range aligns with the 32-bit signed integer limits (-2,147,483,648 to 2,147,483,647), though the actual range depends on what the database supports. The value 0 is never hashed.

Example

If you configure:

Column names:
2002,
1944,
2002,
...

The results will be:

Column names:
1962697134,
943111608,
1962697134,
...
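
The properties described for Hasher (one-to-one, deterministic, sign-preserving, 0 left alone) can be illustrated with a small sketch. This is not Syntho's implementation and it does not use the Hasty Pudding Cipher: it substitutes a toy Feistel network with cycle walking as a stand-in keyed permutation. The names `mask_integer`, `MAX_MAGNITUDE`, and the key `b"demo-key"` are hypothetical, and the magnitude limit assumes the 32-bit fallback range from the note above.

```python
import hashlib

MAX_MAGNITUDE = 2**31 - 1  # assumed largest magnitude (32-bit fallback range)
ROUNDS = 4                 # number of Feistel rounds in this toy permutation


def _round_value(half: int, round_no: int, key: bytes) -> int:
    """Keyed round function: maps a 16-bit half to a 16-bit value."""
    data = key + bytes([round_no]) + half.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")


def _feistel32(value: int, key: bytes) -> int:
    """Balanced Feistel network: a permutation of the 32-bit integers."""
    left, right = value >> 16, value & 0xFFFF
    for round_no in range(ROUNDS):
        left, right = right, left ^ _round_value(right, round_no, key)
    return (left << 16) | right


def mask_integer(value: int, key: bytes = b"demo-key") -> int:
    """Map an integer to another integer with the same sign, one-to-one."""
    if value == 0:
        return 0  # 0 is passed through, mirroring "0 is never hashed"
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    if magnitude > MAX_MAGNITUDE:
        raise ValueError("value outside the assumed 32-bit range")
    # Cycle walking: re-apply the permutation until the result falls back
    # inside 1..MAX_MAGNITUDE, which keeps the mapping one-to-one there.
    masked = _feistel32(magnitude, key)
    while not 1 <= masked <= MAX_MAGNITUDE:
        masked = _feistel32(masked, key)
    return sign * masked


if __name__ == "__main__":
    for original in (2002, 1944, 2002, -2002, 0):
        print(original, "->", mask_integer(original))
```

Because the mapping is a keyed permutation rather than a lossy hash, repeated inputs such as 2002 map to the same output every time and two different inputs never collide. The outputs of this sketch will not match the values shown in the example above.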

Numeric Noise

Adds noise to numeric data based on a uniform distribution, ensuring that the values are randomized while preserving the overall structure of the dataset. This is useful for anonymizing numerical fields where consistency and distribution must be maintained.

Parameters

  • Maximum negative noise: The largest amount by which a value can be decreased, relative to the original value.

    • Example: If the noise type is set to "Absolute" and the maximum negative noise is set to 3, the result will not be lower than the original value minus 3.

  • Maximum positive noise: The largest amount by which a value can be increased, relative to the original value.

    • Example: If the noise type is set to "Absolute" and the maximum positive noise is set to 3, the result will not be higher than the original value plus 3.

  • Noise type: The way the noise is applied to the value (Additive, Multiplicative, Absolute).

    • Additive: If the data has a value of x, random noise is added within a percentage range of x's absolute value, for example -10% to +10%.

    • Multiplicative: The value is multiplied by a random factor within the configured range, for example -5 to 5.

    • Absolute: Random noise is added directly to the value within the configured range, for example -5 to 5.

    • The selected noise type applies to both the maximum negative noise and maximum positive noise fields.

Example

If you configure:

Maximum negative noise: 3
Maximum positive noise: 3
Noise type: Absolute

2173,
2090,
2227,
...

The results will be:

2170,
2088,
2227,
...
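
A minimal sketch of how uniform noise could be applied under each noise type is shown below. It is an interpretation of the parameter descriptions above, not Syntho's code: the function name `add_numeric_noise`, the rounding to integers, and the reading of the Additive bounds as percentages are all assumptions.

```python
import random


def add_numeric_noise(value, max_negative, max_positive, noise_type, rng=None):
    """Return `value` with uniform noise applied, rounded to an integer.

    max_negative and max_positive are given as non-negative magnitudes,
    as in the example above (3 and 3).
    """
    if rng is None:
        rng = random.Random()
    # Draw one number uniformly between the two configured bounds.
    draw = rng.uniform(-max_negative, max_positive)
    if noise_type == "Absolute":
        # Assumption: the draw is added directly to the value.
        return round(value + draw)
    if noise_type == "Additive":
        # Assumption: the draw is a percentage of the value's absolute value.
        return round(value + abs(value) * draw / 100)
    if noise_type == "Multiplicative":
        # Assumption: the draw is used as a multiplication factor.
        return round(value * draw)
    raise ValueError(f"unknown noise type: {noise_type}")


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the demo is repeatable
    for original in (2173, 2090, 2227):
        print(original, "->", add_numeric_noise(original, 3, 3, "Absolute", rng))
```

With the Absolute type and both bounds set to 3, every result stays within 3 of the original value, which matches the pattern in the example (2173 becomes 2170, 2227 stays 2227).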

Random Character Swap

The Random Character Swap function replaces individual characters in categorical values while leaving non-alphanumeric characters (e.g., punctuation, spaces, and symbols) untouched. Characters are swapped within their respective categories (letters with letters, digits with digits), so the original format and structure of each field are preserved and the field remains usable.

Parameters

Example

If you configure:

Column names:
Mavis612,
Frank378,
Tijuana228,
...

The results will be:

Column names:
Eiqxj928,
Wawak904,
Rqrsuzb283,
...
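
The category-preserving swap described above can be sketched as follows. This is an illustrative assumption of the behavior, not Syntho's implementation; the function name `random_character_swap` is hypothetical, and preserving upper and lower case is inferred from the example output.

```python
import random
import string


def random_character_swap(text, rng=None):
    """Replace letters with random letters and digits with random digits.

    Everything else (punctuation, spaces, symbols) is copied unchanged,
    so the overall format of the value is preserved.
    """
    if rng is None:
        rng = random.Random()
    swapped = []
    for char in text:
        if char in string.digits:
            swapped.append(rng.choice(string.digits))
        elif char in string.ascii_uppercase:
            swapped.append(rng.choice(string.ascii_uppercase))
        elif char in string.ascii_lowercase:
            swapped.append(rng.choice(string.ascii_lowercase))
        else:
            swapped.append(char)  # keep punctuation, spaces, and symbols
    return "".join(swapped)


if __name__ == "__main__":
    for original in ("Mavis612", "Frank378", "Tijuana228"):
        print(original, "->", random_character_swap(original))
```

Each output has the same length and the same letter and digit positions as its input. On an integer column only the digit branch applies, so digits are replaced by other digits.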

Consistent mapping: Numeric Noise supports consistent mapping.

Consistent mapping: Random Character Swap supports consistent mapping.