(Temporary) data storage by application

The Syntho platform processes data within the customer's own secure infrastructure. Below is an overview of how the temporary files it creates are handled.

Temporary Files in Use

  1. Parquet Files:

    • Purpose: Used as staging data prior to writing the processed data to the destination.

    • Contents: Contains generated data during intermediate stages of processing.

    • Lifecycle:

      • These files are created temporarily and are designed to be deleted when the processing job completes successfully, is cancelled, or fails.

      • In case of an unexpected application failure, these files may remain in the internal storage until the application is restarted. After the restart, they are removed automatically.

    • Security Controls:

      • Access Controls: The internal storage location is secured with restricted access controls to protect the parquet files during their temporary existence.

      • Cleanup Mechanisms: Application-level watchdogs and cleanup routines exist to mitigate the risk of files persisting unnecessarily.

  2. Engine JSON Files:

    • Purpose: Generated to encapsulate job configuration metadata before submission to the Ray cluster for distributed processing.

    • Contents: Contains only non-sensitive metadata (e.g., configuration details and application runtime information).

    • Lifecycle:

      • These files are typically ephemeral and removed once the job is submitted.

      • If the application crashes at a certain point during processing, the engine JSON file, which contains only public application data, may persist.

    • Security Consideration:

      • As the file contains only public, non-sensitive data, it poses no security risk if retained within the customer's secure infrastructure.
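
The staging-file lifecycle described above (removal on completion, cancellation, or failure, plus automatic cleanup after a restart) can be illustrated with a minimal Python sketch. This is not Syntho's actual implementation: the names STAGING_PREFIX, staging_dir, and sweep_stale_staging are hypothetical, and the sketch assumes each job writes its staging files into its own directory under a shared root.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Hypothetical prefix used to recognise staging directories (an assumption,
# not a name taken from the Syntho platform).
STAGING_PREFIX = "staging-job-"


@contextmanager
def staging_dir(root: Path):
    """Create a per-job staging directory and remove it when the job ends."""
    root.mkdir(parents=True, exist_ok=True)
    path = Path(tempfile.mkdtemp(prefix=STAGING_PREFIX, dir=root))
    try:
        yield path
    finally:
        # Runs when the block exits normally and when it exits through an
        # exception (a failure, or a cancellation raised as an exception).
        shutil.rmtree(path, ignore_errors=True)


def sweep_stale_staging(root: Path) -> int:
    """Remove staging directories left behind by a crash; return the count."""
    removed = 0
    if not root.is_dir():
        return removed
    for entry in root.iterdir():
        if entry.is_dir() and entry.name.startswith(STAGING_PREFIX):
            shutil.rmtree(entry, ignore_errors=True)
            removed += 1
    return removed
```

In this sketch, a job would wrap its staging writes in `with staging_dir(root) as d:`, and the application would call `sweep_stale_staging(root)` once at startup, before accepting new jobs. The `finally` clause cannot run if the process is killed outright, which is why the startup sweep is needed as a second line of defence.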
