Sample datasets

To give you practical examples for testing and analytics, we have selected sample datasets from well-known repositories. For testing, a multi-table dataset (COVID-19) is available; for analytics, a single-table dataset (Census); and for sequence-based modeling and evaluation, a two-table sequence dataset (Baseball). Use them as a starting point for exploring Syntho's features and capabilities:

Census dataset

A screenshot from the census dataset

Click the link below to download the .csv file.

Census dataset

COVID-19 dataset

  • Use Case: Useful for testing synthetic data generation on multi-table healthcare-related datasets.

  • Description: Includes tables such as patients, conditions, and encounters, simulated for COVID-19 scenarios.

A screenshot from the patients table

Click the link below to download a .zip file containing 10k COVID-19 patient records in CSV format. To download the version with 100k patient records, click here.

Covid datasets with 10k records
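
Once the archive is downloaded, it can help to inspect the tables before uploading them anywhere. The following is a minimal sketch using only the Python standard library, assuming the .zip contains one CSV file per table. The function name `load_tables` and any file names you pass to it are illustrative and not part of Syntho.

```python
import csv
import io
import zipfile


def load_tables(zip_path):
    """Read every .csv member of a zip archive into a list of row dicts.

    Assumption: each CSV file in the archive is one table with a header row.
    """
    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip folders and non-CSV members such as README files.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = list(csv.DictReader(text))
            # Key each table by its file name without folders or extension.
            table_name = name.rsplit("/", 1)[-1][:-4]
            tables[table_name] = rows
    return tables
```

For example, calling `load_tables` with the path of the downloaded archive returns a dictionary keyed by table name, so you can check row counts and column names for each table before generating synthetic data.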

Baseball dataset

  • Use Case: Suitable for analytics and sequence-based data generation.

  • Description: Features player statistics and seasonal performance data.

A screenshot from the players table
A screenshot from the seasons table

Click the link below to download the .zip file.

Baseball dataset
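
To see what a two-table sequence dataset looks like in practice, the sketch below groups season rows under their player and orders them in time. The column names `player_id` and `year` are assumptions chosen for illustration; check the actual headers in the downloaded files and pass the matching names.

```python
from collections import defaultdict


def build_sequences(players, seasons, key="player_id", order_by="year"):
    """Attach each player's season rows, sorted by `order_by`, to the player row.

    `players` and `seasons` are lists of row dicts (for example, as produced
    by csv.DictReader). `key` and `order_by` are hypothetical column names.
    """
    # Collect season rows per player key.
    by_player = defaultdict(list)
    for row in seasons:
        by_player[row[key]].append(row)

    sequences = {}
    for player in players:
        pid = player[key]
        # Sort numerically so the sequence runs from earliest to latest season.
        ordered = sorted(by_player.get(pid, []), key=lambda r: int(r[order_by]))
        sequences[pid] = {"player": player, "seasons": ordered}
    return sequences
```

Each entry in the result pairs one row from the players table with its ordered list of rows from the seasons table, which is the parent and child structure that sequence-based generation works with.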
