Importing Data into Databricks
Once your synthetic data is written as Parquet files to a storage location (Local Filesystem, Azure Data Lake Storage (ADLS), or Amazon S3), follow these steps to import it back into Databricks:
Access Databricks workspace: Go to your Databricks workspace and navigate to the Data tab.
Select your data source:
For ADLS, select Azure Data Lake.
For Amazon S3, select Amazon S3.
If using the Local Filesystem, upload your files to a cloud storage service like ADLS or S3 first.
Mount the storage: Mount your cloud storage (ADLS or S3) to Databricks by following the Databricks mounting instructions for ADLS or for S3.
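As a minimal sketch, mounting an S3 bucket from a notebook might look like the following. The bucket name, mount point, and secret scope/key names are placeholders for your environment; ADLS mounts instead pass OAuth settings via the `extra_configs` argument.

```python
import urllib.parse

# Placeholder secret scope and key names -- substitute your own.
# dbutils is available by default in Databricks notebooks.
access_key = dbutils.secrets.get(scope="aws", key="access-key")
secret_key = dbutils.secrets.get(scope="aws", key="secret-key")

# The secret key must be URL-encoded before it is embedded in the URI.
encoded_secret = urllib.parse.quote(secret_key, safe="")

# Placeholder bucket name and mount point.
dbutils.fs.mount(
    source=f"s3a://{access_key}:{encoded_secret}@syntho-output",
    mount_point="/mnt/syntho",
)

# Verify that the mounted files are visible.
display(dbutils.fs.ls("/mnt/syntho"))
```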
Read the Parquet files: Use the Databricks Data tab or a notebook to load the Parquet files into a DataFrame. For details, see the Databricks documentation on reading Parquet files.
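For instance, in a notebook, reading from the mount point sketched above might look like this (the `/mnt/syntho/customers` path is a placeholder):

```python
# Load the Parquet files from the mounted storage into a Spark DataFrame.
df = spark.read.parquet("/mnt/syntho/customers")

# Inspect the inferred schema and a sample of rows.
df.printSchema()
display(df.limit(10))
```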
Create or register a table: Use Databricks SQL commands or the user interface to create a temporary or permanent table from the loaded data. Parquet files generated by Syntho can be registered with standard Databricks SQL, for example (the table name and path below are illustrative):
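```sql
-- Placeholder table name and location; point LOCATION at your
-- mounted Parquet files.
CREATE TABLE IF NOT EXISTS customers_synthetic
USING PARQUET
LOCATION '/mnt/syntho/customers';
```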
This command creates a table over the Parquet files at the specified location. Refer to the Databricks documentation for more information on managing tables.