# Azure Data Lake Storage (ADLS)

<figure><img src="https://content.gitbook.com/content/U61B9DqtWCNO3Z30vnjh/blobs/pyO1QfwNkgVLRMAJnkwQ/AzureDataLakeStorage(ADLS).png" alt=""><figcaption><p>Source and Destination Databases</p></figcaption></figure>

{% hint style="info" %}
**Destination only**

This connector can only be used as a destination for writing your generated data.

* Supported File Types: Parquet and ORC
* Supported Partitioning: Horizontal partitioning based on the write batch size (i.e., each batch is written to a separate file).
  {% endhint %}

## Before you begin <a href="#before-you-begin" id="before-you-begin"></a>

Prepare the following before you connect:

* Get the URL for your **Azure** storage endpoint.
* Depending on how you want to connect, prepare either:
  * the storage account name and storage account key, *or*
  * the connection string.
* Grant read/write permissions on the storage container that you will use, and note the storage container name.
* Get the remote path, which is the path relative to that storage container.
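
If you connect with a connection string, it can help to confirm that it contains the account name and key you expect. The sketch below is illustrative only: it splits a connection string in the standard Azure Storage `Key=Value;` layout and derives a Data Lake (`dfs`) endpoint URL from it. The account name and key shown are made-up placeholders, and whether Syntho expects the `dfs` or the `blob` endpoint is an assumption to verify for your setup.

```python
def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split an Azure Storage connection string into its key/value settings."""
    settings = {}
    for part in conn_str.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: account keys are base64 and may end in "=".
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"Malformed segment: {part!r}")
        settings[key.strip()] = value.strip()
    return settings


def dfs_endpoint(settings: dict[str, str]) -> str:
    """Build the Data Lake endpoint URL (assumes the dfs endpoint is wanted)."""
    protocol = settings.get("DefaultEndpointsProtocol", "https")
    suffix = settings.get("EndpointSuffix", "core.windows.net")
    return f"{protocol}://{settings['AccountName']}.dfs.{suffix}"


# Placeholder values for illustration; not real credentials.
example = (
    "DefaultEndpointsProtocol=https;AccountName=examplestorage;"
    "AccountKey=ZmFrZS1rZXk=;EndpointSuffix=core.windows.net"
)
settings = parse_connection_string(example)
print(settings["AccountName"])
print(dfs_endpoint(settings))
```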

## File formats

The connector supports the following file formats:

* Parquet
* ORC

## Output format

Syntho's ADLS output connector writes all generated data to **Parquet** files as follows:

* Each generated table is written to Parquet files named in the following format:\
  `{schema_name}-{table_name}_part_{part_number}.parquet`
* The number of rows in a single Parquet file (part) is defined by the `batch_generate` size.
* All Parquet parts of a single table are stored in a directory dedicated to that table. Each directory name uses the following format:\
  `{schema_name}.{table_name}`
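
As a rough illustration of the naming rules above, the following sketch lists the file paths you might expect for one table. Several details are assumptions not confirmed by this page: that part numbers start at 0 and are not zero-padded, that the last part holds the remaining rows, and that the directory and file name are joined with `/`. The function name and the example schema and table are hypothetical.

```python
import math


def expected_output_files(schema_name: str, table_name: str,
                          total_rows: int, batch_size: int,
                          first_part: int = 0) -> list[str]:
    """List the Parquet part paths for one table under the stated assumptions."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # One directory per table: {schema_name}.{table_name}
    folder = f"{schema_name}.{table_name}"
    # One file per batch; the final batch may contain fewer rows (assumption).
    part_count = math.ceil(total_rows / batch_size)
    return [
        f"{folder}/{schema_name}-{table_name}_part_{n}.parquet"
        for n in range(first_part, first_part + part_count)
    ]


# 2,500 rows with a batch size of 1,000 would give three parts.
files = expected_output_files("public", "customers", 2500, 1000)
for path in files:
    print(path)
```

Check the actual part numbering in your own container before relying on these paths in downstream jobs.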

## Connect and set up the workspace

Launch Syntho and select **Connect to a database**, or under **Create workspace > Destination Database**, select **ADLS**. For a complete list of data connections, select **More** under **From database**. Then do the following:

1. Enter the remote path.
2. Enter the storage container name.
3. Either:
   * Enter the storage account name and the storage account key, *or:*
   * Enter the storage connection string.
4. Select **Create Workspace**.\
   If Syntho can't make the connection, verify that your credentials are correct. If you still can't connect, your computer may be having trouble reaching the storage endpoint. Contact your network administrator or storage administrator.
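
Before you troubleshoot the network, it can be worth checking that the credential fields were filled in consistently. The sketch below models the fields from the steps above in a hypothetical `AdlsWorkspaceSettings` class (not part of Syntho) and reports which of the two credential options is in use. Rejecting a form that fills in both options is a strict reading of "either/or" and is an assumption; the remote path is not validated because this page does not say whether it may be empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdlsWorkspaceSettings:
    """Hypothetical container for the values entered in the steps above."""
    remote_path: str
    container_name: str
    account_name: Optional[str] = None
    account_key: Optional[str] = None
    connection_string: Optional[str] = None


def auth_mode(settings: AdlsWorkspaceSettings) -> str:
    """Return 'connection_string' or 'account_key', or raise ValueError."""
    if not settings.container_name:
        raise ValueError("storage container name is required")
    has_conn = bool(settings.connection_string)
    has_name = bool(settings.account_name)
    has_key = bool(settings.account_key)
    # Assumption: mixing both credential options is treated as a mistake.
    if has_conn and (has_name or has_key):
        raise ValueError("use either a connection string or name and key, not both")
    if has_conn:
        return "connection_string"
    if has_name and has_key:
        return "account_key"
    raise ValueError("enter the account name and key, or a connection string")


# Placeholder values for illustration only.
form = AdlsWorkspaceSettings(
    remote_path="synthetic/output",
    container_name="example-container",
    account_name="examplestorage",
    account_key="ZmFrZS1rZXk=",
)
print(auth_mode(form))
```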

## Supported data types

The supported data types for ORC files are specified in the Apache Arrow documentation. The table below lists each ORC logical type and the Arrow type it maps to.

| Logical type       | Mapped Arrow type                  |
| ------------------ | ---------------------------------- |
| BOOLEAN            | Boolean                            |
| BYTE               | Int8                               |
| SHORT              | Int16                              |
| INT                | Int32                              |
| LONG               | Int64                              |
| FLOAT              | Float32                            |
| DOUBLE             | Float64                            |
| STRING             | String/LargeString                 |
| BINARY             | Binary/LargeBinary/FixedSizeBinary |
| TIMESTAMP          | Timestamp/Date64                   |
| TIMESTAMP\_INSTANT | Timestamp                          |
| LIST               | List/LargeList/FixedSizeList       |
| MAP                | Map                                |
| STRUCT             | Struct                             |
| UNION              | SparseUnion/DenseUnion             |
| DECIMAL            | Decimal128/Decimal256              |
| DATE               | Date32                             |
| VARCHAR            | String                             |
| CHAR               | String                             |

Errors can occur during data conversion when writing to ORC files if unsupported data types are involved.
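
One way to catch such errors early is to compare your column types against the table above before generating data. The sketch below is a minimal, illustrative check: it copies the table into a dictionary and flags columns whose Arrow type name does not appear in it. The type names are plain strings taken from the table, not `pyarrow` objects, and the example columns are hypothetical.

```python
# ORC logical type -> mapped Arrow type name(s), copied from the table above.
ORC_TO_ARROW = {
    "BOOLEAN": "Boolean",
    "BYTE": "Int8",
    "SHORT": "Int16",
    "INT": "Int32",
    "LONG": "Int64",
    "FLOAT": "Float32",
    "DOUBLE": "Float64",
    "STRING": "String/LargeString",
    "BINARY": "Binary/LargeBinary/FixedSizeBinary",
    "TIMESTAMP": "Timestamp/Date64",
    "TIMESTAMP_INSTANT": "Timestamp",
    "LIST": "List/LargeList/FixedSizeList",
    "MAP": "Map",
    "STRUCT": "Struct",
    "UNION": "SparseUnion/DenseUnion",
    "DECIMAL": "Decimal128/Decimal256",
    "DATE": "Date32",
    "VARCHAR": "String",
    "CHAR": "String",
}

# Every Arrow type name that appears in the right-hand column.
SUPPORTED_ARROW_TYPES = {
    name for mapped in ORC_TO_ARROW.values() for name in mapped.split("/")
}


def unsupported_columns(column_types: dict[str, str]) -> list[str]:
    """Return the columns whose Arrow type name is not listed in the table."""
    return [col for col, arrow_type in column_types.items()
            if arrow_type not in SUPPORTED_ARROW_TYPES]


# Hypothetical table schema: column name -> Arrow type name.
schema = {"id": "Int64", "name": "String", "elapsed": "Duration"}
print(unsupported_columns(schema))
```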

## Limitations & considerations

Contact your Syntho representative to discuss possible limitations of this connector.

* For ORC files, a column of type Char, String, or Varchar that contains only None values is written to the destination as the string "None" rather than as None.
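
If downstream consumers need real null values, one possible workaround is to convert such columns back after reading the ORC output. The sketch below is an illustration, not a Syntho feature: it works on rows already loaded as dictionaries and resets a string column to `None` only when every value in it is the string "None". It cannot distinguish this case from a column that genuinely contains the text "None" in every row, so apply it only to columns you know were empty.

```python
def restore_all_none_columns(rows: list[dict], string_columns: list[str]) -> list[dict]:
    """Replace "None" strings with None in columns where every value is "None"."""
    # A column is affected only if every row holds the literal string "None".
    affected = {
        col for col in string_columns
        if rows and all(row.get(col) == "None" for row in rows)
    }
    return [
        {key: (None if key in affected else value) for key, value in row.items()}
        for row in rows
    ]


# Hypothetical rows read back from an ORC output file.
rows = [
    {"id": 1, "note": "None", "city": "None"},
    {"id": 2, "note": "None", "city": "Oslo"},
]
cleaned = restore_all_none_columns(rows, ["note", "city"])
print(cleaned)
```

In this example only `note` is reset, because `city` contains at least one other value.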
