Data Transfer Object
Data Transfer Component
The Data Transfer component enables users to transfer files from a chosen source to a chosen target.
This component can use a number of common network protocols to transfer data to a variety of targets. This component copies the source file; it does not move it. Setting up this component requires selecting a source type and a target type. The component's other properties will change to reflect those choices.
Currently supported data sources include: Azure Blob Storage, Box, Dropbox, FTP, Google Cloud Storage, HDFS, HTTP, HTTPS, Microsoft SharePoint, Amazon S3, SFTP, and Windows Fileshare.
Currently supported targets: Azure Blob Storage, Google Cloud Storage, HDFS, Amazon S3, SFTP, and Windows Fileshare.
Please Note
- To ensure that access through instance credentials is managed correctly at all times, we advise that customers limit scopes (permissions) where applicable.
- FTPS is not supported through this component. We recommend using SFTP instead, or installing a tool that supports FTPS and calling it from a Bash Script component. You could also do this using cURL, which is available as standard.
- When reading from or writing to Windows Fileshare, the SMB2 protocol will be preferred. SMB1 is still supported.
- When the Source Type property is set to FTP, HTTP, HTTPS, SFTP, or Windows Fileshare, users will need to set a Source URL property. This property provides a template URL, with placeholder values inside square brackets []. Please replace the placeholder values with the values of your actual source URL.
- The universal properties for this component, which are independent of the source and target types, are listed below. Property tables for the source and target properties then follow, grouped by source type and target type.
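
The placeholder replacement described in the Source URL note above can be sketched in Python. This is only an illustration: the template string, the placeholder names (`host`, `path`), and the helper `fill_url_template` are hypothetical and are not taken from the component itself, which presents its own template in the Source URL property.

```python
import re


def fill_url_template(template: str, values: dict) -> str:
    """Replace [name] or <name> placeholders in a URL template with real values."""
    # Matches either [name] or <name>; the name is captured in group 1 or group 2.
    pattern = re.compile(r"\[([^\[\]]+)\]|<([^<>]+)>")

    def substitute(match):
        name = match.group(1) or match.group(2)
        if name not in values:
            raise KeyError(f"No value supplied for placeholder {name!r}")
        return values[name]

    return pattern.sub(substitute, template)


# Hypothetical template and values, for illustration only.
url = fill_url_template(
    "sftp://[host]/[path]",
    {"host": "files.example.com", "path": "exports/daily.csv"},
)
print(url)  # sftp://files.example.com/exports/daily.csv
```

Raising an error for a missing value, rather than leaving the placeholder in place, avoids sending a URL that still contains brackets.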
Properties
Snowflake Properties | ||
---|---|---|
Property | Setting | Description |
Name | String | A human-readable name for the component. |
Source Type | Select | Select the type of data source. |
Unpack ZIP File | Yes/No | Select if the source data is a ZIP file that you wish to unpack before being transferred. |
Target Type | Select | Select the target type for the new file. |
Target Object Name | String | The filename of the new file. |
gzip Data | Yes/No | Select if you wish to gzip the transferred data when it arrives at the target. |
Redshift Properties | ||
---|---|---|
Property | Setting | Description |
Name | String | A human-readable name for the component. |
Source Type | Select | Select the type of data source. |
Unpack ZIP File | Yes/No | Select if the source data is a ZIP file that you wish to unpack before being transferred. |
Target Type | Select | Select the target type for the new file. |
Target Object Name | String | The filename of the new file. |
gzip Data | Yes/No | Select if you wish to gzip the transferred data when it arrives at the target. |
BigQuery Properties | ||
---|---|---|
Property | Setting | Description |
Name | String | A human-readable name for the component. |
Source Type | Select | Select the type of data source. |
Unpack ZIP File | Yes/No | Select if the source data is a ZIP file that you wish to unpack before being transferred. |
Target Type | Select | Select the target type for the new file. |
Target Object Name | String | The filename of the new file. |
gzip Data | Yes/No | Select if you wish to gzip the transferred data when it arrives at the target. |
Synapse Properties | ||
---|---|---|
Property | Setting | Description |
Name | String | A human-readable name for the component. |
Source Type | Select | Select the type of data source. |
Unpack ZIP File | Yes/No | Select if the source data is a ZIP file that you wish to unpack before being transferred. |
Target Type | Select | Select the target type for the new file. |
Target Object Name | String | The filename of the new file. |
gzip Data | Yes/No | Select if you wish to gzip the transferred data when it arrives at the target. |
Delta Lake Properties | ||
---|---|---|
Property | Setting | Description |
Name | String | A human-readable name for the component. |
Source Type | Select | Select the type of data source. |
Unpack ZIP File | Yes/No | Select if the source data is a ZIP file that you wish to unpack before being transferred. |
Target Type | Select | Select the target type for the new file. |
gzip Data | Yes/No | Select if you wish to gzip the transferred data when it arrives at the target. |
Target Object Name | String | The filename of the new file. |
Source Properties | |||
---|---|---|---|
Source | Property | Setting | Description |
Azure Blob Storage | Blob Location | Text | The URL, including full path and file name, that points to the source file that exists on Azure Blob Storage. When a user enters a forward slash character / after a folder name, a validation of the file path is triggered. This works in the same manner as the Go button. |
Box | Authentication | Select | Select the OAuth entry. OAuth entries should be set up in advance and can be created by first clicking the Manage button. For help configuring an entry, read our Box Extract Authentication Guide. |
Box | File Id | String | The ID of the Box file. |
Dropbox | Authentication | Select | Select the OAuth entry. OAuth entries should be set up in advance and can be created by first clicking the Manage button. For help configuring an entry, read our Dropbox Extract Authentication Guide. |
Dropbox | File Type | Select | Select either File or Paper. |
Dropbox | Path | URL | The path to the Dropbox file. |
Dropbox | File ID | String | The file ID. |
Dropbox | Download As Zip | Select | Select whether or not to download as a zip file. |
Dropbox | Download Format | Select | Select HTML or Markdown. |
FTP | Set Home Directory as Root | Yes/No | Yes: URLs are relative to the user's home directory. No: URLs are relative to the server root. |
FTP | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. |
FTP | Source Username | Text | This is your URL connection username. It is optional and will only be used if the data source requests it. |
FTP | Source Password | Text | This is your URL connection password. It is optional and will only be used if the data source requests it. |
Google Cloud Storage | Source URL | Text | The URL, including full path and file name, that points to the source file. When a user enters a forward slash character / after a folder name, a validation of the file path is triggered. This works in the same manner as the Go button. |
HDFS | Source URL | Text | The URL, including full path and file name, that points to the source file. |
HTTP | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. |
HTTP | Source Username | Text | This is your URL connection username. It is optional and will only be used if the data source requests it. |
HTTP | Source Password | Text | This is your URL connection password. It is optional and will only be used if the data source requests it. |
HTTPS | Perform Certificate Validation | Choice | Check that the SSL certificate for the host is valid before transferring data. |
HTTPS | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. |
HTTPS | Source Username | Text | This is your URL connection username. It is optional and will only be used if the data source requests it. |
HTTPS | Source Password | Text | This is your URL connection password. It is optional and will only be used if the data source requests it. |
Microsoft SharePoint | URL | Text | The web address that you visit to sign in to your SharePoint account. |
Microsoft SharePoint | User | Text | A valid SharePoint username to use for authentication. |
Microsoft SharePoint | Password | Text | A valid SharePoint password. Users have the option to store their password inside the component; however, we highly recommend using the Password Manager feature instead. |
Microsoft SharePoint | SharePoint Edition | Select | Select your edition of SharePoint. |
Microsoft SharePoint | Connection Options | Parameter/Value | Parameter: A JDBC parameter supported by the Database Driver. The available parameters are determined automatically from the driver, and may change from version to version. They are usually not required, since sensible defaults are assumed. Value: A value for the given parameter. |
Microsoft SharePoint | File Type | Select | Choose whether the file type is a document or attachment. |
Microsoft SharePoint | Library | Select | The location where you can upload, create, update, and collaborate on files. For more information, see here. |
Microsoft SharePoint | File URL | Text | The URL that points to the file. |
S3 | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. When a user enters a forward slash character / after a folder name, a validation of the file path is triggered. This works in the same manner as the Go button. |
SFTP | Set Home Directory as Root | Yes/No | Yes: URLs are relative to the user's home directory. No: URLs are relative to the server root. |
SFTP | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. The Source URL must start with sftp:// or else an authentication failure message will be returned. |
SFTP | Source Username | Text | This is your URL connection username. It is optional and will only be used if the data source requests it. |
SFTP | Source Password | Text | This is your URL connection password. It is optional and will only be used if the data source requests it. |
SFTP | Source SFTP Key | Text | This is your SFTP Private Key. It is optional and will only be used if the data source requests it. This must be the complete private key, beginning with "-----BEGIN RSA PRIVATE KEY-----" and conforming to the same structure as an RSA private key. |
Windows Fileshare | Source URL | Text | The URL, including full path and file name, that points to the source file. The source URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your source URL. |
Windows Fileshare | Source Domain | Text | The domain that the source file is located on. |
Windows Fileshare | Source Username | Text | This is your URL connection username. It is optional and will only be used if the data source requests it. |
Windows Fileshare | Source Password | Text | This is your URL connection password. It is optional and will only be used if the data source requests it. |
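
The two SFTP requirements stated in the table above (the URL must start with sftp://, and the key must begin with the RSA private key header) can be checked before running a job. The following is a minimal sketch; the helper name `check_sftp_settings` is hypothetical, and the check covers only those two documented rules, not the full structure of a key.

```python
from typing import List, Optional

# Header that the documentation says a complete private key must begin with.
RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"


def check_sftp_settings(url: str, private_key: Optional[str] = None) -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    if not url.startswith("sftp://"):
        problems.append("URL must start with sftp://")
    # The key is optional, so it is only checked when one is supplied.
    if private_key is not None and not private_key.lstrip().startswith(RSA_HEADER):
        problems.append("Private key must begin with " + RSA_HEADER)
    return problems


print(check_sftp_settings("ftp://files.example.com/data.csv"))
# ['URL must start with sftp://']
```

The same check applies to the Target URL and Target SFTP Key properties when SFTP is the target type.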
Target Properties | |||
---|---|---|---|
Target | Property | Setting | Description |
Azure Blob Storage | Blob Location | Text | The full URL that points to the target file location on Azure Blob Storage. |
Google Cloud Storage | Target URL | Text | The URL (without file name) that points to where the new file will be created. The URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your target URL. |
HDFS | Target URL | Text | The URL (without file name) that points to where the new file will be created. The URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your target URL. |
S3 | Target URL | S3 Tree | The URL of the S3 bucket to write the files to. This follows the format s3://bucket-name/path |
S3 | Access Control List Options | Select | Choose from the ACL settings that Amazon provides. Leaving it empty doesn't change the current settings. A full list can be found here. |
S3 | Encryption | Select | Decide on how the files are to be encrypted inside the target S3 Bucket. None: No encryption. SSE Encryption: Encrypt the data according to a key stored on KMS. Read AWS Key Management Service (AWS KMS) to learn more. S3 Encryption: Encrypt the data according to a key stored on an S3 bucket. Read Using server-side encryption with Amazon S3-managed encryption keys (SSE-S3) to learn more. |
S3 | KMS Key ID | Select | The ID of the KMS encryption key you have chosen to use in the 'Encryption' property. |
SFTP | Set Home Directory as Root | Yes/No | Yes: URLs are relative to the user's home directory. No: URLs are relative to the server root. |
SFTP | Target URL | Text | The URL (without file name) that points to where the new file will be created. The URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your target URL. The Target URL must start with sftp:// or else an authentication failure message will be returned. |
SFTP | Target Username | Text | This is your URL connection username. It is optional and will only be used if the target requests it. |
SFTP | Target Password | Text | This is your URL connection password. It is optional and will only be used if the target requests it. |
SFTP | Target SFTP Key | Text | This is your SFTP Private Key. It is optional and will only be used if the target requests it. This must be the complete private key, beginning with "-----BEGIN RSA PRIVATE KEY-----" and conforming to the same structure as an RSA private key. |
Windows Fileshare | Target URL | Text | The URL (without file name) that points to where the new file will be created. The URL template includes placeholder values inside either square brackets [] or angled brackets < >. Replace the placeholder values with the actual values of your target URL. |
Windows Fileshare | Target Domain | Text | The domain that the newly created file is to be located on. |
Windows Fileshare | Target Username | Text | This is your URL connection username. It is optional and will only be used if the target requests it. |
Windows Fileshare | Target Password | Text | This is your URL connection password. It is optional and will only be used if the target requests it. |
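
The s3://bucket-name/path format given for the S3 Target URL can be split into its bucket and path parts with Python's standard library. This is a sketch only; the helper name `split_s3_url` and the example bucket and path are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_url(url: str):
    """Split a URL of the form s3://bucket-name/path into (bucket, path)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Expected a URL of the form s3://bucket-name/path")
    # The bucket is the network location; the path has its leading slash removed.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and path, for illustration only.
bucket, path = split_s3_url("s3://bucket-name/exports/2024")
print(bucket, path)  # bucket-name exports/2024
```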
Copying Files to an Azure Premium Storage blob
When copying files to an Azure Premium Storage blob, Matillion may return the following error:
Self-suppression not permitted.
This is because, unlike standard Azure Storage, Azure Premium Storage does not support block blobs, append blobs, files, tables, or queues. Premium Storage supports only page blobs that are incrementally sized.
A page blob is a collection of 512-byte pages that are optimised for random read and write operations. All writes must therefore be 512-byte aligned, so any file whose size is not a multiple of 512 bytes will fail to write.
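
The 512-byte alignment rule above is simple arithmetic, and a file size can be checked against it before a transfer is attempted. The sketch below is an illustration only; the function names are hypothetical, and it reports how far a size is from the next page boundary without suggesting that padding a file is appropriate for its format.

```python
PAGE_SIZE = 512  # size in bytes of one page in a page blob


def is_page_aligned(size_bytes: int) -> bool:
    """Return True if the size is an exact multiple of 512 bytes."""
    return size_bytes % PAGE_SIZE == 0


def bytes_to_next_page(size_bytes: int) -> int:
    """Return how many bytes short the size is of the next 512-byte boundary."""
    return (-size_bytes) % PAGE_SIZE


print(is_page_aligned(1024))     # True
print(is_page_aligned(1000))     # False
print(bytes_to_next_page(1000))  # 24
```

In practice the size would come from the file itself, for example through `os.path.getsize`.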
For additional information about Azure Storage blobs, read Understanding block blobs, append blobs, and page blobs.