Refresh External Table
Refresh External Table
Refreshes an External Table, resyncing it with the target data. This component should ideally always come before an external table is used but is critical when the target data has been changed since the corresponding table was last used.
|Name||Text||The descriptive name for the component.|
|Database||Select||Choose a database to create the new table in.|
|Schema||Select||Select the table schema. The special value [Environment Default] will use the schema defined in the environment. For more information on using multiple schemas, see this article.|
|Target Table||Text||The name of the External Table to be created or used.|
|Relative Subpath||Text||The directory path to follow to the target data. File names cannot be specified. The path is relative to the storage location given in the External Stage setup.|
In this example we use External Tables to load CSV data from S3 and transform it into a more useful form before creating a permanent table from the result. To accomplish this, we use an Orchestration and a Transformation job, shown below, respectively.
First, we have made an External Stage and pointed it to an S3 bucket that contains our CSV data.
Next, we have configured the Create External Table Component to use the "csv" directory at that S3 Bucket URL. Be warned, this will pull in all CSV files from that directory. Since the output are variants, the CSV format does not matter. However, it is usually most useful to ensure all CSV files present are of the same format and that they are wanted in the data load. We have instructed the component to create an External Table named "Doc_Ext".
After this, the External Table is refreshed using the Refresh External Table Component before a new, regular table is created with the expected columns set up. This job then links into the Transformation job shown previously.
The data can be sampled using a Table Input component pointed at "Doc_Ext", our new external table (providing the Orchestration job has run at least once). The sampled data is shown below.
As can be seen, this data has loaded correctly but is not in the most useful format. External Data is loaded as a variant of the target data. We can unpack this into a more useful form using the Flatten Variant Component, the column mapping of which is shown below.
Note that there is only one source column, "VALUE". We identify "col1" and "col2" as Properties within that column and give them an alias describing their true purposes. Their values will then be extracted and placed into columns named using those aliases, the final result of which is shown below.
Linking a Table Output Component to the end of the workflow allows us to now save this transformed data in the permanent, regular table that we created earlier. Meanwhile, our External Table remains intact, ready to be used again should the external data be updated.