A transformation job generates a single SQL script to run in Snowflake. The generated SQL can be quite large, and in some cases query performance can improve significantly if we materialize the data in the middle of a transformation job. The job would then generate two scripts: one performing the first transformations and INSERTing the intermediate result into a temp table, and another performing the rest of the transformations but SELECTing from the temp table created in the previous step.
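For illustration, the two-script pattern could look roughly like this in Snowflake SQL (a sketch only; the table and column names here are hypothetical, not from Matillion's actual generated output):

```sql
-- Script 1: run the first transformations and materialize the
-- intermediate result into a session-scoped temporary table.
CREATE TEMPORARY TABLE tmp_orders_enriched AS
SELECT o.order_id,
       o.customer_id,
       c.region,
       o.amount
FROM orders o
JOIN customers c
  ON c.customer_id = o.customer_id;

-- Script 2: run the remaining transformations, reading from the
-- temp table instead of repeating the expensive joins above.
INSERT INTO sales_by_region
SELECT region,
       SUM(amount) AS total_amount
FROM tmp_orders_enriched
GROUP BY region;
```

Because the temporary table is session-scoped, Snowflake drops it automatically when the session ends, so no explicit cleanup step would be needed.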
Have you ever considered adding an option (in the Join component, for example) to materialize/persist its output in a temp table? I thought about creating a Shared Job to do this, but that doesn't work because a Shared Job must be an Orchestration Job, and we can't trigger an Orchestration Job from a Transformation Job.
Regards, Tiago Silva
3 Community Answers
Paul Johnson —
You are quite correct that materializing the data can improve performance in large, complex jobs.
The way to do this currently is to use a Rewrite Table component to materialize the results, then a second transformation job that uses this table as its input.
In your orchestration job, you can use the Begin and Commit components to make the two transformations execute as a single transaction.
I will raise an improvement request to allow selecting the table type in the Rewrite Table component, so that the temporary and transient table types are available and you don't need a Delete Tables component to clean up.
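For context, the two Snowflake table types mentioned above differ mainly in lifetime and cleanup (a sketch; the table names are illustrative):

```sql
-- TEMPORARY: visible only to the current session and dropped
-- automatically when the session ends - no cleanup component needed.
CREATE TEMPORARY TABLE tmp_stage AS
SELECT * FROM source_table;

-- TRANSIENT: persists across sessions like a permanent table, but has
-- no Fail-safe period, reducing storage cost for intermediate data.
-- It must be dropped explicitly when no longer needed.
CREATE TRANSIENT TABLE transient_stage AS
SELECT * FROM source_table;

DROP TABLE transient_stage;
```

This is why exposing the table type in the Rewrite Table component would remove the need for a separate Delete Tables step: a temporary table cleans itself up.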
Hi, yes, but it would be nice if Matillion could do this for us behind the scenes. The way I imagine it, there would be an option in the Join component like 'Materialize data: Yes/No', and that's it: Matillion would do all the magic for us.