How to Create Your Own Matillion ETL Extract Connector
    • Dark
      Light

    How to Create Your Own Matillion ETL Extract Connector

    • Dark
      Light

    Article Summary

    This article is specific to the following platforms - Redshift - Snowflake - BigQuery.

    Overview

    Using Matillion ETL's API Extract component in conjunction with the Manage Extract Profiles wizard empowers users to create their own Matillion ETL data connector and load data from any data source using the relevant API.

    This guide, with a supplementary video guide at the bottom, will explain how to create a custom Matillion ETL data connector.


    Geting Started - Creating A New Extract Profile

    First, an extract profile must be set up.

    1. Click Project, and then click Manage Extract Profiles to open the Manage Extract Profiles dialog.

    In this dialog, users can manage all of their extract profiles.

    2. In the Manage Extract Profiles dialog, click to add a new extract profile. This action launches the Add Extract Profile dialog, in which users must provide a name for their new extract profile. We recommend a human-readable name that is easy to identify.

    In this example, the extract profile is named Coin Price.

    3. Next, click Setup to begin configuring your new extract profile.

    4. In the Configure Extract Profiles dialog, click New Endpoint to enter the endpoint setup wizard.


    1. Source Details

    The first page of the wizard requires basic source information:

    Profile Name

    This setting has already been set, it is the extract profile name—in this case, Coin Price.


    Endpoint Name

    Specify a name for the endpoint to be created—this is the data source.


    Description

    Provide a human-readable description of the endpoint for reference. This setting is optional.

    Click Next.


    2. Endpoint Configuration

    The second page of the wizard requires information about the API call you wish to make.

    Specify the method. Users can select between GET and POST.

    Input the endpoint URI. This will be available from the source of the API. In this case, the endpoint is: https://api.coindesk.com/v1/bpi/currentprice.json


    Params

    Specify any parameters to validate the API call. Users can pass a Parameter Name, Value, Type, and specify whether the parameter is a Constant or not.


    Auth

    Tick Enable Authentication if authentication is required.

    Then, select "Username and Password" or "API Key" from the first dropdown. Input the credentials into the correct fields.

    Select between Basic and Bearer in the second dropdown.


    Body

    Specify, if required, the body of the request.


    Response

    Here the API call response is displayed. When a user clicks Send, the Response tab is automatically brought into view. The following message is displayed for validated API calls:

    JSON File validated successfully with no errors.


    Log

    A log of activity.


    After specifying the method, the URI, and any required elements, click Send. As mentioned above, the Response tab will be displayed. Upon validation, click Next.


    3. View Fields

    On the third page of the wizard, users can view the structure of the data returned from the API. Matillion ETL will attempt to identify the Repeating Element from the data. However, users can also specify this themselves.

    Users can also configure paging by first toggling Paging to "ENABLED" and then selecting one of the supported paging schemes and providing the required parameters.

    Click Next.


    4. Review

    The final page of the wizard is for reviewing the the configuration of the new endpoint. Publishing the configuration will make the data source immediately available for your API Extract profile.

    Click Back to make changes. Otherwise, click Finish to complete the API extract profile setup.


    Creating the Orchestration Job

    1. In a new Orchestration Job, add the API Extract component onto the Matillion ETL canvas.

    2. Click onto the API Extract component, and click into the API property. From the dropdown menu, select the newly created API endpoint—in this example, it's Coin Price.

    3. Click into the Data Source property and from the dropdown select the data source. The data source is the Endpoint Name from the Manage Extract Profiles wizard—in this case, it is also named Coin Price.

    Any parameters configured during Endpoint Configuration will be populated in the corresponding Params properties.

    5. Select the storage location for your data.

    6. Once the remaining properties are configured, right click the API Extract component, and click Run Job. Alternatively, with the component selected, press CTRL + ENTER.


    Transforming the Data

    In the next section of this example, we will use a Matillion ETL Transformation Job to calculate the Candian Dollars price of Bitcoin.

    First, another API profile must be set up for the current US Dollars to Canadian Dollars exchange rate. The setup of this extract profile follows the same steps as outlined so far in this article for the previous API extract profile. The endpoint for this second extract profile is https://api.exchangerate-api.com/v4/latest/USD

    For the Orchestration Job, we have a Start component connected to two API Extract components. The first is for the Coin Price API; and the second is for the Exchange Rates API.

    Since these data staging components are Extract components, they may return nested data that requires flattening.

    Next, a Transformation Job is created. In this job, there are two Table Input components, one calling each new table made in the Orchestration Job. Then, a Convert Type component is used to convert the data type of the columns required to later run an SQL query that will calculate the Canadian Dollars price of Bitcoin comparative to USD. Finally, a Create View component is connected to each Convert Type component, outputting a view definition to our database rather than writing the data to a physical table.

    Finally, there is an SQL component to run our SELECT query.

    The SQL query and the data returned can be found below, where we have calculated the price of Bitcoin in Canadian Dollars.

    Below, users can copy the SQL query for their own work if required.

    SELECT
    t1."bpi_usd_rate_float",
    t2."rates_cad",
    (t1."bpi_usd_rate_float" * t2."rates_cad") as CAD_Coin_Price
    FROM "exchange_rate_new" t2, "coin_price_new" t1
    group by 1,2
    


    Video