Next, we configure the dataSourceDefinitions asset, which defines how data is ingested and from which external data source. Here, we:

  • Specify what provider to extract data from.
  • Configure the visual template for the data source (steps in Data Source).
  • Define the queries that are executed against the external data source to retrieve the metadata.
Tip

Download the full examples we used in this tutorial to follow along:

  • Recipe Manager (version data ingestion): Recipe Manager (Metadata Application).json
  • SQL script with the example database: Recipe SQL Script.sql

Let’s start with the basic template:

"dataSourceDefinitions": [
    {
        "key": "",
        "name": "",
        "description": "",
        "provider": "",
        "queries": [
            {
                "key": "",
                "abbreviation": "",
                "description": "",
                "type": ""
            }
        ],
        "versions": [
            {
                "key": "default",
                "name": "Version 1.0",
                "options": { },
                "queryDefinitions": [ ],
                "template": { }
            }
        ]
    }
],
Property | Description
key | Unique key of the data source definition. We use this to link the data source to the right dataIntegrations asset.
name | Name of the data source definition, used as the default EN translation in Dawiso.
description | Description of the data source definition. The field is optional, but it helps your teams understand the purpose of the asset immediately.
provider | External provider from which metadata is ingested. You can create a data source that ingests data from the following providers: core_amazon_redshift (Amazon Redshift); core_databricks (Databricks); core_dbt (dbt); core_google_bigquery (Google BigQuery); core_keboola (Keboola); core_mongodb (MongoDB); core_mysql (MySQL); core_oracle (Oracle); core_postgresql (PostgreSQL); core_power_bi (Power BI); core_sap_hana (SAP HANA); core_snowflake (Snowflake); core_sql_server (SQL Server); core_tableau (Tableau). For an up-to-date list of available providers, see the full documentation on Data Source Definitions.
queries | Queries that specify what metadata to retrieve. Each query retrieves one object type or one relation type. The query keys are used to assign query definitions to the queries. Query definitions are defined under the versions property because their structure may vary depending on the provider version.
versions | Versions of the data source. This is where we define the query definitions, as different versions of the provider may require different query syntax, and the template of the data source creation step in the Dawiso UI.
Warning

Carefully define the query order. For example, if Recipe Steps are ingested first, they cannot be properly categorized into Recipes.
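
As a minimal sketch of the warning above, the fragment below places the Recipe query before a Recipe Step query. It assumes that the order property of each query definition (part of the versions template) controls the ingestion sequence. The recipe_step query key is hypothetical and used only for illustration, and the definition and fields values are left empty on purpose.

```json
{
    "queryDefinitions": [
        {
            "queryKey": "recipe",
            "definition": "",
            "format": "json_lite",
            "order": 1,
            "fields": [],
            "options": {}
        },
        {
            "queryKey": "recipe_step",
            "definition": "",
            "format": "json_lite",
            "order": 2,
            "fields": [],
            "options": {}
        }
    ]
}
```

Under this assumption, the parent objects (Recipes) already exist when their children (Recipe Steps) are ingested.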

In our Recipe Manager, the asset is configured as follows:

"dataSourceDefinitions": [
    {
        "key": "data_source_definition",
        "name": "Recipes Manager Data Source",
        "description": "Loading recipes to Dawiso from SQL Server sample",
        "provider": "core_sql_server",
        "queries": [...],
        "versions": [...]
    }
]

Queries

First, create the queries. Each query corresponds to one object type or one relation type.

"queries": [
    {
        "key": "",
        "abbreviation": "",
        "description": "",
        "type": ""
    }
],
Property | Description
key | Unique key of the query. Used to assign a query definition to the query. One query can have multiple definitions (usually, they differ according to the provider version).
abbreviation | Abbreviation of the query. Usually, the same value as the key.
description | Description of the query, used for internal purposes.
type | Type of the data the query is ingesting. Supported values are object (for ingesting object types) and relation (for ingesting relations between objects).
In the example package, the Cuisine and Recipe queries look like this:
"queries": [
    {
        "key": "cuisine",
        "abbreviation": "cuisine",
        "description": "",
        "type": "object"
    },
    {
        "key": "recipe",
        "abbreviation": "recipe",
        "description": "",
        "type": "object"
    },
    ...
]
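
Both queries above ingest object types. For comparison, here is a minimal sketch of a query that ingests a relation. The recipe_to_cuisine key is hypothetical and may not match the example package; the only structural difference from an object query is the type value.

```json
{
    "queries": [
        {
            "key": "recipe_to_cuisine",
            "abbreviation": "recipe_to_cuisine",
            "description": "Hypothetical relation between recipes and cuisines",
            "type": "relation"
        }
    ]
}
```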

Versions

Because a provider can have multiple functioning versions, query definitions and data source creation steps (the fields the user fills in in the UI, see 2.2 Version Template) can be defined separately for each version.

First, let’s take a look at the versions property’s template.

"versions": [
    {
        "key": "default",
        "name": "Version 1.0",
        "options": {
            "batchSize": 5000,
            "ingestionFormat": "diff",
            "convertBooleanToNumeric": true,
            "createVersions": false
        },
        "queryDefinitions": [
            {
                "queryKey": "",
                "definition": "",
                "format": "json_lite",
                "order": 1,
                "fields": [
                    { "key": "", "isKey": true },
                    { "key": "", "isParentKey": true },
                    { "key": "", "isName": true },
                    { "key": "" }
                ],
                "options": { }
            }
        ],
        "template": {
            "$schema": "https://schema.dawiso.com/provider-schema.json",
            "providerName": "",
            "steps": [ ]
        }
    }
],
Property | Description
key | Unique key of the provider version.
name | Name of the version.
options | Settings that are specific to each provider. For more information, see the options documentation. The following properties are mandatory: batchSize (number of rows per extraction job); ingestionFormat (format of the ingestion; only json_lite is supported); convertBooleanToNumeric (converts the boolean values true and false to the numeric format 1 and 0); createVersions (creates and stores versions of the data source definitions).
queryDefinitions | SQL syntax for retrieving the data, which is linked to a specific query using the queryKey property.
template | Configures how each data source creation step looks and what information the user must enter in the Dawiso UI.
The property in the example package is almost identical to the provided template:
"versions": [
    {
        "key": "default",
        "name": "Version 1.0",
        "options": {
            "batchSize": 5000,
            "convertBooleanToNumeric": true,
            "createChangeLogs": false,
            "createVersions": false,
            "ingestionFormat": "diff"
        },
        "queryDefinitions": [...],
        "template": {...}
    }
]
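
To show how a query definition attaches to a query through queryKey, here is a minimal sketch for the cuisine query. The SQL statement, the dbo.Cuisine table, and the cuisine_id and cuisine_name columns are hypothetical and may differ from the example database. The sketch also assumes that each field key matches a column returned by the definition.

```json
{
    "queryDefinitions": [
        {
            "queryKey": "cuisine",
            "definition": "SELECT cuisine_id, cuisine_name FROM dbo.Cuisine",
            "format": "json_lite",
            "order": 1,
            "fields": [
                { "key": "cuisine_id", "isKey": true },
                { "key": "cuisine_name", "isName": true }
            ],
            "options": {}
        }
    ]
}
```

The queryKey value has to match the key of an entry in the queries property, which is how the definition is assigned to the query.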

Query Definition and Version Template

Now, we will need to define: