- Navigate to Settings > Data Ingestion > Data Sources.
- Click New data source.
- Select your system and the newly created connection.
- Name your source (and add a description) for easier identification in Dawiso.
Configure scanner options
In this step, choose which categories of Airflow metadata to ingest. All options are enabled by default. Disable individual categories only if you have a specific reason to skip them.
| Setting | Default | Description |
|---|---|---|
| Scan Datasets | Enabled | Extract dataset (Airflow 2.4+) or asset (Airflow 3.x) metadata, including producer-task and consumer-DAG relations. Disable for Airflow 2.0 – 2.3 deployments that do not expose datasets. |
| Scan Connections | Enabled | Extract connection metadata (type, host, port, schema, login). Connection passwords are never extracted. |
| Scan Variables | Enabled | Extract variable keys and descriptions. Variable values are never extracted. |
| Scan Pools | Enabled | Extract resource pool configurations and slot counts. |
| Include Paused DAGs | Enabled | Include DAGs that are currently paused. Disable to scan only active DAGs. |
The scanner extracts task dependencies (upstream and downstream relations within a DAG) automatically. Dataset lineage is also extracted when Scan Datasets is enabled. These relations appear in the Dawiso lineage diagram as Data Source edges.
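To make "upstream and downstream relations within a DAG" concrete, the following is a minimal sketch of how such relations can be derived from task records. It assumes task records shaped like the task objects in Airflow's REST API (`task_id`, `downstream_task_ids`). The function name `task_edges` and the sample data are hypothetical, and this is not a description of how the Dawiso scanner is implemented.

```python
from typing import Dict, List, Tuple


def task_edges(tasks: List[Dict]) -> List[Tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs for the tasks of one DAG.

    Assumes each record carries `task_id` and `downstream_task_ids`,
    as Airflow task objects do. Hypothetical helper for illustration only.
    """
    edges = []
    for task in tasks:
        for downstream in task.get("downstream_task_ids", []):
            edges.append((task["task_id"], downstream))
    return sorted(edges)


# Hypothetical sample: a three-step DAG (extract -> transform -> load).
sample_tasks = [
    {"task_id": "extract", "downstream_task_ids": ["transform"]},
    {"task_id": "transform", "downstream_task_ids": ["load"]},
    {"task_id": "load", "downstream_task_ids": []},
]

print(task_edges(sample_tasks))
```

Each pair corresponds to one dependency edge between two tasks of the same DAG.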
Editing the scanner options
If you change these options on an existing data source, note the following:
- Categories you newly enable are picked up during the next ingestion run.
- Categories you disable will have their existing metadata treated as deleted. Any user-added content (descriptions, ownership, governance attributes) on those object types is removed during the next run.
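Before saving changed scanner options, it can help to list which categories would be newly enabled and which would be disabled, since the latter lose their user-added content on the next run. The sketch below compares two option sets keyed by the setting names from the table above. The function `diff_scanner_options` and the dictionaries are hypothetical illustrations and are not part of Dawiso.

```python
from typing import Dict, List, Tuple


def diff_scanner_options(
    old: Dict[str, bool], new: Dict[str, bool]
) -> Tuple[List[str], List[str]]:
    """Return (newly_enabled, newly_disabled) setting names.

    Both dictionaries are expected to contain the same keys.
    Hypothetical helper for illustration only.
    """
    newly_enabled = sorted(k for k in new if new[k] and not old.get(k, False))
    newly_disabled = sorted(k for k in old if old[k] and not new.get(k, False))
    return newly_enabled, newly_disabled


# Hypothetical before/after states of one data source.
current = {
    "Scan Datasets": True,
    "Scan Connections": True,
    "Scan Variables": True,
    "Scan Pools": False,
}
planned = {
    "Scan Datasets": True,
    "Scan Connections": True,
    "Scan Variables": False,
    "Scan Pools": True,
}

enabled, disabled = diff_scanner_options(current, planned)
print("Picked up on next run:", enabled)
print("Treated as deleted on next run:", disabled)
```

Anything in the second list is worth reviewing for descriptions, ownership, or governance attributes that should be preserved elsewhere before the next ingestion.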
Destination configuration
In the Destination configuration step:
- Select the space in which to store your ingested metadata. You can select only spaces you have access to.
- Select the workflow for managing ingested objects. For more information on workflows, see the article on Workflow types.
- [Optional steps]
- Schedule: To customize the regular automated ingestion, check the box next to Schedule and adjust the frequency.
- Optional Settings: Specify options supported by the provider in JSON format. Options will be made available based on feedback. For more information, see Optional Settings.
- Additional Settings: Add a JSON formatted list of additional settings that can be used to troubleshoot your custom scenarios. We recommend leaving this field empty unless our Dawiso support team suggests otherwise.
- Enable AI-Generated Descriptions: Enable AI-generated descriptions for your ingested metadata. Descriptions are generated according to prompts that are defined in packages. Each attribute type can have its own configurable prompt, and any AI-generated content is clearly marked with a banner until the text is manually reviewed and saved.
- Descriptions will be generated on the following levels: database, schema, view, and table.

- Save your data source.
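Because the Optional Settings and Additional Settings fields expect JSON, a quick syntax check before pasting can catch typos early. The following is a minimal sketch using Python's standard `json` module. The function name `check_settings_json` and the key `exampleOption` are hypothetical, since the source does not list the supported option names.

```python
import json
from typing import Any, Optional


def check_settings_json(text: str) -> Optional[Any]:
    """Parse `text` as JSON and return the result.

    Returns None for an empty field. Raises ValueError with the position
    of the first syntax error. Hypothetical helper for illustration only.
    """
    if not text.strip():
        return None  # an empty field is allowed
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ) from err


# `exampleOption` is a made-up key, not a documented Dawiso option.
parsed = check_settings_json('{"exampleOption": true}')
print(parsed)
```

This checks syntax only. Whether a given key is accepted depends on the provider, so follow the Optional Settings article or the guidance of Dawiso support for the actual option names.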
If you delete a data source and then create a new one in the same space, the originally ingested data remains and mixes with the new data. To prevent duplicates, always delete the metadata application created by the old source from the space before creating and ingesting a new one.
Run ingestion
Once you create your data source, you will be redirected to a list of your data sources.
- Click the three dots next to your newly created source, and select Run ingestion.
- Confirm to ingest your data.
You can find all new, running, or completed ingestions by navigating to Settings > Data Ingestion > Ingestions.
For more information, see the article on Ingestions.
If you are ingesting data on-premises, refer to the article on Dawiso Integration Runtime (DIR) instead.
Once data is ingested to metadata applications, manual changes to the object hierarchy are NOT allowed. Changes such as adding, deleting, or moving objects under different parents may cause data inconsistencies and break data scans.
