dbt
NOTE: This is an sgr
-specific article. To learn how to use Splitgraph with
dbt, see the corresponding Splitgraph page.
The recommended way of building Splitgraph data images is Splitfiles that offer Dockerfile-like caching, provenance tracking and efficient rebuilds.
However, there are plenty of other great tools for building datasets and, as
long as they work with PostgreSQL, they too can benefit from sgr
's data
versioning, packaging and sharing capabilities.
One such tool is dbt that assembles data transformations from small building blocks, decreasing the amount of boilerplate. In a sense, dbt can be used as an advanced SQL templating engine.
Turning the source and the target schemas that dbt uses into Splitgraph repositories opens up a lot of opportunities:
- No need to run development and production dbt models against the same warehouse. A Splitgraph image can be cloned and checked out on the development engine.
- Performing what-if analyses becomes simple by switching between different
versions of the source dataset and comparing the resultant images with
sgr diff
. - Built datasets can be pushed to other
sgr
engines, shared publicly or serve as inputs to a pipeline of Splitfiles. - Input datasets can leverage
sgr
's layered querying, allowing dbt to seamlessly query huge datasets with a limited amount of local disk space. - Input datasets can be backed by foreign data wrappers, allowing dbt to directly use data from a wide variety of databases without having to write an extra ETL job to load the data into the warehouse.
sgr
dbt adapter
You don't need any extra plugins to use dbt with sgr
, since you can use dbt's
native PostgreSQL support to query the sgr
engine. However, if you install the
sgr
dbt adapter,
you will be able to reference sgr
images directly from your dbt code. For
example:
{{ config(materialized='table') }}
with source_data as (
select domain, count(domain) as count
from "splitgraph/socrata:latest".datasets
group by domain
)
select *
from source_data
See
our GitHub page
for instructions on how to install and use the sgr
dbt adapter as well as a
sample dbt project.
Example
The dbt example
showcases running dbt against the sgr
engine, using sgr
to swap between
different versions of the source dataset and looking at their effect on the
built dbt model.