Query the Data Delivery Network
Query the DDN
The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the state_of_delaware_american_rescue_plan_arp table in this repository by referencing it like:
"delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest"."state_of_delaware_american_rescue_plan_arp"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"deptid_descr", -- The description of the chartfield that tracks administrative units within the State’s organizational hierarchy. A DeptID must have an ongoing business purpose, contains positioned employees, has a permanent budget, and is assigned physical space.
"op_unit_descr", -- The description of the eight digit ChartField consisting of a budget, location and manager.
"fund", -- A self-balancing ChartField used in government to segregate different fund types.
"approp", -- A five digit ChartField established to record specific authorizations to spend
"approp_descr", -- The description of the five digit ChartField established to record specific authorizations to spend
"account", -- A five digit chartfield used to identify asset, liabilities, fund balance, revenue or expenses. Project Costing: Classifies the nature of a transaction.
"account_descr", -- The description of the five digit chartfield used to identify asset, liabilities, fund balance, revenue or expenses. Project Costing: Classifies the nature of a transaction.
"program", -- A five digit ChartField used by the organizations to track revenues and/or expenditures.
"program_descr", -- The description of the five digit ChartField used by the organizations to track revenues and/or expenditures.
"sch_code", -- A six digit code used to identify physical pupil based expenses
"sch_code_descr", -- The description of the six digit code used to identify physical pupil based expenses
"pc_bu", -- PCBU’s are used to functionally organize (and segregate) the use of PC by state organizations. Each Agency and School District within the state which receives a grant or which chooses to use Project Costing to support capital projects is assigned its own PCBU. Grants: Grants module is also using PC Bus Units to group projects under different major Divisions.
"project", -- A cost center established by an agency; usually either for a Grant or Capital Project or for a DelDOT Operating Project. Projects are associated with an agency’s PCBU. Once the cost center is established, Activities may be associated with the cost center and budgets can be attached at the activity level. Grants: Projects provide the structure to which activities and resources are added. Projects can contain activities and resources. This provides a hierarchical relationship between projects and facilitates cost roll ups. A project in First State Financial Grants is a subset of a proposal (to handle the post-award activities); proposals may contain or entail multiple projects.
"project_descr", -- The description of the cost center established by an agency; usually either for a Grant or Capital Project or for a DelDOT Operating Project. Projects are associated with an agency’s PCBU. Once the cost center is established, Activities may be associated with the cost center and budgets can be attached at the activity level. Grants: Projects provide the structure to which activities and resources are added. Projects can contain activities and resources. This provides a hierarchical relationship between projects and facilitates cost roll ups. A project in First State Financial Grants is a subset of a proposal (to handle the post-award activities); proposals may contain or entail multiple projects.
"activity", -- Activities are sub-components of Projects. Project Budgets are established at this level. Ex: The Project “Build New Elementary School” has two activities associated with it: Design and Construction. Budgets are defined at the Design and Construction activity levels. Grants: Activities are the specific tasks that make up a project. You can add transactions to a project only at the activity level. The Grants module uses Project Activity to represent federal reporting categories.
"check_number", -- The payment reference number. This includes checks and ACH payments.
"check_date", -- The date of the payment.
"fiscal_period", -- The fiscal period of the expenditure. Each fiscal period corresponds to a month within the fiscal year. For example, 1 = July, 2 = August, 3 = September, 7 = January, 12 = June.
"fund_type", -- The type of fund where the expenditure is recorded.
"budget_ref", -- Four (4) digit ChartField used to identify the year in which the budget was funded.
"pc_bu_descr", -- The description of the PCBU’s that are used to functionally organize (and segregate) the use of PC by state organizations. Each Agency and School District within the state which receives a grant or which chooses to use Project Costing to support capital projects is assigned its own PCBU. Grants: Grants module is also using PC Bus Units to group projects under different major Divisions.
"activity_descr", -- The description of the activities that are sub-components of Projects. Project Budgets are established at this level. Ex: The Project “Build New Elementary School” has two activities associated with it: Design and Construction. Budgets are defined at the Design and Construction activity levels. Grants: Activities are the specific tasks that make up a project. You can add transactions to a project only at the activity level. The Grants module uses Project Activity to represent federal reporting categories.
"amount", -- The amount of the payment.
"fiscal_year", -- The state fiscal year of the expenditure.
"department", -- The state organization where the expense is recognized.
"division", -- The division within the state organization where the expense is recognized.
"vendor", -- The name of the business or person that received payment from the state organization.
"fund_descr", -- The description of the self-balancing ChartField used in government to segregate different fund types.
"deptid", -- Chartfield that tracks administrative units within the State’s organizational hierarchy. A DeptID must have an ongoing business purpose, contains positioned employees, has a permanent budget, and is assigned physical space.
"op_unit" -- An eight digit ChartField consisting of a budget, location and manager.
FROM
"delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest"."state_of_delaware_american_rescue_plan_arp"
LIMIT 100;
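The quoted reference in the FROM clause follows the pattern "<repository>:<tag>"."<table>". As a minimal sketch, the Python helper below assembles that reference so it can be reused in generated queries. The function names `quote_ident` and `ddn_table_ref` are hypothetical and are not part of any Splitgraph library; only the repository, tag, and table names come from this page.

```python
REPOSITORY = "delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s"
TABLE = "state_of_delaware_american_rescue_plan_arp"


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def ddn_table_ref(repository: str, table: str, tag: str = "latest") -> str:
    """Build the reference "<repository>:<tag>"."<table>" (hypothetical helper)."""
    schema = f"{repository}:{tag}"
    return f"{quote_ident(schema)}.{quote_ident(table)}"


# Assemble a short query against two of the columns listed above.
query = f'SELECT "vendor", "amount" FROM {ddn_table_ref(REPOSITORY, TABLE)} LIMIT 100;'
print(query)
```

Quoting is needed because the schema part contains characters such as `/`, `-`, and `:` that are not valid in an unquoted Postgres identifier.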
Connecting to the DDN requires only a SQL client that can connect to Postgres. With a client ready, you can query delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s with SQL in under 60 seconds.
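Most Postgres clients accept a connection URI. The sketch below builds one for the DDN endpoint named above. The host and port come from this page; the database name `ddn` and the credential placeholders are assumptions to check against your Splitgraph account settings, and `ddn_connection_uri` is a hypothetical helper.

```python
from urllib.parse import quote

DDN_HOST = "data.splitgraph.com"  # from this page
DDN_PORT = 5432                   # from this page
DDN_DATABASE = "ddn"              # assumption: verify in your account settings


def ddn_connection_uri(user: str, password: str, database: str = DDN_DATABASE) -> str:
    """Return a postgresql:// URI with the credentials percent-encoded."""
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{DDN_HOST}:{DDN_PORT}/{database}"


# Placeholder credentials; substitute your own.
uri = ddn_connection_uri("YOUR_API_KEY", "YOUR_API_SECRET")
print(uri)
```

The resulting string can be passed to a client that understands Postgres connection URIs, for example as the argument to `psql`.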
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone and sgr checkout.
Cloning Data
Because delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest is a Splitgraph Image, you can clone the data from Splitgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to check out the latest tag:
sgr checkout delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest
This will download all the objects for the latest tag of delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, the checkout may take some time to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest is 0 bytes. If this is too big to download all at once, or you only need to query a subset of it, you can use a layered checkout:
sgr checkout --layered delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s:latest
This will not download all the data, but it will create a schema composed of foreign tables that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
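The clone and checkout steps above can also be scripted. As a rough sketch, the Python below only builds the argument lists for the `sgr clone`, `sgr checkout`, and `sgr checkout --layered` commands shown on this page and prints them; it does not run anything. The helper names are hypothetical.

```python
import shlex

REPOSITORY = "delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s"


def sgr_clone_command(repository: str) -> list[str]:
    """Argument list for cloning the repository metadata."""
    return ["sgr", "clone", repository]


def sgr_checkout_command(repository: str, tag: str = "latest", layered: bool = False) -> list[str]:
    """Argument list for checking out a tag, optionally as a layered checkout."""
    command = ["sgr", "checkout"]
    if layered:
        command.append("--layered")
    command.append(f"{repository}:{tag}")
    return command


# Print the commands in shell-quoted form for review.
for command in (
    sgr_clone_command(REPOSITORY),
    sgr_checkout_command(REPOSITORY),
    sgr_checkout_command(REPOSITORY, layered=True),
):
    print(shlex.join(command))
```

To execute a command rather than print it, each list can be passed to `subprocess.run(command, check=True)` on a machine where sgr and the engine are installed.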
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s is just another Postgres schema.
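Since the checked-out repository appears as an ordinary schema, a local query references it by the repository name alone, without a tag suffix. The sketch below builds such a query string. It assumes the table keeps the name state_of_delaware_american_rescue_plan_arp after checkout, and `local_table_ref` is a hypothetical helper; connection details for your local engine are left out because they depend on your setup.

```python
SCHEMA = "delaware-gov/state-of-delaware-american-rescue-plan-arp-e2rw-zi3s"
TABLE = "state_of_delaware_american_rescue_plan_arp"  # assumed unchanged after checkout


def local_table_ref(schema: str, table: str) -> str:
    """Build a quoted "<schema>"."<table>" reference for the local engine."""
    def quoted(name: str) -> str:
        # Double any embedded double quotes, as Postgres identifier quoting requires.
        return '"' + name.replace('"', '""') + '"'

    return f"{quoted(schema)}.{quoted(table)}"


local_query = (
    'SELECT "department", "vendor", "amount" '
    f"FROM {local_table_ref(SCHEMA, TABLE)} "
    "LIMIT 10;"
)
print(local_query)
```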