Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the rsbs_mom_multifamily_onsite_inspections_site_level
table in this repository, by referencing it like:
"ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest"."rsbs_mom_multifamily_onsite_inspections_site_level"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"air_duct_fireplace_without_damper", -- Number of air duct fireplaces without a damper [e.g., 0 – 10]
"dhw_solar_equipment_system", -- Presence of installed DHW solar equipment system(s) during on-site visit [e.g., No – Yes]
"average_ceiling_height", -- The average height of the ceiling within the building measure in feet [e.g., 8 – 12]
"roof_color", -- Color of the roof located onsite [e.g., Dark – Medium]
"dhw_solar_equipment_system_collector_technology", -- Type of technology used as DHW solar equipment system collector [Flat Plate – Not Applicable]
"capacity_of_dhw_solar_equipment_system_measured_in_gallons", -- Measured capacity of installed DHW solar equipment system [e.g., 0-30]
"occupied_conditioned_space_in_summer", -- Percentage of occupied conditioned space in summer [e.g., 100 – Not Recorded]
"total_units_per_building", -- Total number of units per each building located onsite [e.g., 1 – 700]
"is_the_dhw_solar_equipment_system_in_working_order", -- Installed DHW solar equipment system in working condition [e.g., Not Applicable – Yes]
"percent_conditioned_common_area", -- Percentage of conditioned space located in the common area [e.g., 0 – 100]
"dhw_solar_equipment_system_type", -- Type of DHW solar equipment system installed at time of on-site visit [e.g., Closed – Not Applicable]
"connection_boundary_wall", -- Quality of connection between building envelope and boundary wall [e.g., N/A Not Attached – Well Sealed]
"connection_duct", -- Quality of connection between building envelope and duct [e.g., N/A Not Attached – No Ducts in Garage]
"no_bedrooms", -- Total number of bedrooms located onsite [e.g., 1 – 1320]
"climate_zone", -- IECC climate Zone [e.g., Climate Zone 4-6]; see https://energycode.pnl.gov/EnergyCodeReqs/?state=New%20York for a map and details for each zone
"site_id", -- Anonymous identifier applied to the multifamily site for the on-site multifamily inspection
"building_unit_laundry", -- Presence of individual unit laundry or common area building laundry [e.g., Both – Unit]
"connection_ceiling", -- Quality of connection between building envelope and ceiling [e.g., N/A Not Attached – Well Sealed]
"occupied_conditioned_space_in_winter", -- Percentage of occupied conditioned space in winter [e.g., 100 – Not recorded]
"shielding_of_home", -- Type of home shielding for the onsite building [e.g., Exposed – Well Shielded]
"siding_condition", -- Condition of the onsite building siding [e.g., Fair – Poor]
"garage_finished_interior", -- Presence of a finished interior in the garage [e.g., Concrete – Unfinished Wall]
"basement_type", -- Type of basement in located onsite [e.g., Conditioned Basement – Unconditioned Basement]
"participant_type", -- The title of the person who aided the on-site inspector with their inspection [e.g. Manager/Other-Owner]
"air_duct_piping_insulation_thickness_inches", -- The thickness of the insulation used on the air ducts, measured in inches [e.g. 0-3]
"air_duct_piping_system_served", -- The type of system served by the air duct piping installed on-site [e.g., Cooling – No Distribution Present]
"building_home_type", -- Type of multifamily building [e.g., 2-4 Unit Bldg – Multifamily (5+)]
"space_heating_type", -- Type of space heating system located onsite [e.g., District Steam – Natural Gas Water Heaters which convert the heat to forced hot air]
"no_occupants", -- Total number of occupants located onsite [e.g., 1 – 1700]
"style_of_home", -- Type or style of the onsite building [e.g., 2 story Multifamily garden apartments – Townhouse]
"gas_provider", -- Gas utility provider for the multifamily building(s) [e.g., ConEd-UGI]
"electric_provider", -- Electric utility [e.g., Central Hudson-RG&E]
"construction_type", -- Flag indicating construction type [e.g., Existing-New]
"county", -- County in New York where the multifamily building is located [e.g., Albany-Yates]
"multifamily_owner_manager_mom_survey_id", -- Anonymous unique identifier that links the on-site inspection data to the telephone survey with multifamily owners and managers. [0-400]
"siding_material", -- Type of material used for the siding [e.g., Asbestos – Wood]
"roof_condition", -- Condition of the onsite roof [e.g., Covered in Snow – Poor]
"region", -- Region in New York [e.g., Capital District-Western New York]
"air_duct_fireplace_with_damper", -- Number of air duct fireplaces with a damper [e.g., 0 – 10]
"significant_structural_components", -- Significant structural components located within the building [e.g., Not Recorded – Vertical Chases]
"building_unit_ventilation", -- Presence of individual unit ventilation or building wide ventilation system [e.g., Both – Unit]
"energy_management_systems_description", -- Description of energy management system from onsite visit [e.g., Boiler controlled by outside temperature sensors – Tekmar Mixing Controls]
"conditioned_floor_space_measured_sqft", -- Square feet of conditioned floor space located onsite [e.g., 1 – 1425200]
"total_units_on_site", -- Total number of units from all buildings onsite [e.g., 5 – 1100]
"no_studio_apartments", -- Total number of studio apartments identified during on-site visit [e.g., 0-300]
"percent_of_total_air_duct_piping_for_distribution_located_in_un", -- Percent of air duct piping which is insulated in an unconditioned section of the site [e.g. 0-100]
"no_1_bedroom_apartments", -- Total number of one bedroom apartments identified during on-site visit [e.g., 0- 600]
"no_2_bedroom_apartments", -- Total number of two bedroom apartments identified during on-site visit [e.g., 0- 600]
"total_buildings_on_site", -- Total number of building from all building located onsite [e.g., 1 – 100]
"multifamily_building_type", -- Type of multifamily building [e.g., High Rise – Low Rise (3 stories or less)]
"percent_unit_vacancy", -- Percentage of vacant units located onsite [e.g., 0 – 100]
"no_of_floors_in_building", -- Total number of floors in the multifamily building [e.g., 1 – 40]
"common_area_amenities", -- Amenities located in the common area of the building [e.g., Clubhouses, Pool Area, Health Club, Tennis Courts - Tennis Courts, Pool Area]
"building_unit_heating", -- Presence of individual unit heating system of building wide heating system [e.g., Both – Unit]
"no_4_bedroom_apartments", -- Total number of four bedroom apartments identified during on-site visit [e.g., 0-20]
"building_unit_cooling", -- Presence of individual unit cooling system or building wide cooling system [e.g., Both – Unit]
"no_5_bedroom_apartments", -- Total number of five bedroom apartment identified during onsite visit [e.g., 0-5]
"water_heating_type", -- Type of water heating system located onsite [e.g., District Steam Storage/ Tank Water Heater – Natural Gas Tank (instant)]
"building_garage_type", -- Type of garage located onsite [e.g., Attached – Free Standing]
"window_shading", -- Presence of window shading [e.g., Half of Windows Shaded – Windows Partially Shaded]
"energy_management_systems", -- Presence of energy management systems during onsite visit [e.g., No – Yes]
"no_stories_above_grade", -- Onsite building number of stories above grade [e.g., 0 – 200]
"year_built", -- Year that building was built [e.g., 1836 – 2014]
"no_3_bedroom_apartments", -- Total number of three bedroom apartments identified during on-site visit [e.g., 0- 30]
"siding_color", -- Color of the onsite building siding [e.g., Dark – Medium]
"roof_material" -- Onsite building roof material [e.g., Asphalt – Tile]
FROM
"ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest"."rsbs_mom_multifamily_onsite_inspections_site_level"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest
This will download all the objects for the latest
tag of ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, ny-gov/rsbs-mom-multifamily-onsite-inspections-site-level-6vh9-pjsr
is just another Postgres schema.