Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the small_area_profile_neighborhood_level_2023
table in this repository, by referencing it like:
"sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest"."small_area_profile_neighborhood_level_2023"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"number_of_tobacco_retail", -- Number of tobacco retail outlets per square mile
"average_distance_miles_to_1", -- Average distance (in miles) to the nearest park or open space
"influenza_and_pneumonia_deaths", -- Influenza and pneumonia death rate per 100,000 people
"homicide_deaths_per_100_000", -- Homicide rate per 100,000 people
"drug_overdose_deaths_per", -- Drug overdose death rate per 100,000 people
"chronic_lower_respiratory", -- Chronic lower respiratory disease death rate per 100,000 people
"diabetes_deaths_per_100_000", -- Diabetes death rate per 100,000 people
"stroke_deaths_per_100_000", -- Stroke death rate per 100,000 people
"public_transportation", -- Percent of population ages 16 years and older who used public transit to commute to work
"bicycle", -- Percent of population ages 16 years and older who rode a bicycle to work
"walk", -- Percent of population ages 16 years and older who walked to work
"work_from_home", -- Percent of population ages 16 years and older who worked from home
"travel_time_to_work_over", -- Percent of population ages 16 years and older whose work commute time is more than 30 minutes
"outdoor_occupation_workers", -- Percent of population ages 16 years and older who work in outdoor occupations
"households_receiving_calfresh", -- Percent of households which received Cal-Fresh benefits in the past 12 months
"vacant_housing_units", -- Count of vacant housing units
"multi_unit_housing", -- Percent of housing units that are multiunit structures
"heart_disease_deaths_per", -- Heart disease death rate per 100,000 people
"cancer_deaths_per_100_000", -- Cancer death rate per 100,000 people
"all_cause_deaths_per_100", -- All-cause death rate per 100,000 people
"renter_households", -- Percent of renter occupied households
"teen_live_births_per_1_000", -- Teen live birth rate per 1,000 females ages 15 to 19 years
"households_with_gross_rent", -- Percent of renter-occupied households with gross rent accounting for 30% or more of household income
"mothers_who_received_early", -- Percent of mothers who received early and adequate prenatal care during pregnancy
"overweight_or_obese_in_1st", -- Percent of mothers who were overweight or obese in the 1st trimester of pregnancy
"home_ownership_rate", -- Home ownership rate
"preterm_births", -- Preterm birth rate per 100 live births
"homeowner_cost_burden", -- Percent of owner-occupied households with monthly expenses accounting for 30% or more of household income
"housing_units_build_before", -- Percent of housing units built before 1980
"low_birth_weight_infants", -- Low birth weight rate per 100 live births
"housing_units_heated_by_gas", -- Percent of housing units that were heated by gas fuel source
"births_per_1_000_people", -- Birth rate per 1,000 people
"covid_booster_vaccination", -- COVID-19 booster vaccination rate
"covid_complete_vaccination", -- COVID-19 complete vaccination rate
"covid_test_rate_per_100_000", -- COVID-19 test rate per 100,000 people
"_5_yr_average_of_early", -- Early syphilis new diagnosis rate per 100,000 people (5-year average)
"gonorrhea_new_diagnosis_per", -- Gonorrhea new diagnosis rate per 100,000 people
"chlamydia_new_diagnosis_per", -- Chlamydia new diagnosis rate per 100,000 people
"carpooled", -- Percent of population ages 16 years and older who carpooled to work
"average_distance_miles_to", -- Average distance (in miles) to the nearest farmers' market
"housing_units_without_complete", -- Percent of housing units without complete plumbing facilities
"covid_case_rate_per_100_000", -- COVID-19 case rate per 100,000 people
"housing_units_without_internet", -- Percent of housing units without internet access
"alzheimer_s_disease_deaths", -- Alzheimer's disease death rate per 100,000 people
"_5_yr_average_of_hiv_new", -- HIV new diagnosis rate per 100,000 people (5-year average)
"work_impairment_18", -- Percent of adults who reported emotions interfered with work performance in the past 12 months
"visited_dentist_18", -- Percent of adults who reported visiting a dentist in the past year
"drove_alone", -- Percent of population ages 16 years and older who drove alone to work
"uninsured_dental_insurance", -- Percent of adults who reported not having dental insurance
"population_that_resides_within", -- Percent of population that resides within 1/2 mile of transit stop with a headway of 15 minutes or less during peak commute hours
"household_with_vehicle_access", -- Percent of households with vehicle access
"teeth_in_fair_poor_condition", -- Percent of adults who reported teeth in fair/poor health
"number_of_motor_vehicle", -- Average annual count of motor vehicle collisions
"sugar_drink_consumption_1", -- Percent of adults who reported drinking a soda or sugar-sweetened beverage at least one time in a day
"number_of_vehicle_bicycle", -- Average annual count of vehicle-bicycle injury collisions
"serious_psychological_distress", -- Percent of adults who reported serious psychological distress in the past 12 months (using Kessler 6 series and a score of more than or equal to 13)
"number_of_vehicle_pedestrian", -- Average annual count of vehicle-pedestrian injury collisions
"college_graduate_or_higher", -- Percent of adults ages 25 years and older with bachelor's degree or higher education attainment
"some_college_or_associate", -- Percent of adults ages 25 years and older with some college education attainment or an associate's degree
"secondhand_smoke_18", -- Percent of adults who reported being exposed to secondhand smoke or e-cigarette vapor in the past 2 weeks
"high_school_graduate", -- Percent of adults ages 25 years and older with a high school diploma
"obese_bmi_30_18", -- Percent of adults who were obese
"less_than_high_school", -- Percent of adults ages 25 years and older with less than high school education attainment
"needed_help_for_mental_health", -- Percent of adults who reported needing help for emotional/mental or alcohol/drug problem in the past 12 months
"flavored_tobacco_18", -- Percent of adults who reported using a flavored tobacco product in the past 30 days
"fair_or_poor_health_65", -- Percent of adults ages 65 years and older with fair or poor health
"fair_or_poor_health_18_64", -- Percent of adults ages 18-64 years with fair or poor health
"e_cigarette_user_18", -- Percent of adults who reported using an e-cigarette in the past 30 days
"high_school_enrollment_15", -- Percent of children ages 15-17 years enrolled in a high school
"children_ages_3_4_enrolled", -- Percent of children ages 3-4 years enrolled in a preschool or a nursery school
"total_population_200_fpl", -- Percent of total population living below 200% of Federal Poverty Level
"children_ages_0_17_below", -- Percent of children ages less than 18 years living below 200% of Federal Poverty Level
"families_below_200_fpl", -- Percent of families living below 200% of Federal Poverty Level
"households_without_full_time", -- Percent of households without a full-time year-around employment
"employed_25_64_years", -- Percent of civilian population ages 25-64 with employment
"children_living_in_households", -- Percent of children living in households that received assistance in the past 12 months
"median_household_income", -- Median household income
"per_capital_income", -- Per Capita Income
"u_s_citizenship", -- Percent of foreign-born population with U.S. citizenship
"elderly_living_alone", -- Percent of population 65 years and older living alone
"households_with_children", -- Percent of households with children under 18 years of age
"delayed_prescriptions_medical", -- Percent of adults who reported delaying or not getting needed prescription drugs or medical services in the past 12 months
"single_parent_households", -- Percent of households with a single parent
"speaks_a_language_other_than", -- Percent of population ages 5 and older who speaks a language other than English at home
"current_smoker_18", -- Percent of adults who are current smokers
"foreign_born", -- Percent of foreign-born population
"cognitive_disability", -- Percent of population living with a cognitive disability
"people_living_in_group_quarter", -- Count of population living in group quarters
"ambulatory_disability", -- Percent of population living with an ambulatory disability
"bipoc_black_indigenous_and", -- Percent of Black, Indigenous and People of Color population
"latino", -- Percent of Latino population
"african_american", -- Percent of African American, non-Hispanic population
"white", -- Percent of White, non-Hispanic population
"ages_65_and_older", -- Percent of population ages 65years and older
"_55_64_years", -- Percent of population ages 55 to 64 years
"_35_44_years", -- Percent of population ages 35 to 44 years
"number_of_alcohol_retail", -- Number of alcohol retail outlets per square mile
"_25_34_years", -- Percent of population ages 25 to 34 years
"_18_24_years", -- Percent of population ages 18 to 24 years
"_10_14_years", -- Percent of population ages 10 to 14 years
"_5_9_years", -- Percent of population ages 5 to 9 years
"neighborhood", -- Geography name
"speaks_english_less_than", -- Percent of population ages 5 and older who speak English 'less than well'
"hypertension_deaths_per_100", -- Hypertension (high blood pressure) death rate per 100,000 people
"ever_diagnosed_with_diabetes", -- Percent of adults who reported ever diagnosed with diabetes by a healthcare professional
"suicide_deaths_per_100_000", -- Suicide rate per 100,000 people
"male", -- Percent of male population
"overcrowded_households", -- Percent of overcrowded households
"_5_yr_average_of_tb_new", -- Tuberculosis (TB) new diagnosis rate per 100,000 people (5-year average)
"liguistically_isolated", -- Percent of linguistically isolated households
"_15_17_years", -- Percent of population ages 15 to 17 years
"_45_54_years", -- Percent of population ages 45 to 54 years
"average_household_size", -- Average household size
"other_means", -- Percent of population ages 16 years and older who used some other means to commute to work
"firearm_deaths_per_100_000", -- Firearm death rate per 100,000 people
"_0_4_years", -- Percent of population ages less than 5 years
"unintentional_injury_deaths", -- Unintentional injury death rate per 100,000 people
"population_size", -- Count of total population
"healthcare_coverage_age_limit", -- Percent of population ages 19 to 64 years with healthcare insurance
"housing_units", -- Count of housing units
"female", -- Percent of female population
"pacific_islander", -- Percent of Native Hawaiian and Pacific Islander, non-Hispanic population
"ever_diagnosed_with_heart", -- Percent of adults who reported ever diagnosed with a heart disease by a healthcare professional
"family_life_impairment_during", -- Percent of adults who reported emotions interfered relationship with friends and family in the past 12 months
"asian" -- Percent of Asian, non-Hispanic population
FROM
"sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest"."small_area_profile_neighborhood_level_2023"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest
This will download all the objects for the latest
tag of sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, sccgov/small-area-profile-neighborhood-level-2023-mykk-z8mk
is just another Postgres schema.