Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the census_zip_codes_in_colorado_2018
table in this repository, by referencing it like:
"colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest"."census_zip_codes_in_colorado_2018"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"the_geom", -- GeoJSON field describing the boundaries of the area
"zip_code",
"pop", -- Population Estimate for the given time range
"hispanic", -- Estimate for the Hispanic Population
"white_nh", -- Estimate for the White, Non-Hispanic Population
"black_nh", -- Estimate for the Black, Non-Hispanic Population
"ntvam_nh", -- Estimate for the Native American, Non-Hispanic Population
"asian_nh", -- Estimate for the Asian, Non-Hispanic Population
"hawpi_nh", -- Estimate for the Hawaiian and Pacific Islander, Non-Hispanic Population
"other_nh", -- Estimate for the population of those identifying with any other race (Non-Hispanic)
"twoplus_nh", -- Estimate for the population who identifies with more than one race (Non-Hispanic)
"male", -- Estimate for the total male population
"female", -- Estimate for the total female population
"ageless5", -- Estimate for the total population less than 5 years of age (0 to 5)
"age5_9", -- Estimate for the total population from 5 to 9 years of age
"age10_14", -- Estimate for the total population from 10 to 14 years of age
"age15_19", -- Estimate for the total population from 15 to 19 years of age
"age20_24", -- Estimate for the total population from 20 to 24 years of age
"age25_29", -- Estimate for the total population from 25 to 29 years of age
"age30_34", -- Estimate for the total population from 30 to 34 years of age
"age35_39", -- Estimate for the total population from 35 to 39 years of age
"age40_44", -- Estimate for the total population from 40 to 45 years of age
"age45_49", -- Estimate for the total population from 45 to 49 years of age
"age50_54", -- Estimate for the total population from 50 to 54 years of age
"age55_59", -- Estimate for the total population from 55 to 59 years of age
"age60_64", -- Estimate for the total population from 60 to 64 years of age
"age65_69", -- Estimate for the total population from 65 to 69 years of age
"age70_74", -- Estimate for the total population from 70 to 74 years of age
"age75_79", -- Estimate for the total population from 75 to 79 years of age
"age80_84", -- Estimate for the total population from 80 to 84 years of age
"age85pl", -- Estimate for the total population of 85 years of age or older
"ageless18", -- Estimate for the total population less than 18 years of age (0 to 17)
"age18_24", -- Estimate for the total population from 18 to 24 years of age
"med_age", -- Median age of the population for the area
"households", -- Estimate for the total number of household
"familyhh", -- Estimate for the total number of family household
"nonfamhh", -- Estimate for the total number of nonfamily household
"hhldralone", -- Estimate for the total number of nonfamily households where householder lives alone
"hhldr_naln", -- Estimate for the total number of nonfamily households where householder does not live alone
"housing_un", -- Estimated total number of housing units
"occ_hu", -- Estimated number of occupied housing units
"vac_hu", -- Estimated number of vacant housing units
"owned", -- Estimated number of owner-occupied housing units
"rented", -- Estimated number of renter-occupied housing units
"pop25plus", -- Estimated total population 25 years or older (universe for educational attainment)
"nohsdipl", -- Estimated population 25 years or older that does not have a high school diploma or equivalent (GED) level of education
"hsgrad_sc", -- Estimated population 25 years or older that has a high school level of education (or equivalent) or higher education (some college) but did not achieve a bachelors degree or higher
"bachl_hghr", -- Estimated total population 25 years or older with a Bachelors Degree, Masters Degree, Phd or other Professional Degree
"med_hh_inc", -- Estimated median household income
"med_fam_in", -- Estimated median family income
"per_cap_in", -- Estimated per capita income
"med_yr_blt", -- median year structure built
"med_c_rent", -- Estimated median contract rent
"med_g_rent", -- Estimated median gross rent
"med_hm_val", -- Estimated median home value
"citz_birth", -- Estimated total population that is a US Citizen by birth
"citz_nat", -- Estimated total population that is a US Citizen by Naturalization
"not_citz", -- Estimated total population that is not a US Citizen
"born_in_co", -- Estimated total population that was born in Colorado
"brn_oth_st", -- Estimated total population that was born in a state other than Colorado
"ntv_b_o_us", -- Estimated total population that was born outside of the US (In a US Territory Abroad)
"foreign_b", -- Estimated total population that was born in a foreign country
"pop_1p", -- Estimated total population 1 year old or more
"same_house", -- Estimated total population living in the same house as they did 1 year ago
"same_cnty", -- Estimated total population that moved within the same county
"same_state", -- Estimated total population that moved to a different county within the same state (Colorado)
"diff_state", -- Estimated total population living in a state other than Colorado 1 year ago
"frm_abroad", -- Estimated total population living abroad 1 year ago
"wrkrs_16pl", -- Estimated total population age 16 years or older who worked in the past 12 months
"car_all", -- Estimated total working population 16+ who traveled to work by car
"car_alone", -- Estimated total working population 16+ who traveled to work by car alone
"car_carpoo", -- Estimated total working population 16+ who traveled to work by car with others
"public_trn", -- Estimated total working population 16+ who took public transportation to work
"pt_bus", -- Estimated total working population 16+ who rode a bus to work
"pt_other", -- Estimated total working population 16+ who used any other form of public transportation (other than a bus) to work
"bike", -- Estimated total working population 16+ who rode a bike to work
"walk", -- Estimated total working population 16+ who walked to work
"tr_other", -- Estimated total working population 16+ who used any other form of transportation not mentioned previously
"wrk_home", -- Estimated total working population 16+ who worked from home
"w_16pl_nh", -- Estimated total working population 16+ who commuted to work
"t_less_10", -- Estimated total working population 16+ whose commute time was less than 10 minutes
"t_10_19", -- Estimated total working population 16+ whose commute time was from 10 to 19 minutes
"t_20_29", -- Estimated total working population 16+ whose commute time was from 20 to 29 minutes
"t_30_39", -- Estimated total working population 16+ whose commute time was from 30 to 39 minutes
"t_40_59", -- Estimated total working population 16+ whose commute time was from 40 to 59 minutes
"t_60_pl", -- Estimated total working population 16+ whose commute time was 60 minutes or higher
"pop_3pl", -- Estimated total population 3 years old or higher. (Universe for school enrollment)
"enrolled", -- Estimated total population 3+ who are enrolled in school
"preschool", -- Estimated total population 3+ enrolled in preschool
"kndrgrtn", -- Estimated total population enrolled in kindergarten
"gr_1_4", -- Estimated total population enrolled in grades 1-4
"gr_5_8", -- Estimated total population enrolled in grades 5-8
"gr_9_12", -- Estimated total population enrolled in grades 9-12
"undergrad", -- Estimated total population enrolled in undergraduate studies
"grad_prof", -- Estimated total population enrolled in graduate studies
"n_enrolled", -- Population 3 yrs and older not enrolled in school
"blt_2010_p", -- Estimated number of housing units built since 2010
"b2000_2009", -- Estimated number of housing units built from 2000 to 2009
"b1990_1999", -- Estimated number of housing units built from 1990 to 1999
"b1980_1989", -- Estimated number of housing units built from 1980 to 1989
"b1970_1979", -- Estimated number of housing units built from 1970 to 1979
"b1960_1969", -- Estimated number of housing units built from 1960 to 1969
"b1950_1959", -- Estimated number of housing units built from 1950 to 1959
"b1940_1949", -- Estimated number of housing units built from 1940 to 1949
"b1939_e", -- Estimated number of housing units built in 1939 or earlier
"own_occ_hu", -- Estimated number of owner occupied housing units
"v_l_50k", -- Estimated number of owner occupied housing units where the home value is less than $50,000
"v50k_100k", -- Estimated number of owner occupied housing units where the home value is from $50,000 to $99,999
"v100k_150k", -- Estimated number of owner occupied housing units where the home value is from $100,000 to $149,999
"v150k_200k", -- Estimated number of owner occupied housing units where the home value is from $150,000 to $199,999
"v200k_250k", -- Estimated number of owner occupied housing units where the home value is from $200,000 to $249,999
"v250k_300k", -- Estimated number of owner occupied housing units where the home value is from $250,000 to $299,999
"v300k_400k", -- Estimated number of owner occupied housing units where the home value is from $300,000 to $399,999
"v400k_500k", -- Estimated number of owner occupied housing units where the home value is from $400,000 to $499,999
"v500k_750k", -- Estimated number of owner occupied housing units where the home value is from $500,000 to $749,999
"v750k_1m", -- Estimated number of owner occupied housing units where the home value is from $750,000 to $999,999
"v_1m_plus", -- Estimated number of owner occupied housing units where the home value is $1,000,000 or higher
"rnt_occ_hu", -- Estimated number of renter occupied housing units
"rntl400", -- Estimated number of renter occupied housing units where the gross rent is less than $400
"r400t599", -- Estimated number of renter occupied housing units where the gross rent is from $400 to $599
"r600t799", -- Estimated number of renter occupied housing units where the gross rent is from $600 to $799
"r800t999", -- Estimated number of renter occupied housing units where the gross rent is from $800 to $999
"r1000t1249", -- Estimated number of renter occupied housing units where the gross rent is from $1,000 to $1,249
"r1250t1499", -- Estimated number of renter occupied housing units where the gross rent is from $1,250 to $1,499
"r1500t1999", -- Estimated number of renter occupied housing units where the gross rent is from $1,500 to $1,999
"r2000pl", -- Estimated number of renter occupied housing units where the gross rent is greater than $2000
"rnocshr", -- Estimated number of renter occupied housing units where the householder pays no cash rent
"ps_uni", -- Estimated total population for whom poverty status is determined
"ps_below", -- Estimated population for whom poverty status is below the poverty level
"ps_bel150", -- Estimated population for whom poverty status is below 150% the poverty level
"tot_l18", -- Estimated population who is under 18 years of age for whom poverty status is determined. [B17001_048
"puni_65pl", -- Estimated population who is over 65 years of age for whom poverty status is determined
"pov_l18", -- Estimated population who is under 18 years of age where the status is below the poverty line
"pov_65pl", -- Estimated population who is over 65 years of age where the status is below the poverty line
"avghhsize", -- Estimated average household size of occupied housing units
"hhi_l20k", -- Estimated number of households whose income in the past 12 months is less than $20,000
"hhi20_30", -- Estimated number of households whose income in the past 12 months is from $20,000 and $29,999
"hhi30_40", -- Estimated number of households whose income in the past 12 months is from $30,000 and $39,999
"hhi40_50", -- Estimated number of households whose income in the past 12 months is between $40,000 and $49,999
"hhi50_60", -- Estimated number of households whose income in the past 12 months is between $50,000 and $59,999
"hhi60_75", -- Estimated number of households whose income in the past 12 months is between $60,000 and $74,999
"hhi75_100", -- Estimated number of households whose income in the past 12 months is between $75,000 and $99,999
"hhi100_125", -- Estimated number of households whose income in the past 12 months is between $100,000 and $124,999
"hhi125_150", -- Estimated number of households whose income in the past 12 months is between $125,000 and $149,999
"hhi150_200", -- Estimated number of households whose income in the past 12 months is between $150,000 and $199,999
"hhi200_pl", -- Estimated number of households whose income in the past 12 months is $200,000 or more
"civ_ni_pop", -- Estimated total population of civilians not institutionalized (disabled and not disabled)
"disabled", -- Estimated population of civilian non-institutionalized that are disabled
"pop16_pls", -- Estimated population 16 years and over
"laborforce", -- Estimated number of people in the labor force
"civ_lf", -- Estimated number of people in the civilian labor force
"emp", -- Estimated number of people in the civilian labor force that are employed
"unemp", -- Estimated number of people in the civilian labor force that are not employed
"armedfrcs", -- Estimated number of people in the armed forces
"not_lf", -- Estimated number of people not in the labor force
"civ_ni_p" -- Estimated total civilian non-institutionalized population
FROM
"colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest"."census_zip_codes_in_colorado_2018"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest
This will download all the objects for the latest
tag of colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, colorado-gov/census-zip-codes-in-colorado-2018-iuxm-ddzz
is just another Postgres schema.