Query the Data Delivery Network
Query the DDNThe easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432
and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
For example, you can query the census_core_based_statistical_area_in_colorado
table in this repository, by referencing it like:
"colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest"."census_core_based_statistical_area_in_colorado"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"the_geom", -- GeoJSON field describing the boundaries of the area
"objectid", -- The increasing digit unique id for the designated area
"csafp",
"cbsafp",
"geoid", -- Comma delimited list (no quotes, leading zeros are necessary) of geoid examples: '08' is Colorado, '08031' is Denver County , '08031000701' is a Census Tract in Denver County, and '080010078011' is a block group in Adams County, and '0668154' is the city of San Luis Obispo, CA. You can also use the integer format for geoid called geonum by prefixing the geoid with ‘1’. San Luis Obispo would then be 10668154.
"name", -- The name of the region; could be a shortened ID for the designated area (eg 9638) or the county/subdivision name (eg Somerset)
"namelsad", -- The name of the designated area (eg Block Group 3)
"lsad",
"memi",
"mtfcc", -- The MAF/TIGER Feature Class Code (MTFCC) is a 5-digit code assigned by the Census Bureau intended to classify and describe geographic objects or features. The lookup table can be found at the link: https://www.census.gov/geo/reference/mtfcc.html
"aland", -- The amount of land in the designated area, in ft2.
"awater", -- The amount of water in the designated area, in ft2.
"intptlat", -- The center of the latitude of the designated area.
"intptlon", -- The center of the longitude of the designated area.
"shape_leng", -- The shape perimeter length, in miles
"shape_area", -- The shape area, in square miles
"logrecno", -- The Logical Record Number, a unique identifier
"nameacs", -- The full name of the designated area (eg Census Tract 9638, Gunnison County, Colorado)
"cbsa", -- The code for the designated area (eg 22660)
"geoid_1", -- Comma delimited list (no quotes, leading zeros are necessary) of geoid examples: '08' is Colorado, '08031' is Denver County , '08031000701' is a Census Tract in Denver County, and '080010078011' is a block group in Adams County, and '0668154' is the city of San Luis Obispo, CA. You can also use the integer format for geoid called geonum by prefixing the geoid with ‘1’. San Luis Obispo would then be 10668154.
"pop0711", -- The number of people living in the designated area for the given year.
"hispanic", -- Estimate for the Hispanic Population
"white_nh", -- Estimate for the White, Non-Hispanic Population
"black_nh", -- Estimate for the Black, Non-Hispanic Population
"ntvam_nh", -- Estimate for the Native American, Non-Hispanic Population
"asian_nh", -- Estimate for the Asian, Non-Hispanic Population
"hawpi_nh", -- Estimate for the Hawaiian and Pacific Islander, Non-Hispanic Population
"other_nh", -- Estimate for the population of those identifying with any other race (Non-Hispanic)
"twoplus_nh", -- Estimate for the population who identifies with more than one race (Non-Hispanic)
"male", -- Estimate for the total male population
"female", -- Estimate for the total female population
"ageless10", -- The number of people younger than 10
"ageless18", -- Estimate for the total population less than 18 years of age (0 to 17)
"age18_24", -- Estimate for the total population from 18 to 24 years of age
"age25_34", -- Estimate for the total population from 25 to 34 years of age
"age35_44", -- Estimate for the total population from 35 to 44 years of age
"age45_64", -- Estimate for the total population from 45 to 64 years of age
"age65plus", -- Estimate for the total population 65 years of age or older
"med_age", -- Median age of the population for the area
"households", -- Estimate for the total number of household
"familyhh", -- Estimate for the total number of family household
"nonfamhh", -- Estimate for the total number of nonfamily household
"hhldralone", -- Estimate for the total number of nonfamily households where householder lives alone
"hhldr_naln", -- Estimate for the total number of nonfamily households where householder does not live alone
"housing_un", -- Estimated total number of housing units
"occ_hu", -- Estimated number of occupied housing units
"vac_hu", -- Estimated number of vacant housing units
"owned", -- Estimated number of owner-occupied housing units
"rented", -- Estimated number of renter-occupied housing units
"pop25plus", -- Estimated total population 25 years or older (universe for educational attainment)
"nohsdipl", -- Estimated population 25 years or older that does not have a high school diploma or equivalent (GED) level of education
"hsgrad_sc", -- Estimated population 25 years or older that has a high school level of education (or equivalent) or higher education (some college) but did not achieve a bachelors degree or higher
"bachl_hghr", -- Estimated total population 25 years or older with a Bachelors Degree, Masters Degree, Phd or other Professional Degree
"med_hh_inc", -- Estimated median household income
"med_fam_in", -- Estimated median family income
"per_cap_in", -- Estimated per capita income
"med_yr_blt", -- median year structure built
"med_c_rent", -- Estimated median contract rent
"med_g_rent", -- Estimated median gross rent
"med_hm_val", -- Estimated median home value
"citz_birth", -- Estimated total population that is a US Citizen by birth
"citz_nat", -- Estimated total population that is a US Citizen by Naturalization
"not_citz", -- Estimated total population that is not a US Citizen
"born_in_co", -- Estimated total population that was born in Colorado
"brn_oth_st", -- Estimated total population that was born in a state other than Colorado
"ntv_b_o_us", -- Estimated total population that was born outside of the US (In a US Territory Abroad)
"foreign_b", -- Estimated total population that was born in a foreign country
"pop_1p", -- Estimated total population 1 year old or more
"same_house", -- Estimated total population living in the same house as they did 1 year ago
"same_cnty", -- Estimated total population that moved within the same county
"same_state", -- Estimated total population that moved to a different county within the same state (Colorado)
"diff_state", -- Estimated total population living in a state other than Colorado 1 year ago
"frm_abroad", -- Estimated total population living abroad 1 year ago
"wrkrs_16pl", -- Estimated total population age 16 years or older who worked in the past 12 months
"car_all", -- Estimated total working population 16+ who traveled to work by car
"car_alone", -- Estimated total working population 16+ who traveled to work by car alone
"car_carpoo", -- Estimated total working population 16+ who traveled to work by car with others
"public_trn", -- Estimated total working population 16+ who took public transportation to work
"pt_bus", -- Estimated total working population 16+ who rode a bus to work
"pt_other", -- Estimated total working population 16+ who used any other form of public transportation (other than a bus) to work
"bike", -- Estimated total working population 16+ who rode a bike to work
"walk", -- Estimated total working population 16+ who walked to work
"tr_other", -- Estimated total working population 16+ who used any other form of transportation not mentioned previously
"wrk_home", -- Estimated total working population 16+ who worked from home
"w_16pl_nh", -- Estimated total working population 16+ who commuted to work
"t_less_10", -- Estimated total working population 16+ whose commute time was less than 10 minutes
"t_10_19", -- Estimated total working population 16+ whose commute time was from 10 to 19 minutes
"t_20_29", -- Estimated total working population 16+ whose commute time was from 20 to 29 minutes
"t_30_39", -- Estimated total working population 16+ whose commute time was from 30 to 39 minutes
"t_40_59", -- Estimated total working population 16+ whose commute time was from 40 to 59 minutes
"t_60_pl", -- Estimated total working population 16+ whose commute time was 60 minutes or higher
"pop_3pl", -- Estimated total population 3 years old or higher. (Universe for school enrollment)
"enrolled", -- Estimated total population 3+ who are enrolled in school
"preschool", -- Estimated total population 3+ enrolled in preschool
"kndrgrtn", -- Estimated total population enrolled in kindergarten
"gr_1_4", -- Estimated total population enrolled in grades 1-4
"gr_5_8", -- Estimated total population enrolled in grades 5-8
"gr_9_12", -- Estimated total population enrolled in grades 9-12
"undergrad", -- Estimated total population enrolled in undergraduate studies
"grad_prof", -- Estimated total population enrolled in graduate studies
"n_enrolled", -- Population 3 yrs and older not enrolled in school
"blt_2000_p", -- Estimated number of housing units built since 2000
"b1990_1999", -- Estimated number of housing units built from 1990 to 1999
"b1980_1989", -- Estimated number of housing units built from 1980 to 1989
"b1970_1979", -- Estimated number of housing units built from 1970 to 1979
"b1960_1969", -- Estimated number of housing units built from 1960 to 1969
"b1950_1959", -- Estimated number of housing units built from 1950 to 1959
"b1940_1949", -- Estimated number of housing units built from 1940 to 1949
"b1939_e", -- Estimated number of housing units built in 1939 or earlier
"own_occ_hu", -- Estimated number of owner occupied housing units
"v_l_50k", -- Estimated number of owner occupied housing units where the home value is less than $50,000
"v50k_100k", -- Estimated number of owner occupied housing units where the home value is from $50,000 to $99,999
"v100k_150k", -- Estimated number of owner occupied housing units where the home value is from $100,000 to $149,999
"v150k_200k", -- Estimated number of owner occupied housing units where the home value is from $150,000 to $199,999
"v200k_250k", -- Estimated number of owner occupied housing units where the home value is from $200,000 to $249,999
"v250k_300k", -- Estimated number of owner occupied housing units where the home value is from $250,000 to $299,999
"v300k_400k", -- Estimated number of owner occupied housing units where the home value is from $300,000 to $399,999
"v400k_500k", -- Estimated number of owner occupied housing units where the home value is from $400,000 to $499,999
"v500k_750k", -- Estimated number of owner occupied housing units where the home value is from $500,000 to $749,999
"v750k_1m", -- Estimated number of owner occupied housing units where the home value is from $750,000 to $999,999
"v_1m_plus", -- Estimated number of owner occupied housing units where the home value is $1,000,000 or higher
"rnt_occ_hu", -- Estimated number of renter occupied housing units
"rntl400", -- Estimated number of renter occupied housing units where the gross rent is less than $400
"r400t599", -- Estimated number of renter occupied housing units where the gross rent is from $400 to $599
"r600t799", -- Estimated number of renter occupied housing units where the gross rent is from $600 to $799
"r800t999", -- Estimated number of renter occupied housing units where the gross rent is from $800 to $999
"r1000t1249", -- Estimated number of renter occupied housing units where the gross rent is from $1,000 to $1,249
"r1250t1499", -- Estimated number of renter occupied housing units where the gross rent is from $1,250 to $1,499
"r1500t1999", -- Estimated number of renter occupied housing units where the gross rent is from $1,500 to $1,999
"r2000pl", -- Estimated number of renter occupied housing units where the gross rent is greater than $2000
"rnocshr", -- Estimated number of renter occupied housing units where the householder pays no cash rent
"age_0_9", -- Estimate for the total population from 0 to 9 years of age
"age_10_19", -- Estimate for the total population from 10 to 19 years of age
"age_20_29", -- Estimate for the total population from 20 to 29 years of age
"age_30_39", -- Estimate for the total population from 30 to 39 years of age
"age_40_49", -- Estimate for the total population from 40 to 49 years of age
"age_50_59", -- Estimate for the total population from 50 to 59 years of age
"age_60_69", -- Estimate for the total population from 60 to 69 years of age
"age_70_79", -- Estimate for the total population from 70 to 79 years of age
"age_80_pl", -- Estimate for the total population 80 years of age or older
"ps_uni", -- Estimated total population for whom poverty status is determined
"ps_below", -- Estimated population for whom poverty status is below the poverty level
"adj_ps_uni", -- The adjusted number of people for whom poverty status is determined
"adj_ps_pov", -- The adjusted number of people living below the poverty line
"avghhsize", -- Estimated average household size of occupied housing units
"mlogrecno",
"mpop0711",
"mhispanic",
"mwhite_nh",
"mblack_nh",
"mntvam_nh",
"masian_nh",
"mhawpi_nh",
"mother_nh",
"mtwoplus_n",
"mmale",
"mfemale",
"mageless10",
"mageless18",
"mage18_24",
"mage25_34",
"mage35_44",
"mage45_64",
"mage65plus",
"mmed_age",
"mhousehold",
"mfamilyhh",
"mnonfamhh",
"mhhldralon",
"mhhldr_nal",
"mhousing_u",
"mocc_hu",
"mvac_hu",
"mowned",
"mrented",
"mpop25plus",
"mnohsdipl",
"mhsgrad_sc",
"mbachl_hgh",
"mmed_hh_in",
"mmed_fam_i",
"mper_cap_i",
"mmed_yr_bl",
"mmed_c_ren",
"mmed_g_ren",
"mmed_hm_va",
"mcitz_birt",
"mcitz_nat",
"mnot_citz",
"mborn_in_c",
"mbrn_oth_s",
"mntv_b_o_u",
"mforeign_b",
"mpop_1p",
"msame_hous",
"msame_cnty",
"msame_stat",
"mdiff_stat",
"mfrm_abroa",
"mwrkrs_16p",
"mcar_all",
"mcar_alone",
"mcar_carpo",
"mpublic_tr",
"mpt_bus",
"mpt_other",
"mbike",
"mwalk",
"mtr_other",
"mwrk_home",
"mw_16pl_nh",
"mt_less_10",
"mt_10_19",
"mt_20_29",
"mt_30_39",
"mt_40_59",
"mt_60_pl",
"mpop_3pl",
"menrolled",
"mpreschool",
"mkndrgrtn",
"mgr_1_4",
"mgr_5_8",
"mgr_9_12",
"mundergrad",
"mgrad_prof",
"mn_enrolle",
"mblt_2000_",
"mb1990_199",
"mb1980_198",
"mb1970_197",
"mb1960_196",
"mb1950_195",
"mb1940_194",
"mb1939_e",
"mown_occ_h",
"mv_l_50k",
"mv50k_100k",
"mv100k_150",
"mv150k_200",
"mv200k_250",
"mv250k_300",
"mv300k_400",
"mv400k_500",
"mv500k_750",
"mv750k_1m",
"mv_1m_plus",
"mrnt_occ_h",
"mrntl400",
"mr400t599",
"mr600t799",
"mr800t999",
"mr1000t124",
"mr1250t149",
"mr1500t199",
"mr2000pl",
"mrnocshr",
"mage_0_9",
"mage_10_19",
"mage_20_29",
"mage_30_39",
"mage_40_49",
"mage_50_59",
"mage_60_69",
"mage_70_79",
"mage_80_pl",
"mps_uni",
"mps_below",
"madj_ps_un",
"madj_ps_po",
"mavghhsize"
FROM
"colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest"."census_core_based_statistical_area_in_colorado"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh
with SQL in under 60 seconds.
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr
, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr
can manage the image, container and volume for you.
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount
. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone
and sgr checkout
.
Cloning Data
Because colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest
is a Splitgraph Image, you can clone the data from Spltgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest
tag:
sgr checkout colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest
This will download all the objects for the latest
tag of colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh
and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, you will need to wait for the checkout to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
Alternatively, use "layered checkout" to avoid downloading all the data
The data in colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest
is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout.:
sgr checkout --layered colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh:latest
This will not download all the data, but it will create a schema comprised of foreign tables, that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, colorado-gov/census-core-based-statistical-area-in-colorado-a85s-zxvh
is just another Postgres schema.