Query the Data Delivery Network
Query the DDN
The easiest way to query any data on Splitgraph is via the "Data Delivery Network" (DDN). The DDN is a single endpoint that speaks the PostgreSQL wire protocol. Any Splitgraph user can connect to it at data.splitgraph.com:5432 and query any version of over 40,000 datasets that are hosted or proxied by Splitgraph.
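Because the DDN speaks the PostgreSQL wire protocol, a standard connection URI is enough to reach it. The sketch below builds one. The host and port come from the text above. The database name ddn, the use of an API key and secret as the username and password, and the SPLITGRAPH_API_KEY and SPLITGRAPH_API_SECRET variable names are assumptions for illustration, so check your account settings for the actual values.

```shell
# Build a PostgreSQL connection URI for the DDN.
# Assumptions: the database is named "ddn" and the credentials are an
# API key/secret pair. Credentials containing characters such as "@" or "/"
# would need percent-encoding, which this sketch does not do.
ddn_uri() {
  # $1 = username (API key), $2 = password (API secret)
  # $3 = host, $4 = port, $5 = database (all optional)
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$1" "$2" "${3:-data.splitgraph.com}" "${4:-5432}" "${5:-ddn}"
}

# Example (requires psql and valid credentials; variable names are hypothetical):
#   psql "$(ddn_uri "$SPLITGRAPH_API_KEY" "$SPLITGRAPH_API_SECRET")"
```

Any other client that accepts a Postgres URI should work the same way.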
For example, you can query the campaign_finance_summary table in this repository by referencing it like:
"wa-gov/campaign-finance-summary-3h9x-7bvm:latest"."campaign_finance_summary"
or in a full query, like:
SELECT
":id", -- Socrata column ID
"pledges_amount", -- The current total pledges reported by the campaign as of the time that the most recent C4 campaign summary report was filed.
"carryforward_amount", -- This field indicates the amount of money that was reported as the starting balance on the first C4 filed for this filer_id and election year. For continuing committees, this is the balance carried forward for the previous year. For all others, this is money that was remaining from a previous campaign. For example, remaining funds from a candidate's 2012 campaign may be applied as the starting balance of the candidate's 2014 campaign. Any value reported here is not new funds, it was reported as a gain on a previous campaign or election year. Nothing in this definition supersedes applicable WAC/RCW.
"contributions_amount", -- The sum total of all contributions reported. Refer to the "Contributions to Candidates and Political Committees" data set for line-item details.
"treasurer_state", -- The state of the treasurer as reported on the C1pc registration.
"ballot_title", -- The title of the ballot proposal that the committee is supporting or opposing. For a given year, a committee may choose to support multiple ballot proposals. When there is more than one title, the ballot title will be shown as "Various". If there are multiple ballot_proposals but the titles are all the same, the title will be shown.
"ballot_proposal_count", -- The number of ballot proposals that the committee is supporting or opposing in the election year. This field does not apply to candidate committees.
"election_status", -- This field represents the status of a candidate jurisdiction relationship. The candidate could be an Incumbent running again for the same office. The candidate could be a challenger running against an incumbent. Or, the seat that is up for election could be an open seat meaning that there is no incumbent running for that seat.
"primary_election_status", -- The most recent status for a candidate in the primary election. Upon registration, this field is empty. After the primary election, the status of the campaign is assigned as Lost in primary; Judge certified before primary; Not in primary; Won in primary; Unopposed in primary.
"party", -- The political party as declared by the candidate on their C1 registration form. Contains only parties recognized by Washington State law.
"political_committee_type", -- This column designates the persuasion of the PAC; for example, Business, union, PAC, etc.
"committee_category", -- This column contains a category for a political committee (PAC) for example: Continuing Political Committee, Party Committee, committees Affiliated with a party, Single election year committee, School levy committee, Local ballot initiative, Caucus political committee, Statewide Initiative committee.
"candidate_committee_phone", -- The committee phone number. In the case of a candidate, this is the phone number of the candidate, not the candidate's committee.
"committee_zip", -- The zip code of the committee. For candidates, this is the address provided for the candidate committee, not necessarily the address of the candidate.
"committee_state", -- The state of the committee. For candidates, this is the address provided for the candidate committee, not necessarily the address of the candidate.
"committee_city", -- The city of the committee. For candidates, this is the address provided for the candidate committee, not necessarily the address of the candidate.
"filer_name", -- The candidate name as reported on the form C1 candidate registration or the political committee name as reported on the C1pc registration. The name will be consistent across all records for the same filer id and election year but may differ across years due to candidates or committees changing their name.
"candidate_committee_status", -- The status of a campaign at a point in time: Candidate declared – indicates that a candidate formally declared their candidacy during declaration week in May. Candidate registered - indicates that the candidate submitted their candidate registration to the PDC. Candidate withdrew – indicates that a candidate formally declared and then formally withdrew their declaration in May. Candidate discontinued campaign – indicates that the candidate registered their campaign, may or may not have raised money, did not formally withdraw, and stopped campaigning. Candidate deceased – indicates that the candidate died after registration.
"filing_type", -- This field indicates the type of campaign finance reports being filed. "Electronic" indicates that all reports are filed electronically and all of the fields containing financial data are correctly calculated from the filed reports. "Paper" indicates that all campaign finance reports have been filed on paper and all of the fields containing financial data will be empty. "Mixed" indicates that one or more reports are filed on paper and all of the fields containing financial data are potentially incorrect. For both paper and mixed, please consult the individual reports in our "Imaged documents and reports" data set to determine the campaign finance details. An empty value indicates that no campaign finance (C3 or C4) reports have been filed.
"discontinued", -- The date the candidate campaign discontinued their campaign.
"legislative_district", -- Legislative District
"fund_id", -- Fund ID
"committee_id", -- The unique identifier of a committee. For a continuing committee, this id will be the same for all the years that the committee is registered. Single year committees and candidate committees will have a unique id for each year even though the candidate or committee organization might be the same across years. Surplus accounts will have a single committee id across all years.
"treasurer_address", -- The address of the treasurer as reported on the C1pc registration.
"on_primary_election_ballot", -- Indicates that the candidate is appearing on the primary election ballot. This datum is only available after election status has been determined and the field will be null/empty before that point. Campaigns prior to 2024 may have incomplete or missing data for this field.
"general_election_status", -- For candidates, this field represents the status at a point in time. The field starts out as empty; after the general election, the status of the campaign is assigned as Lost in general; Judge certified before general; Won in general; Unopposed in general.
"pac_type", -- Pac_type can be bonafide, candidate, caucus, pac, or surplus
"reporting_option", -- This field represents the level of reporting required by the filer. The FULL reporting option requires all contribution and expenditure reports. The MINI reporting option does not require any reports after the C1 registration report is filed. Please consult the PDC web site for more information regarding reporting options.
"ballot_proposal_detail", -- Full details about all the ballot proposals being supported or opposed by the committee in JSON format, with the intent that this information can be consumed by computer systems.
"election_date", -- This is the calendar month/day/year of the election where a candidate is elected to office or the date of the election for a ballot initiative committee.
"office", -- The office sought by the candidate.
"active_candidate", -- Indicates that a candidate campaign is considered "active" in the election. Candidates who have withdrawn or discontinued their campaigns are not considered active. Also candidates who registered their campaign with the PDC but did not formally declare their candidacy with elections officials during or after the declarations period are not considered active. Candidates who did not advance from the primary election or did not win the general election are still considered active. Campaigns prior to 2024 may have incomplete or missing data for this field.
"continuing", -- Whether the PAC is continuing or not. Only set for PACs. Null for candidate, surplus, bonafide and caucus.
"other_pac", -- An X in this column designates the committee as a Continuing Political Committee.
"for_or_against", -- Designates whether the committee is supporting or opposing one or more ballot proposals. If the committee's stance is to both support and oppose various ballot proposals, the value of this field will be "both".
"bonafide_type", -- The type of bonafide party committee, for example State Party, County Party, Leg Dist Party, party Associated committee.
"id", -- PDC internal identifier that corresponds to a candidate or Political committee record in an election year.
"exempt_nonexempt", -- This column designates whether a party committee is exempt from contribution limits or is non-exempt, which means it is subject to limits when giving contributions.
"bonafide_committee", -- An X in this column means this is a bona fide party committee
"jurisdiction_type", -- The type of jurisdiction that a candidate office represents: Statewide, Local, Judicial, etc. In the case of a committee supporting or opposing a ballot measure, it is the jurisdiction type of the ballot measure jurisdiction.
"jurisdiction_reporting_code", -- Jurisdictions have different reporting requirements based on the jurisdiction size and local laws adopted by certain jurisdictions. "N" indicates that there are no mandatory reporting requirements. "F" indicates that candidates are only required to file a financial affairs statement. "C" indicates that candidates are required to file both a financial affairs statement and a candidate registration. Candidates in an "N" or "F" jurisdiction may still have reporting requirements depending on the amount their campaign raises or spends.
"candidacy_id", -- Candidacy ID
"on_general_election_ballot", -- Indicates that the candidate is appearing on the general election ballot. This datum is only available after election status has been determined and the field will be null/empty before that point. Campaigns prior to 2024 may have incomplete or missing data for this field.
"party_code", -- Unique code that represents a candidate's party, e.g. R represents Republican.
"filer_type", -- This column designates if this is a candidate committee (CA) or a political committee (CO).
"ballot_committee", -- An X in this column designates the committee is supporting or opposing one or more ballot measures.
"jurisdiction_reporting_requirement", -- Text description of the codes in jurisdiction_reporting_code
"url", -- A link to a PDF version of the candidate C1 registration report as it was filed to the PDC.
"treasurer_phone", -- The phone number of the treasurer as reported on the C1pc registration.
"expenditures_amount", -- The sum total of all expenditures reported. Refer to the "Expenditures by Candidates and Political Committees" data set for line-item details.
"loans_amount", -- The current outstanding loan balance of the campaign as of the time that the most recent C4 campaign summary report was filed.
"independent_expenditures_for_amount", -- The sum total of all independent (third party) expenditures made in support of the candidate. Independent expenditures are reported by the third party making the expenditure and the amount reported here only includes amounts where the third party accurately reported the name of the candidate. Refer to the "Independent Expenditures" data set for line-item details.
"person_id", -- The unique ID assigned to a public office holder or candidate. This id is consistent across years, offices, or candidacies and is the preferred id for identifying a natural person.
"treasurer_name", -- Treasurer Name
"treasurer_zip", -- The zip code of the treasurer as reported on the C1pc registration.
"treasurer_city", -- The city of the treasurer as reported on the C1pc registration.
"ballot_number", -- The ballot number or numbers assigned to a ballot proposal. Local ballot proposals in different jurisdictions may have overlapping ballot numbers. This field is a comma separated list of unique ballot numbers being supported or opposed by the committee.
"position", -- The position associated with an office. This field typically applies to jurisdictions that have multiple positions or seats. This field does not apply to political committees.
"jurisdiction_voters", -- The number of registered voters in the jurisdiction. This number is based on the voter count of the prior year's election.
"jurisdiction_county", -- The county associated with the jurisdiction of a candidate or the jurisdiction of a ballot measure in the case of a political committee supporting or opposing a ballot measure. Multi-county jurisdictions are reported as the primary county. This field will be empty when a candidate or ballot measure jurisdiction is statewide.
"jurisdiction_code", -- The unique identifier of the jurisdiction. It is associated with the office of a candidate or the jurisdiction of a ballot measure in the case of a political committee supporting or opposing a ballot measure. This field will be empty when a candidate or ballot measure jurisdiction is statewide.
"jurisdiction", -- The political jurisdiction associated with the office of a candidate or the jurisdiction of a ballot measure in the case of a political committee supporting or opposing a ballot measure. This field will be empty when a candidate or ballot measure jurisdiction is statewide.
"office_code", -- The numeric code that defines the office. For example the office code 31 defines the office of Mayor. Combined with the jurisdiction_code it defines a particular office such as Mayor of Olympia.
"candidate_email", -- The candidate’s personal e-mail. This email may be the same as the candidate’s committee email.
"committee_email", -- The email address of the committee. For candidates, this is the email address provided for the candidate committee, not necessarily the email address of the candidate.
"committee_county", -- The county of the committee. For candidates, this is the address provided for the candidate committee, not necessarily the address of the candidate.
"committee_address", -- The street address of the committee. For candidates, this is the address provided for the candidate committee, not necessarily the address of the candidate.
"committee_acronym", -- If a political committee is known by an abbreviation it is placed in this column. This acronym is optional. An example would be the WA Education Association, whose acronym is WEA.
"election_year", -- The year of the election for a candidate or the calendar year that a political committee is registered for.
"receipt_date", -- The postmark date of the relevant candidate C1 or political committee C1pc registration. A political committee is not required to submit a new or amended C1pc every year. It is possible that a C1pc may be several years old if there are no updates to report.
"withdrew", -- The date of withdrawal when a candidate withdraws their candidacy.
"declared", -- The date the candidate campaign declared their candidacy with the Secretary of State. This field only applies to candidate campaigns.
"registered", -- The date the campaign first submitted a registration. Candidates may not have a date registered if they have declared their candidacy with the Secretary of State or submitted a financial affairs statement indicating their candidacy and have not submitted their registration.
"filer_id", -- The unique id assigned to a candidate or political committee. The filer id is consistent across election years with the exception that an individual running for a second office in the same election year will receive a second filer id. There is no correlation between the two. For a candidate and single-election-year committee such as a ballot committee, the combination of filer_id and election_year uniquely identifies a campaign.
"updated_at", -- The last time that the committee information was updated. This includes information such as the contact information or an amendment to the registration as well as changes in financial information that result from submitting or amending a report.
"independent_expenditures_against_amount", -- The sum total of all independent (third party) expenditures made in opposition of the candidate. Independent expenditures are reported by the third party making the expenditure and the amount reported here only includes amounts where the third party accurately reported the name of the candidate. Refer to the "Independent Expenditures" data set for line-item details.
"debts_amount" -- The current outstanding debt of the campaign as of the time that the most recent C4 campaign summary report was filed.
FROM
"wa-gov/campaign-finance-summary-3h9x-7bvm:latest"."campaign_finance_summary"
LIMIT 100;
Connecting to the DDN is easy. All you need is an existing SQL client that can connect to Postgres. As long as you have a SQL client ready, you'll be able to query wa-gov/campaign-finance-summary-3h9x-7bvm with SQL in under 60 seconds.
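As a rough illustration of the table reference format shown earlier, the following sketch composes the double-quoted "namespace/repository:tag"."table" reference and a small preview query from their parts. The function names and the DDN_URI variable are hypothetical helpers, not part of any Splitgraph tool.

```shell
# Compose the schema-qualified, double-quoted table reference used on the DDN.
# Note: this sketch does not escape double quotes embedded in the arguments.
ddn_table_ref() {
  # $1 = namespace/repository, $2 = tag, $3 = table
  printf '"%s:%s"."%s"\n' "$1" "$2" "$3"
}

# Compose a preview query that returns the first few rows of a table.
ddn_preview_query() {
  # $4 = row limit (defaults to 10)
  printf 'SELECT * FROM %s LIMIT %s;\n' \
    "$(ddn_table_ref "$1" "$2" "$3")" "${4:-10}"
}

# Example (requires psql; DDN_URI is a hypothetical variable holding your
# connection string):
#   psql "$DDN_URI" -c "$(ddn_preview_query \
#     wa-gov/campaign-finance-summary-3h9x-7bvm latest campaign_finance_summary 5)"
```

The quoting matters because the schema name contains characters such as "/", "-" and ":" that are not valid in an unquoted PostgreSQL identifier.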
Query Your Local Engine
bash -c "$(curl -sL https://github.com/splitgraph/splitgraph/releases/latest/download/install.sh)"
Read the installation docs.
Splitgraph Cloud is built around Splitgraph Core (GitHub), which includes a local Splitgraph Engine packaged as a Docker image. Splitgraph Cloud is basically a scaled-up version of that local Engine. When you query the Data Delivery Network or the REST API, we mount the relevant datasets in an Engine on our servers and execute your query on it.
It's possible to run this engine locally. You'll need a Mac, Windows or Linux system to install sgr, and a Docker installation to run the engine. You don't need to know how to actually use Docker; sgr can manage the image, container and volume for you.
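Before setting up the local engine, it can help to confirm that both prerequisites are on your PATH. This is a minimal sketch using only standard shell built-ins; the missing_tools function name is made up for this example.

```shell
# Print the name of each required tool that cannot be found on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example:
#   missing="$(missing_tools docker sgr)"
#   [ -z "$missing" ] || echo "Please install: $missing"
```

This only checks that the commands exist. It does not verify that the Docker daemon is running or that an engine has been created.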
There are a few ways to ingest data into the local engine.
For external repositories, the Splitgraph Engine can "mount" upstream data sources by using sgr mount. This feature is built around Postgres Foreign Data Wrappers (FDW). You can write custom "mount handlers" for any upstream data source. For an example, we blogged about making a custom mount handler for HackerNews stories.
For hosted datasets (like this repository), where the author has pushed Splitgraph Images to the repository, you can "clone" and/or "checkout" the data using sgr clone and sgr checkout.
Cloning Data
Because wa-gov/campaign-finance-summary-3h9x-7bvm:latest is a Splitgraph Image, you can clone the data from Splitgraph Cloud to your local engine, where you can query it like any other Postgres database, using any of your existing tools.
First, install Splitgraph if you haven't already.
Clone the metadata with sgr clone
This will be quick, and does not download the actual data.
sgr clone wa-gov/campaign-finance-summary-3h9x-7bvm
Checkout the data
Once you've cloned the data, you need to "checkout" the tag that you want. For example, to checkout the latest tag:
sgr checkout wa-gov/campaign-finance-summary-3h9x-7bvm:latest
This will download all the objects for the latest tag of wa-gov/campaign-finance-summary-3h9x-7bvm and load them into the Splitgraph Engine. Depending on your connection speed and the size of the data, the checkout may take some time to complete. Once it's complete, you will be able to query the data like you would any other Postgres database.
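The clone and checkout steps can be wrapped in a small script. The sketch below uses only the two sgr commands shown above; the run and clone_and_checkout helpers and the DRY_RUN variable are hypothetical names introduced here so that the commands can be previewed before they are executed.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Clone a repository's metadata, then check out one tag.
# The checkout is skipped if the clone fails.
clone_and_checkout() {
  # $1 = namespace/repository, $2 = tag (defaults to latest)
  run sgr clone "$1" && run sgr checkout "$1:${2:-latest}"
}

# Example (requires sgr and a running local engine):
#   clone_and_checkout wa-gov/campaign-finance-summary-3h9x-7bvm latest
```

Setting DRY_RUN=1 prints the two commands without contacting the engine, which is a cheap way to confirm the repository and tag before downloading anything.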
Alternatively, use "layered checkout" to avoid downloading all the data
The data in wa-gov/campaign-finance-summary-3h9x-7bvm:latest is 0 bytes. If this is too big to download all at once, or perhaps you only need to query a subset of it, you can use a layered checkout:
sgr checkout --layered wa-gov/campaign-finance-summary-3h9x-7bvm:latest
This will not download all the data, but it will create a schema composed of foreign tables that you can query as you would any other data. Splitgraph will lazily download the required objects as you query the data. In some cases, this might be faster or more efficient than a regular checkout.
Read the layered querying documentation to learn about when and why you might want to use layered queries.
Query the data with your existing tools
Once you've loaded the data into your local Splitgraph Engine, you can query it with any of your existing tools. As far as they're concerned, wa-gov/campaign-finance-summary-3h9x-7bvm is just another Postgres schema.
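To illustrate, a checked-out repository can be queried by quoting the repository name as the schema. The sketch below only composes the SQL text. The local_preview_query helper is hypothetical, and the sgr sql invocation in the comment assumes your sgr version provides that subcommand and that the engine is running.

```shell
# Compose a preview query against a checked-out repository, which appears
# on the local engine as a Postgres schema named after the repository.
# Note: this sketch does not escape double quotes embedded in the arguments.
local_preview_query() {
  # $1 = namespace/repository (the schema), $2 = table, $3 = row limit
  printf 'SELECT * FROM "%s"."%s" LIMIT %s;\n' "$1" "$2" "${3:-10}"
}

# Example (assumes the repository has been checked out locally):
#   sgr sql "$(local_preview_query \
#     wa-gov/campaign-finance-summary-3h9x-7bvm campaign_finance_summary 5)"
```

Unlike the DDN reference, the local schema name carries no tag, since the checkout already selected which version is materialized.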