splitgraph.ingestion.socrata package
Submodules
splitgraph.ingestion.socrata.fdw module
Module imported by Multicorn on the Splitgraph engine server: a foreign data wrapper that communicates with Socrata datasets using sodapy.
- class splitgraph.ingestion.socrata.fdw.SocrataForeignDataWrapper(fdw_options, fdw_columns)
Bases: object
- can_sort(sortkeys)
- Parameters
sortkeys – List of SortKey
- Returns
List of SortKey the FDW can sort on
- execute(quals, columns, sortkeys=None)
Main Multicorn entry point.
- explain(quals, columns, sortkeys=None, verbose=False)
- get_rel_size(quals, columns)
Method called from the planner to estimate the resulting relation size for a scan. It helps the planner choose between different types of plans according to their costs.
- Parameters
quals (list) – A list of Qual instances describing the filters applied to this scan.
columns (list) – The list of columns that must be returned.
- Returns
A tuple of the form (expected_number_of_rows, avg_row_width), with the width in bytes.
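To illustrate the shape of the value the planner expects back, here is a minimal sketch. It is not Splitgraph's implementation: the function name `sketch_rel_size`, the `ASSUMED_WIDTHS` table, and the idea of summing a fixed width per column type are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical per-type width guesses in bytes; these numbers are
# illustrative assumptions, not values taken from Splitgraph.
ASSUMED_WIDTHS: Dict[str, int] = {"integer": 4, "bigint": 8, "text": 32}
DEFAULT_WIDTH = 16  # fallback for column types not listed above


def sketch_rel_size(
    total_rows: int, column_types: Dict[str, str], columns: List[str]
) -> Tuple[int, int]:
    """Return (expected_number_of_rows, avg_row_width in bytes)."""
    # Only the requested columns contribute to the row width.
    width = sum(
        ASSUMED_WIDTHS.get(column_types.get(name, ""), DEFAULT_WIDTH)
        for name in columns
    )
    return total_rows, width


rows, width = sketch_rel_size(
    1000, {"id": "integer", "name": "text"}, ["id", "name"]
)
```

The real method receives `quals` as well, so an actual estimate could shrink the row count when filters are present; this sketch ignores filters entirely.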
- property table_meta
- splitgraph.ingestion.socrata.fdw.to_json(row, columns, column_map)
splitgraph.ingestion.socrata.mount module
Splitgraph mount handler for Socrata datasets
- class splitgraph.ingestion.socrata.mount.SocrataDataSource(engine: PostgresEngine, credentials: Credentials, params: Params, tables: Optional[Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]]] = None)
Bases: splitgraph.hooks.data_source.fdw.ForeignDataWrapperDataSource
- credentials_schema: Dict[str, Any] = {'properties': {'app_token': {'description': 'Socrata app token', 'type': 'string'}}, 'type': 'object'}
- classmethod from_commandline(engine, commandline_kwargs) splitgraph.ingestion.socrata.mount.SocrataDataSource
Instantiate an FDW data source from commandline arguments.
- classmethod get_description() str
- get_fdw_name()
- classmethod get_name() str
- get_raw_url(tables: Optional[Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]]] = None, expiry: int = 3600) Dict[str, List[Tuple[str, str]]]
Get a list of public URLs for each table in this data source, e.g. to export the data as CSV. These may be temporary (e.g. pre-signed S3 URLs) but should be accessible without authentication.
- Parameters
tables – A TableInfo object overriding the table params of the source
expiry – The URL should be valid for at least this many seconds
- Returns
Dict of table_name -> list of (mimetype, raw URL)
- get_server_options()
- params_schema: Dict[str, Any] = {'properties': {'batch_size': {'default': 1000, 'description': 'Amount of rows to fetch from Socrata per request (limit parameter)', 'maximum': 50000, 'minimum': 1, 'type': 'integer'}, 'domain': {'description': 'Socrata domain, for example, data.albanyny.gov', 'type': 'string'}}, 'required': ['domain'], 'type': 'object'}
- table_params_schema: Dict[str, Any] = {'properties': {'socrata_id': {'description': 'Socrata dataset ID, e.g. xzkq-xp2w', 'type': 'string'}}, 'required': ['socrata_id'], 'type': 'object'}
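The schemas above describe which mount parameters are accepted. As a rough illustration, the sketch below copies `params_schema` and `table_params_schema` verbatim and checks a candidate dict against them. The helper `check_against_schema` is hypothetical and covers only the keywords these schemas use (`required`, `type`, `minimum`, `maximum`); it is not a full JSON Schema validator and not how Splitgraph itself validates input.

```python
from typing import Any, Dict, List

# Copied from the class attributes documented above.
PARAMS_SCHEMA: Dict[str, Any] = {
    "properties": {
        "batch_size": {
            "default": 1000,
            "description": "Amount of rows to fetch from Socrata per request (limit parameter)",
            "maximum": 50000,
            "minimum": 1,
            "type": "integer",
        },
        "domain": {
            "description": "Socrata domain, for example, data.albanyny.gov",
            "type": "string",
        },
    },
    "required": ["domain"],
    "type": "object",
}
TABLE_PARAMS_SCHEMA: Dict[str, Any] = {
    "properties": {
        "socrata_id": {
            "description": "Socrata dataset ID, e.g. xzkq-xp2w",
            "type": "string",
        }
    },
    "required": ["socrata_id"],
    "type": "object",
}


def check_against_schema(value: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    errors = [
        f"missing required key: {key}"
        for key in schema.get("required", [])
        if key not in value
    ]
    for key, item in value.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # keys not described by the schema are ignored here
        if spec["type"] == "string" and not isinstance(item, str):
            errors.append(f"{key} must be a string")
        elif spec["type"] == "integer":
            # bool is a subclass of int in Python, so reject it explicitly.
            if not isinstance(item, int) or isinstance(item, bool):
                errors.append(f"{key} must be an integer")
            elif not spec.get("minimum", item) <= item <= spec.get("maximum", item):
                errors.append(f"{key} is out of range")
    return errors


problems = check_against_schema({"batch_size": 50001}, PARAMS_SCHEMA)
```

Here `problems` reports both the missing `domain` and the `batch_size` above the documented maximum of 50000.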
- splitgraph.ingestion.socrata.mount.generate_socrata_mount_queries(sought_ids, datasets, mountpoint, server_id, tables: Union[List[str], Dict[str, Tuple[List[splitgraph.core.types.TableColumn], TableParams]]])
splitgraph.ingestion.socrata.querying module
- splitgraph.ingestion.socrata.querying.cols_to_socrata(cols, column_map: Optional[Dict[str, str]] = None)
- splitgraph.ingestion.socrata.querying.estimate_socrata_rows_width(columns, metadata, column_map=None)
Estimate the number of rows and each row’s width for a query, using the table metadata.
- splitgraph.ingestion.socrata.querying.quals_to_socrata(quals, column_map: Optional[Dict[str, str]] = None)
Convert a list of Multicorn quals to a SoQL query
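A minimal sketch of what such a conversion could look like follows. It is an illustration under stated assumptions, not the body of `quals_to_socrata`: `SketchQual` is a stand-in for Multicorn's `Qual` (mirroring its `field_name`, `operator`, and `value` attributes), the supported operator set is chosen for the example, `column_map` is assumed to map local column names to Socrata field names, and unsupported quals are simply skipped on the assumption that PostgreSQL rechecks them after the scan.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class SketchQual(NamedTuple):
    """Stand-in for a Multicorn Qual; hypothetical, for illustration only."""
    field_name: str
    operator: str
    value: Any


# Assumed subset of comparison operators pushed down in this sketch.
SUPPORTED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def _literal(value: Any) -> str:
    """Render a Python value as a SoQL literal."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are single-quoted, with embedded quotes doubled.
    return "'" + str(value).replace("'", "''") + "'"


def sketch_quals_to_where(
    quals: List[SketchQual], column_map: Optional[Dict[str, str]] = None
) -> str:
    """Join the supported quals into a single AND-ed filter expression."""
    column_map = column_map or {}
    clauses = []
    for qual in quals:
        if qual.operator not in SUPPORTED_OPERATORS:
            continue  # left for PostgreSQL to filter locally
        name = column_map.get(qual.field_name, qual.field_name)
        clauses.append(f"{name} {qual.operator} {_literal(qual.value)}")
    return " AND ".join(clauses)


where = sketch_quals_to_where(
    [
        SketchQual("city", "=", "O'Fallon"),
        SketchQual("pop", ">", 1000),
        SketchQual("name", "~~", "A%"),  # LIKE is not pushed down in this sketch
    ],
    column_map={"pop": "population"},
)
```

Skipping a qual only widens the result set, so it is safe as long as the engine reapplies the filter; dropping a qual that is never rechecked would return wrong rows.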
- splitgraph.ingestion.socrata.querying.socrata_to_sg_schema(metadata: Dict[str, Any]) Tuple[List[splitgraph.core.types.TableColumn], Dict[str, str]]
- splitgraph.ingestion.socrata.querying.sortkeys_to_socrata(sortkeys, column_map: Optional[Dict[str, str]] = None)
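The sort-key conversion has no description above, so the following is only a guess at its general shape, not the function's actual body. `SketchSortKey` is a stand-in that mirrors the `attname` and `is_reversed` attributes of Multicorn's `SortKey`, and `column_map` is again assumed to map local column names to Socrata field names. The output is a comma-separated ordering expression of the kind SoQL accepts in its order clause.

```python
from typing import Dict, List, NamedTuple, Optional


class SketchSortKey(NamedTuple):
    """Stand-in for a Multicorn SortKey; hypothetical, for illustration only."""
    attname: str
    is_reversed: bool = False


def sketch_sortkeys_to_order(
    sortkeys: Optional[List[SketchSortKey]],
    column_map: Optional[Dict[str, str]] = None,
) -> str:
    """Build an ordering expression such as 'population DESC, city ASC'."""
    column_map = column_map or {}
    parts = []
    for key in sortkeys or []:
        name = column_map.get(key.attname, key.attname)
        direction = "DESC" if key.is_reversed else "ASC"
        parts.append(f"{name} {direction}")
    return ", ".join(parts)


order = sketch_sortkeys_to_order(
    [SketchSortKey("pop", is_reversed=True), SketchSortKey("city")],
    column_map={"pop": "population"},
)
```

This sketch ignores null ordering and collation, which a real `can_sort` implementation would need to consider before promising the planner that a sort can be pushed down.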