splitgraph.core.indexing package

Submodules 

splitgraph.core.indexing.bloom module 

Bloom filtering on fragments for equality queries.

splitgraph.core.indexing.bloom.describe(index_tuple: Tuple[int, str]) → str

Returns a pretty-printed summary of the bloom filter.

Parameters: index_tuple – Tuple of (k, base64-encoded fingerprint) returned by generate_bloom_index
Returns: String

splitgraph.core.indexing.bloom.filter_bloom_index(engine: PsycopgEngine, object_ids: List[str], quals: Any) → List[str]

Runs a bloom filter on given qualifiers using the given objects’ previously-generated fingerprints.

Parameters

engine – Object engine
object_ids – Object IDs
quals – List of qualifiers

Returns

List of object IDs that might match the qualifiers in quals (including IDs that don’t have a bloom index).
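The filtering behaviour described above can be illustrated with a toy Bloom filter. This is a minimal sketch, not Splitgraph's implementation: the class `ToyBloomFilter`, the helper `filter_objects`, and the SHA-256 based hashing scheme are all assumptions made for illustration. It shows the two properties the description relies on: a filter never rules out a value it has seen, and objects without a filter are always kept.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative Bloom filter (hypothetical; not Splitgraph's implementation)."""

    def __init__(self, size_bits: int, k: int) -> None:
        self.size_bits = size_bits
        self.k = k
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting a SHA-256 hash with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def filter_objects(filters, value):
    """Keep object IDs whose filter might contain value, or that have no filter."""
    return [
        object_id
        for object_id, bloom in filters.items()
        if bloom is None or bloom.might_contain(value)
    ]
```

With a filter built from the values "alice" and "bob", `filter_objects` always keeps that object when asked for "alice", and it always keeps any object mapped to `None`. An object whose filter was built from other values is usually dropped, but may survive as a false positive.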

splitgraph.core.indexing.bloom.generate_bloom_index(engine: PsycopgEngine, object_id: str, changeset: Optional[Dict[Tuple[str, ...], Tuple[bool, Dict[str, Any], Dict[str, Any]]]], column: str, probability: Optional[float] = None, size: Optional[int] = None) → Tuple[int, str]

Generates a bloom filter signature for a given column and a given fragment. A bloom filter can answer whether an item is definitely not in a given set or might be in it.

The tradeoff is between the probability of a false positive (item said to be in the set when it actually isn’t) and the size of the filter.

Bloom filters also have an extra parameter, k: the number of bits in the signature that each item sets. This parameter has an optimal value for a given number of distinct items or a given probability, so the user doesn't pass it explicitly.

Parameters

engine – Object engine the fragment is cached in.
object_id – Fragment ID
changeset – Optional. If specified, the old column values are included in the index.
column – Column name to generate the index on.
probability – Probability of a false positive. Either this or the size of the filter must be specified, but not both.
size – Size of the filter, in bytes.

Returns

Tuple of (k, base64-encoded fingerprint) to be inserted into the index.
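The tradeoff between false-positive probability, filter size and k can be made concrete with the standard Bloom filter formulas. The sketch below is an illustration only: the function names and the `n_items` argument (the number of distinct values in the column) are hypothetical, and the way Splitgraph rounds or derives these values is not stated in this reference.

```python
import math


def params_from_probability(n_items: int, probability: float):
    """Return (size_bytes, k) for a target false-positive probability."""
    # Standard result: m = -n * ln(p) / (ln 2)^2 bits.
    size_bits = math.ceil(-n_items * math.log(probability) / (math.log(2) ** 2))
    # Standard result: optimal k = (m / n) * ln 2, rounded to a whole number.
    k = max(1, round(size_bits / n_items * math.log(2)))
    return math.ceil(size_bits / 8), k


def params_from_size(n_items: int, size_bytes: int):
    """Return (k, expected false-positive probability) for a fixed filter size."""
    size_bits = size_bytes * 8
    k = max(1, round(size_bits / n_items * math.log(2)))
    # Approximate false-positive rate: (1 - e^(-k * n / m))^k.
    probability = (1 - math.exp(-k * n_items / size_bits)) ** k
    return k, probability
```

For 1000 distinct values and a 1% false-positive target, `params_from_probability(1000, 0.01)` gives a filter of about 1.2 KB with k = 7. Feeding that size back into `params_from_size` recovers a probability close to 1%.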

splitgraph.core.indexing.range module 

splitgraph.core.indexing.range.extract_min_max_pks(engine: PsycopgEngine, fragments: List[str], table_pks: List[str], table_pk_types: List[str]) → Any

Extract minimum/maximum PK values for given fragments.

Parameters

engine – Engine the objects live on
fragments – IDs of objects
table_pks – List of columns forming the table primary key
table_pk_types – List of types for table PK columns

Returns

List of min/max primary key values for every object.

splitgraph.core.indexing.range.filter_range_index(metadata_engine: PsycopgEngine, object_ids: List[str], quals: Any, column_types: Dict[str, str]) → List[str]

splitgraph.core.indexing.range.generate_range_index(object_engine: PsycopgEngine, object_id: str, table_schema: TableSchema, changeset: Optional[Dict[Tuple[str, ...], Tuple[bool, Dict[str, Any], Dict[str, Any]]]], columns: Optional[List[str]] = None) → Dict[str, Tuple[splitgraph.core.indexing.range.T, splitgraph.core.indexing.range.T]]

Calculate the minimum/maximum values of every column in the object (including deleted values).

Parameters

object_engine – Engine the object is located on
object_id – ID of the object.
table_schema – Schema of the table
changeset – Changeset (old values will be included in the index)
columns – Columns to run the index on (default all)

Returns

Dictionary of {column: [min, max]}
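The shape of the range index, and how it can rule objects out, can be sketched in plain Python. This is an assumption-laden illustration rather than the library's code: `toy_range_index` and `might_match_equality` are hypothetical names, rows are modelled as plain dictionaries, and old values are passed as a separate list instead of a changeset.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def toy_range_index(
    rows: Iterable[Dict[str, Any]],
    old_rows: Iterable[Dict[str, Any]] = (),
    columns: Optional[List[str]] = None,
) -> Dict[str, Tuple[Any, Any]]:
    """Compute {column: (min, max)} over current rows plus old (replaced or deleted) values."""
    index: Dict[str, Tuple[Any, Any]] = {}
    for row in list(rows) + list(old_rows):
        for column, value in row.items():
            if value is None or (columns is not None and column not in columns):
                continue
            if column not in index:
                index[column] = (value, value)
            else:
                low, high = index[column]
                index[column] = (min(low, value), max(high, value))
    return index


def might_match_equality(index: Dict[str, Tuple[Any, Any]], column: str, value: Any) -> bool:
    """Return False only when value lies outside the recorded [min, max] range."""
    if column not in index:
        return True  # no index entry: the object cannot be ruled out
    low, high = index[column]
    return low <= value <= high
```

Including old values widens the range, so a lookup for a value that an object deleted still selects that object. Presumably this is why the real index covers deleted values as well, though the reference does not spell out the reason.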

splitgraph.core.indexing.range.quals_to_sql(quals: Optional[Sequence[Sequence[Tuple[str, str, Any]]]], column_types: Dict[str, str]) → Tuple[psycopg2.sql.Composable, Tuple]

Convert a list of qualifiers in CNF to a fragment of a Postgres query.

Parameters

quals – Qualifiers in CNF
column_types – Dictionary of column names and their types

Returns

SQL Composable object and a tuple of arguments to be mogrified into it.
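To illustrate the CNF convention, here is a minimal sketch of turning qualifiers into a parameterized SQL fragment. It makes several assumptions that this reference does not confirm: each qualifier is taken to be a (column, operator, value) tuple, the outer sequence is ANDed and each inner sequence is ORed, and the hypothetical `toy_quals_to_sql` returns a plain string instead of a psycopg2 Composable and ignores column types.

```python
from typing import Any, List, Optional, Sequence, Tuple

Qual = Tuple[str, str, Any]  # assumed layout: (column, operator, value)
ALLOWED_OPERATORS = {"=", "<", "<=", ">", ">=", "<>"}


def toy_quals_to_sql(quals: Optional[Sequence[Sequence[Qual]]]) -> Tuple[str, Tuple]:
    """Render CNF qualifiers as a parameterized SQL fragment and its arguments."""
    if not quals:
        return "TRUE", ()
    clauses: List[str] = []
    args: List[Any] = []
    for disjunction in quals:
        parts: List[str] = []
        for column, operator, value in disjunction:
            if operator not in ALLOWED_OPERATORS:
                raise ValueError(f"Unsupported operator: {operator}")
            # Quote the identifier; the value is passed separately as a parameter.
            quoted = '"' + column.replace('"', '""') + '"'
            parts.append(f"{quoted} {operator} %s")
            args.append(value)
        # An empty OR group can never be satisfied.
        clauses.append("(" + " OR ".join(parts) + ")" if parts else "FALSE")
    return " AND ".join(clauses), tuple(args)
```

For example, `toy_quals_to_sql([[("a", ">", 5), ("b", "=", "x")], [("c", "<=", 10)]])` yields the fragment `("a" > %s OR "b" = %s) AND ("c" <= %s)` with the arguments `(5, "x", 10)`. Keeping values out of the SQL text and passing them as arguments is what lets the database driver escape them safely.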

