At the time of writing, the Splitgraph registry runs on a single 30 EUR/month Scaleway instance. Despite that, we index and let you query over 40,000 public datasets with simple SQL.
This series of articles discusses our infrastructure and how we stay lean while still providing a lot of value to data scientists and engineers.
To run Splitgraph in production, we use Docker Compose, which we found to be a surprisingly effective solution. This article is a collection of tips and tricks for using Docker Compose that we discovered along the way. We'll talk about Compose overrides and service groups, our GitLab CI deploy process, rolling upgrades and reloading service configuration without restarts.
We are eventually interested in moving to Kubernetes or Nomad (we're even writing a script that automatically converts Compose files to Nomad jobs!), but these solutions felt like overkill for our use case.
Besides, we relied on Docker Compose a lot when building Splitgraph: we already run our integration tests with it in CI, and we wanted the dev, CI and production workflows to match as closely as possible.
Docker Compose allows you to combine multiple Compose configurations into one at runtime. This feature is called "overrides".
Here's a sample Compose definition for a "prod" service:
version: "3.4"
services:
registry-db:
image: ${DOCKER_REPO}/registry-db:${DOCKER_TAG}
expose:
- 5431
env_file:
- ${CONFIG_HOME}/config/registry-db.env
restart: on-failure
user: "postgres:pgsockets"
volumes:
- registry_dbdata:/var/lib/postgresql/data/pgdata
- pg_pgb_sockets:/var/pgsockets
- ${CONFIG_HOME}/config/registry-db.sgadmin.password:/registry-db.sgadmin.password:cached
...
Our production configuration normally includes the image (built in CI and pushed to our Docker registry), exposed ports, environment files, restart policies and volumes for persistent data and configuration.
The development Compose file is a mixin on top of the production configuration. For this service, it's:
version: "3.4"
services:
registry-db:
ports:
- "0.0.0.0:5430:5431"
volumes:
- ./src/py/splitgraph/splitgraph:/splitgraph/splitgraph
- ./src/luamods/main:/home/postgres/luamods
- ./src/py/sgr_admin/sgr_admin/embedded/dbs:/home/postgres/pymods
- ./components/registry/registry-db/etc/postgresql:/etc/postgresql
environment:
- SG_LOGLEVEL=DEBUG
This can include ports published to the host, bind mounts of local source code for faster iteration and more verbose log levels.
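To sanity-check what the merged configuration looks like, you can ask Compose to render it (the file names here match the runscript shown below):

# Print the effective configuration after merging the base file with the
# dev override: Compose resolves variables and combines the volume lists.
docker-compose \
    -f docker-compose.prod.yml \
    -f docker-compose.dev.yml \
    config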
This lets us avoid repetition in our configuration. We also use this idea to split up different aspects of the stack into separate Compose files. We call these "service groups".
We wrap these Compose overrides in runscripts: each service group gets a single script that developers use to run Compose commands against it, injecting the "dev" configuration if needed:
#!/usr/bin/env bash
set -e

# Pass --dev as the first argument to layer the dev overrides on top.
if [[ "$1" == "--dev" ]]; then
    DEV=1
    shift
fi

export CONFIG_HOME=${CONFIG_HOME-"$PWD"}

exec docker-compose -f docker-compose.prod.yml \
    $( [[ -n $DEV ]] && echo "-f docker-compose.dev.yml" ) \
    -f docker-compose.dbs.build.yml \
    -f docker-compose.dbs.prod.yml \
    $( [[ -n $DEV ]] && echo "-f docker-compose.dbs.dev.yml" ) "$@"
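Since the runscript just execs docker-compose with the right set of override files, any Compose arguments pass straight through. For example, a developer might run:

# Start the service group with dev overrides (source bind mounts, debug logging)
./service-runscript.sh --dev up -d

# Any other Compose command works the same way
./service-runscript.sh --dev logs -f registry-db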
We deploy Splitgraph using GitLab CI. The deploy process runs in several stages.
First, we use our Makefile (we wrote more about how we use Makefiles in a previous blog post) to build a "deploy bundle". This bundle is self-contained and includes only the files needed to run Splitgraph in production: the Compose files, the runscripts, configuration templates and the installation script.
We store the actual tag (the commit hash) that we're deploying in a .env file in the bundle, which every script sources.
On the server, we have a "deploy home" directory that contains service configuration files, as well as a config.json file with deployment-specific settings that the configurator (more on it below) uses to generate configuration for all services.
The CI script copies the bundle into this directory and extracts it. We also keep copies of previous deploy bundles, each in a timestamped directory, so that we can roll back to an earlier version after a failed deploy by running that bundle's installation script.
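In practice, a rollback is just a couple of commands (the directory layout and script name here are illustrative, not our exact setup):

# Re-run an older bundle's installation script (hypothetical path and name)
cd /deploy-home/bundles/20200601-120000
./install.sh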
The installation script first pulls all required containers from the GitLab CI registry. At this point, it uses the deploy's commit hash as the tag.
The script then retags these containers with a :production tag. This is to avoid Compose reloading all services on every deploy: if an image's tag (not just the image SHA) has changed, Compose will treat the service's configuration as changed, and keeping the tag fixed prevents that.
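Roughly, the pull-and-retag step looks something like this (a simplified sketch; the real script loops over every image in the stack, and the variable names are illustrative):

# DEPLOY_TAG is the commit hash recorded in the bundle's .env file
docker pull "${DOCKER_REPO}/registry-db:${DEPLOY_TAG}"
# Give the freshly pulled image the fixed :production tag used at runtime
docker tag "${DOCKER_REPO}/registry-db:${DEPLOY_TAG}" "${DOCKER_REPO}/registry-db:production"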
The script then runs the "configurator": a container with a Python application that uses Jinja to regenerate configuration for all services that we run, based on the bundle's config templates and the config.json file.
We wrote more about the configurator in a previous blog post.
We bind mount configuration into containers. We prefer this to baking it into the image. This lets us make emergency hotfixes without having to go through the CI pipeline.
Next, the installation script runs schema migrations. We check all our migrations into version control and run integration tests on them in CI.
These migrations form part of the "admin" container. We also use this container to run other Splitgraph instance management tasks.
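Purely as an illustration (the real service name and entrypoint differ), running the migrations through a one-off admin container might look like:

# Start a throwaway "admin" container and run the migration task in it (hypothetical names)
./service-runscript.sh run --rm admin run-migrations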
Finally, we implement a lot of Splitgraph functionality as PostgreSQL functions in languages like PL/Python or PL/Lua. We package those up as PostgreSQL extensions, which we also install at migration time.
We have several service groups that we deploy one after another. We use ./service-runscript.sh up -d to upgrade most of them, which means that Docker Compose only recreates containers for services whose specifications have changed. You can read more about this behavior in the Compose documentation.
This is not a zero-downtime deploy, but we try to minimize excess container restarts by pinning third-party container hashes: for example, instead of haproxy:latest, we use haproxy@sha256:e6f9faf0c2a0cf2d2d5a53307351fa896d...
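If you want to pin an image you already use, one way to find the digest is to pull the tag once and inspect it (the tag below is just an example):

docker pull haproxy:2.2
docker inspect --format '{{index .RepoDigests 0}}' haproxy:2.2
# -> haproxy@sha256:...  Reference this digest in the Compose file instead of a mutable tag.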
In some cases, we run many copies of the same service group for high availability. This is the case for our REST API. There, the script waits until the replica is healthy before upgrading the next one:
#!/bin/bash -e
# Wait for a container to become healthy.
CONTAINER="$1"
timeout=120

while true; do
    status=$(docker inspect -f '{{.State.Health.Status}}' "$CONTAINER")
    if [ "$status" != "starting" ]; then
        break
    fi
    if (( timeout < 0 )); then
        echo "Timed out waiting for $CONTAINER to become healthy"
        exit 1
    fi
    sleep 2
    timeout=$(( timeout - 2 ))
    echo "Waiting for $CONTAINER to become healthy, $timeout s.."
done

if [ "$status" == "unhealthy" ]; then
    echo "$CONTAINER is unhealthy!"
    exit 1
fi

echo "$CONTAINER is healthy."
Most containers don't get recreated or restarted on deploy, which helps us minimize downtime. But what do we do if we need to reload a service's configuration without restarting it?
The answer is simple: lots of services, like HAProxy, Prometheus, PgBouncer or NGINX, support configuration reloads via a SIGHUP signal. You can send a UNIX signal to a container with Docker:
$ docker kill --signal SIGHUP container_name
All we need to do to apply new configuration is send SIGHUP to the relevant containers after a deploy. If the new configuration has errors, these services will silently keep running with the old one.
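For services managed by Compose, this can be a one-liner in the deploy script (the service name here is an example):

# Send SIGHUP to every container of the haproxy service to reload its configuration
docker kill --signal SIGHUP $(docker-compose -f docker-compose.prod.yml ps -q haproxy)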
There's a caveat here with bind mounts: you have to make sure you're bind mounting the directory containing the configuration file rather than the file itself. This is because of how Docker bind mounts interact with Linux inodes: if a tool rewrites the file by creating a new one and renaming it over the old one, the mount keeps pointing at the old inode and the container never sees the update. There's more discussion of this on Docker's GitHub.
As the final step, the deploy runs docker-compose ps to make sure that all containers with healthchecks pass them. If there are unhealthy containers, the deploy script raises an alert in our team chat.
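A minimal version of that check can be a grep over the Compose status output (docker-compose ps shows the health state in its State column; the chat alert itself is omitted here):

# Fail the deploy if any container in the group reports as unhealthy
if ./service-runscript.sh ps | grep -q "unhealthy"; then
    echo "Unhealthy containers detected after deploy!"
    exit 1
fi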
We also use Prometheus and Grafana to scrape various container statistics, plot them and alert on them. In particular, we have alerting rules on:
We check our Prometheus configuration and Grafana dashboards into version control and load the dashboards from configuration files using Grafana's provisioning feature.
In this article, we discussed how we use Docker Compose to run Splitgraph in production, as well as how to use it for automated deploys.
This is the final article in the series that talks about our build, test and deploy infrastructure. If you're interested in learning more about Splitgraph, you can check our frequently asked questions section, follow our quick start guide or visit our website.